Re: Threading and Query Performance

2007-03-21 Thread Marcel Reutegger
Hi David, David Johnson wrote: "Anything look strange in the stack traces from my last post on this thread? Or does this all look 'normal'?" Apologies for the late answer. It seems this is a known issue with Lucene: http://www.mail-archive.com/java-dev@lucene.apache.org/msg08398.html regards m

Re: Threading and Query Performance

2007-03-19 Thread David Johnson
Anything look strange in the stack traces from my last post on this thread? Or does this all look 'normal'? -Dave On 3/14/07, David Johnson <[EMAIL PROTECTED]> wrote: I have 5-7 stack dumps from 2 different runs that I captured using jstack. Here is an interesting (in that there are 2 blocked

Re: Threading and Query Performance

2007-03-14 Thread David Johnson
I have 5-7 stack dumps from 2 different runs that I captured using jstack. Here is an interesting (in that there are 2 blocked threads) example - I have many more, and can create many more as needed - I was running a 4 thread test corresponding to the # of cores on my system. I ran another test
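For readers who want to spot blocked threads without reading through full jstack output, the same information can be pulled from inside the JVM. The following is only a minimal sketch using the standard `java.lang.management` API; the class name `BlockedThreadReport` and its method are made up for illustration and are not part of the test code discussed in this thread.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not taken from the original posts.
final class BlockedThreadReport {

    /** Returns one description per thread currently waiting to enter a monitor. */
    static List<String> findBlocked() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> report = new ArrayList<>();
        // false, false: skip the more expensive details about monitors and
        // synchronizers each thread holds; the lock a thread waits on is still reported.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info != null && info.getThreadState() == Thread.State.BLOCKED) {
                report.add(info.getThreadName()
                        + " blocked on " + info.getLockName()
                        + " held by " + info.getLockOwnerName());
            }
        }
        return report;
    }

    public static void main(String[] args) {
        // Prints nothing when no thread is blocked at the moment of the snapshot.
        for (String line : findBlocked()) {
            System.out.println(line);
        }
    }
}
```

Calling `findBlocked()` periodically from a monitoring thread while the queries run would show which lock the workers queue up on, assuming the contention is on an ordinary Java monitor.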

Re: Threading and Query Performance

2007-03-14 Thread Jukka Zitting
Hi, On 3/14/07, Marcel Reutegger <[EMAIL PROTECTED]> wrote: it's not fully synchronized, queries do run in parallel, which can be seen in David's results. but at a lower level, probably in Lucene, some part of the query execution is synchronized. Other potential issues are, as already suggested

Re: Threading and Query Performance

2007-03-14 Thread Marcel Reutegger
Hi David, I was mainly interested in the size of the repository and the structure, just like you described it. Another aspect that affects query performance is the complexity of the query. What's the structure of your queries? Are they simple? e.g.: //element(*, my:content)[EMAIL PROTECTED]

Re: Threading and Query Performance

2007-03-14 Thread Marcel Reutegger
Tobias Bocanegra wrote: well, from a first glance: the more threads you add, the faster the queries are until you reach the number of processors. if you expect linear improvement - that does not work, since the actual 'searching' is synchronized (marcel, correct me if i'm saying something wrong h

Re: Threading and Query Performance

2007-03-13 Thread David Johnson
So if I increase the # of threads to a worst case scenario - 100 threads running 1 query each - I would think I would see a worst case in synchronization and scheduling overhead - i.e., 100 threads would run significantly slower than the 25 thread run. The 100 thread run took 37070 ms to run. No

Re: Threading and Query Performance

2007-03-13 Thread Tobias Bocanegra
well, from a first glance: the more threads you add, the faster the queries are until you reach the number of processors. if you expect linear improvement - that does not work, since the actual 'searching' is synchronized (marcel, correct me if i'm saying something wrong here). when you add more a
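The reasoning above (faster until the thread count reaches the number of processors, but not linearly because part of the search is synchronized) can be illustrated with Amdahl's law. This is a textbook model brought in here for illustration, not something measured in this thread: the serial fraction passed in is a hypothetical input, and the model ignores scheduling overhead entirely.

```java
// Hypothetical illustration; the class and parameter names are assumptions.
final class SpeedupEstimate {

    /**
     * Amdahl's law: speedup = 1 / (s + (1 - s) / n), where s is the fraction of
     * the work that is serialized (e.g. inside a synchronized block) and n is the
     * number of threads that can actually run at once, capped by the core count.
     */
    static double estimate(double serialFraction, int threads, int cores) {
        int effective = Math.max(1, Math.min(threads, cores));
        return 1.0 / (serialFraction + (1.0 - serialFraction) / effective);
    }

    public static void main(String[] args) {
        double serialFraction = 0.3; // assumed value, not measured
        int cores = 4;               // matches the 4-core machine mentioned in the thread
        for (int threads : new int[] {1, 2, 4, 25, 100}) {
            System.out.printf("%d threads: estimated speedup %.2f%n",
                    threads, estimate(serialFraction, threads, cores));
        }
    }
}
```

Under this model, adding threads beyond the number of cores changes nothing, and the speedup with 4 threads stays below 4 whenever the serial fraction is greater than zero.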

Re: Threading and Query Performance

2007-03-13 Thread David Johnson
What kind of information concerning the repository do you need? There are approximately 120K nodes in the repository. The node hierarchy somewhat follows a /a-rootnode/large-subsection/subtypes/year/month/day/content - There are 13 large-subsections - around 5 subtypes per large-subsection - yea

Re: Threading and Query Performance

2007-03-13 Thread David Johnson
Please find the test code below - I believe it doesn't do anything too stupid :-) If anything jumps out, please let me know. queries.txt is a text file with one query per line - it is a mix of SQL and XPath queries. The query tester can take up to 2 parameters. The first is the # of threads to
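The original test code is cut off in this archive view. As a rough idea of what such a tester might look like, here is a minimal sketch, not David's actual code. It reads one query per line from queries.txt, spreads the queries over a fixed number of threads, and reports the elapsed time. The class name `QueryLoadTester`, the `Consumer<String>` stand-in for query execution, and the fallback queries are all assumptions; a real run would plug in a JCR call such as `QueryManager.createQuery(statement, language).execute()` on a session.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

// Hypothetical sketch; names are illustrative and not from the original post.
final class QueryLoadTester {

    /** Reads one query per line, skipping blank lines. */
    static List<String> loadQueries(Path file) throws IOException {
        List<String> queries = new ArrayList<>();
        for (String line : Files.readAllLines(file)) {
            if (!line.trim().isEmpty()) {
                queries.add(line.trim());
            }
        }
        return queries;
    }

    /** Runs every query once on a pool of threadCount threads; returns elapsed ms. */
    static long runAll(List<String> queries, int threadCount, Consumer<String> runQuery) {
        ExecutorService pool = Executors.newFixedThreadPool(threadCount);
        try {
            long start = System.nanoTime();
            List<Future<?>> pending = new ArrayList<>();
            for (String query : queries) {
                pending.add(pool.submit(() -> runQuery.accept(query)));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for completion and surface any query failure
            }
            return (System.nanoTime() - start) / 1_000_000;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws IOException {
        int threads = args.length > 0 ? Integer.parseInt(args[0]) : 1;
        Path file = Paths.get("queries.txt");
        List<String> queries = Files.exists(file)
                ? loadQueries(file)
                : Arrays.asList("placeholder query 1", "placeholder query 2");
        // Stand-in executor: replace the body with a real JCR query execution.
        long ms = runAll(queries, threads, q -> { /* no-op placeholder */ });
        System.out.println(queries.size() + " queries on " + threads
                + " threads took " + ms + " ms");
    }
}
```

Keeping query execution behind a `Consumer<String>` means the threading and timing logic can be checked without a repository, and the timing covers only the submit-and-wait phase rather than pool creation.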

Re: Threading and Query Performance

2007-03-13 Thread Marcel Reutegger
David Johnson wrote: This is related to two ongoing list threads - one on synchronization and the other on query performance. As I have mentioned in previous posts, I have been running a variety of query tests. I am using a suite of 100 queries and running them against Jackrabbit in several di

Threading and Query Performance

2007-03-12 Thread David Johnson
This is related to two ongoing list threads - one on synchronization and the other on query performance. As I have mentioned in previous posts, I have been running a variety of query tests. I am using a suite of 100 queries and running them against Jackrabbit in several different threading scena