So if I increase the number of threads to a worst-case scenario - 100 threads
running 1 query each - I would expect to see the worst case in
synchronization and scheduling overhead, i.e., the 100-thread run would be
significantly slower than the 25-thread run.

The 100-thread run took 37070 ms, which is not that far off from the
25-thread run.

Only a 10% speed increase between the 1-thread and 2-thread runs seems off
(running on a 4-core box) - unless, as you mention, the searching is hitting
a synchronization issue.  I am no threading expert by any means; it just
seems off.

Again, there are no writes happening to this repository during these tests -
it is a read only repository.
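
A minimal sketch of a harness of the kind described in the original message
quoted below: the same list of queries is split evenly across N threads, and
both per-thread and total wall-clock times are recorded. This is an
illustration only, not the actual test code. QueryBenchmark, partition and
run are hypothetical names, and the runQuery callback is a stand-in for a
real JCR call (for example creating a query through the session's
QueryManager and calling execute() on it).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical harness: names and structure are assumptions for illustration.
class QueryBenchmark {

    /** Splits the queries into contiguous, near-equal slices, one per thread. */
    static List<List<String>> partition(List<String> queries, int threads) {
        List<List<String>> slices = new ArrayList<>();
        int n = queries.size();
        for (int i = 0; i < threads; i++) {
            slices.add(queries.subList(n * i / threads, n * (i + 1) / threads));
        }
        return slices;
    }

    /** Runs every query once, spread over the given threads; returns total wall-clock ms. */
    static long run(List<String> queries, int threads, Consumer<String> runQuery) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Callable<Long>> tasks = new ArrayList<>();
        for (List<String> slice : partition(queries, threads)) {
            tasks.add(() -> {
                long t0 = System.nanoTime();
                for (String q : slice) {
                    runQuery.accept(q); // stand-in for executing one query
                }
                return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - t0);
            });
        }
        try {
            long t0 = System.nanoTime();
            List<Future<Long>> results = pool.invokeAll(tasks); // blocks until all slices finish
            long totalMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - t0);
            for (int i = 0; i < results.size(); i++) {
                System.out.println("slice " + i + ": " + results.get(i).get() + " ms");
            }
            return totalMs;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("benchmark run failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> queries = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            queries.add("query-" + i); // placeholder statements, not real queries
        }
        for (int threads : new int[] {1, 2, 4, 10, 25, 100}) {
            long ms = run(queries, threads, q -> { /* no-op placeholder */ });
            System.out.println(threads + " threads: total " + ms + " ms");
        }
    }
}
```

With a no-op callback the timings are meaningless; the point is only the
shape of the measurement, i.e. that the total is taken from the start of the
run until the slowest slice finishes.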

-Dave

On 3/13/07, Tobias Bocanegra <[EMAIL PROTECTED]> wrote:

well, at first glance: the more threads you add, the faster the
queries are until you reach the number of processors. if you expect
linear improvement - that does not work, since the actual 'searching'
is synchronized (marcel, correct me if i'm saying something wrong
here).
when you add more and more threads, the synchronization and scheduling
overhead gets bigger and you lose overall speed again.
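
One rough way to reason about this is Amdahl's law: if a fraction p of the
work can run in parallel and the rest is serialized, n threads give a speedup
of 1 / ((1 - p) + p / n). The sketch below assumes that this simple model
applies here, which is a simplification (it ignores scheduling overhead and
lock contention costs). AmdahlEstimate and its method names are hypothetical;
the two timings are the 1-thread and 4-thread totals from the results quoted
further down.

```java
// Hypothetical helper: a back-of-the-envelope model, not a measurement tool.
class AmdahlEstimate {

    /** Amdahl's law: speedup on n threads when a fraction p of the work is parallel. */
    static double speedup(double p, int n) {
        return 1.0 / ((1.0 - p) + p / n);
    }

    /** Inverse of the above: parallel fraction implied by a measured speedup s on n > 1 threads. */
    static double parallelFraction(double s, int n) {
        return (1.0 - 1.0 / s) / (1.0 - 1.0 / n);
    }

    public static void main(String[] args) {
        double oneThreadMs = 41139;  // 1 thread, from the quoted results
        double fourThreadMs = 32801; // 4 threads total time, from the quoted results
        double measured = oneThreadMs / fourThreadMs;
        double p = parallelFraction(measured, 4);
        System.out.printf("measured speedup on 4 threads: %.2f%n", measured);
        System.out.printf("implied parallel fraction:     %.2f%n", p);
        System.out.printf("model's upper bound (n -> inf): %.2f%n", 1.0 / (1.0 - p));
    }
}
```

If the implied parallel fraction comes out small, adding threads beyond the
number of processors cannot help much under this model, whatever the cause
of the serialization turns out to be.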

what you must ask yourself: is '~350ms' per query fast enough for my
application? what do my queries look like? can i optimize the query
or the data structure?

regards, toby

On 3/12/07, David Johnson <[EMAIL PROTECTED]> wrote:
> This is related to two ongoing list threads - one on synchronization and
> the other on query performance.
>
> As I have mentioned in previous posts, I have been running a variety of
> query tests.  I am using a suite of 100 queries and running them against
> Jackrabbit in several different threading scenarios - i.e., I change the
> # of threads used to run sub-sets of the 100 queries.  To be clear - if I
> run a single thread case, it will run all 100 queries, one after the
> other.  If I run 2 threads - one thread will run 50 queries, while the
> other thread will run the other 50 queries.  In all cases, the 100 queries
> are the same, the only thing that changes is the number of threads used to
> run them.  Also, in all tests, the repository is read only - nothing is
> making any writes to the repository.
>
> Here are some results:
>
> 1 thread: 100 queries in 41139 ms
> 2 threads: 50 queries in 37828 ms, 50 queries in 38622 ms - total time
>   for all threads to complete 38960 ms
> 4 threads: 25 queries in 25895 ms, 25 queries in 28034 ms, 25 queries in
>   32335 ms, 25 queries in 32391 ms - total time 32801 ms
> 10 threads: 10 queries in 18733 ms, 10 queries in 19894 ms, ... , 10
>   queries in 33798 ms, 10 queries in 34924 ms - total time 35286 ms
> 25 threads: 4 queries in 2413 ms, 4 queries in 11725 ms, 4 queries in
>   18294 ms, ... , 4 queries in 36059 ms, 4 queries in 36222 ms
>
> Some details on the box that I am running these tests on: it is a dual
> Xeon running Linux - /proc/cpuinfo shows 4 processors, so I am assuming it
> is a dual core.  I am running Jackrabbit 1.2.3 with the Bundle Persistence
> Manager.
>
> I am not sure what the numbers above are really saying, although they
> don't really look right :-)  We have a multi-user use case - large web
> site with many ongoing reads, occasional writes.  I am using the multiple
> threads to "test" multiple users.  I am hoping that the developers with
> more understanding of the internals can explain what's going on above.
>
> I am wondering if I am hitting the synchronization issue that is being
> discussed in other posts?  Thoughts?
>
> -Dave
>


--
-----------------------------------------< [EMAIL PROTECTED] >---
Tobias Bocanegra, Day Management AG, Barfuesserplatz 6, CH - 4001 Basel
T +41 61 226 98 98, F +41 61 226 98 97
-----------------------------------------------< http://www.day.com >---
