Re: Does more shards in core improve performance?

2015-09-23 Thread Zheng Lin Edwin Yeo
I've left it at the default settings for now, which should balance the
documents 50-50 across both shards.

Regards,
Edwin

On 23 September 2015 at 22:49, Alessandro Benedetti <
benedetti.ale...@gmail.com> wrote:

> Using a second machines , you will dispose of fresh memory, disk and CPUs.
> So assuming you succeeded in saturating the first machine indexing power,
> of course it is normal that you improve your indexing time giving an
> additional node serving the indexing process.
> Are you balancing 50:50 or with some compositeId strategy(which anyway will
> try to balance if possible) ?
>
> Cheers
>
> 2015-09-23 15:33 GMT+01:00 Zheng Lin Edwin Yeo :
>
> > I've tried to run different shards from different machine, and there is a
> > slight improvement in the performance (about 3 mins faster for 1GB worth
> of
> > data, from 22 mins to 19 mins).
> >
> > Is this a normal scenario? Both of my machine are running on Intel i7
> core.
> >
> >
> > Regards,
> > Edwin
> >
> >
> > On 21 September 2015 at 16:24, Zheng Lin Edwin Yeo  >
> > wrote:
> >
> > > I'm not sure if that is because currently my machine is a normal PC and
> > > not a server, but my CPU specification for each of the core is Intel(R)
> > > Core(TM) i7-4910MQ CPU @ 2.90GHz.
> > >
> > > It should probably be better when the real server which has a much
> better
> > > specification comes, and I should be able to do the indexing in a
> lesser
> > > time using the knowledge that I've learnt here.
> > >
> > >
> > > Regards,
> > > Edwin
> > >
> > >
> > >
> > > On 21 September 2015 at 16:00, Toke Eskildsen 
> > > wrote:
> > >
> > >> On Mon, 2015-09-21 at 10:13 +0800, Zheng Lin Edwin Yeo wrote:
> > >> > I didn't find any increase in indexing throughput by adding shards
> in
> > >> the
> > >> > same machine.
> > >> >
> > >> > However, I've managed to feed the index to Solr from more than one
> > >> thread
> > >> > at a time. It can take up to 3 threads without affecting the
> indexing
> > >> > speed. Anything more than that, the CPU will hit 100%, and the
> > indexing
> > >> > speed in all the threads will be reduced.
> > >>
> > >> It is a bit surprising that the limit is 3 Threads on an 8 core
> machine,
> > >> but I am happy to hear that your findings fit the overall theory.
> > >>
> > >>
> > >> Thank you for the verification,
> > >> Toke Eskildsen, State and University Library, Denmark
> > >>
> > >>
> > >>
> > >
> >
>


Re: Does more shards in core improve performance?

2015-09-23 Thread Alessandro Benedetti
With a second machine, you have fresh memory, disk and CPUs at your disposal.
So, assuming you had saturated the indexing capacity of the first machine,
it is of course normal that your indexing time improves when an
additional node serves the indexing process.
Are you balancing 50:50, or using some compositeId strategy (which will
try to balance anyway, if possible)?
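
How evenly a hash-based router spreads documents over two shards can be
checked with a small simulation. The sketch below is an illustration only:
it uses MD5 from Python's standard library as a stand-in for the MurmurHash3
function that Solr's compositeId router uses, and the document ids and the
shard_for helper are made up for the example.

```python
import hashlib
from collections import Counter


def shard_for(doc_id: str, num_shards: int) -> int:
    """Map a document id to a shard by cutting the 32-bit hash space into equal ranges.

    MD5 is only a stand-in here; Solr itself hashes ids with MurmurHash3.
    """
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    h = int.from_bytes(digest[:4], "big")  # first 4 bytes as an unsigned 32-bit value
    return h * num_shards // 2**32


# Hypothetical ids: doc-0 ... doc-9999, routed to 2 shards.
counts = Counter(shard_for(f"doc-{i}", 2) for i in range(10_000))
share = counts[0] / 10_000
print(counts, f"shard 0 holds {share:.1%} of the documents")
```

With ids that do not share a routing prefix, the split should come out close
to 50:50, which is what the default settings are expected to give.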

Cheers

2015-09-23 15:33 GMT+01:00 Zheng Lin Edwin Yeo :

> I've tried to run different shards from different machine, and there is a
> slight improvement in the performance (about 3 mins faster for 1GB worth of
> data, from 22 mins to 19 mins).
>
> Is this a normal scenario? Both of my machine are running on Intel i7 core.
>
>
> Regards,
> Edwin
>
>
> On 21 September 2015 at 16:24, Zheng Lin Edwin Yeo 
> wrote:
>
> > I'm not sure if that is because currently my machine is a normal PC and
> > not a server, but my CPU specification for each of the core is Intel(R)
> > Core(TM) i7-4910MQ CPU @ 2.90GHz.
> >
> > It should probably be better when the real server which has a much better
> > specification comes, and I should be able to do the indexing in a lesser
> > time using the knowledge that I've learnt here.
> >
> >
> > Regards,
> > Edwin
> >
> >
> >
> > On 21 September 2015 at 16:00, Toke Eskildsen 
> > wrote:
> >
> >> On Mon, 2015-09-21 at 10:13 +0800, Zheng Lin Edwin Yeo wrote:
> >> > I didn't find any increase in indexing throughput by adding shards in
> >> the
> >> > same machine.
> >> >
> >> > However, I've managed to feed the index to Solr from more than one
> >> thread
> >> > at a time. It can take up to 3 threads without affecting the indexing
> >> > speed. Anything more than that, the CPU will hit 100%, and the
> indexing
> >> > speed in all the threads will be reduced.
> >>
> >> It is a bit surprising that the limit is 3 Threads on an 8 core machine,
> >> but I am happy to hear that your findings fit the overall theory.
> >>
> >>
> >> Thank you for the verification,
> >> Toke Eskildsen, State and University Library, Denmark
> >>
> >>
> >>
> >
>



-- 
--

Benedetti Alessandro
Visiting card - http://about.me/alessandro_benedetti
Blog - http://alexbenedetti.blogspot.co.uk

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England


Re: Does more shards in core improve performance?

2015-09-23 Thread Zheng Lin Edwin Yeo
I've tried running the shards on different machines, and there is a
slight improvement in performance (about 3 minutes faster for 1GB worth of
data, from 22 minutes to 19 minutes).

Is this normal? Both of my machines are running Intel i7 CPUs.


Regards,
Edwin
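
To put the 22-minute and 19-minute figures in perspective, the arithmetic can
be worked through as below. This is a minimal sketch; compare_runs is a
hypothetical helper name, and only the two timings and the 1GB data size come
from the message above.

```python
def compare_runs(minutes_before: float, minutes_after: float, data_gb: float = 1.0) -> dict:
    """Summarise the difference between two indexing runs over the same data."""
    return {
        # How many times faster the second run is.
        "speedup": minutes_before / minutes_after,
        # Share of the original run time that was saved.
        "time_saved_pct": (minutes_before - minutes_after) / minutes_before * 100,
        # Throughput of each run.
        "gb_per_hour_before": data_gb / minutes_before * 60,
        "gb_per_hour_after": data_gb / minutes_after * 60,
    }


result = compare_runs(22, 19)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

The second run takes roughly 14% less time, a speedup of about 1.16x, which is
well short of the 2x that a perfectly scaling second machine would give.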


On 21 September 2015 at 16:24, Zheng Lin Edwin Yeo 
wrote:

> I'm not sure if that is because currently my machine is a normal PC and
> not a server, but my CPU specification for each of the core is Intel(R)
> Core(TM) i7-4910MQ CPU @ 2.90GHz.
>
> It should probably be better when the real server which has a much better
> specification comes, and I should be able to do the indexing in a lesser
> time using the knowledge that I've learnt here.
>
>
> Regards,
> Edwin
>
>
>
> On 21 September 2015 at 16:00, Toke Eskildsen 
> wrote:
>
>> On Mon, 2015-09-21 at 10:13 +0800, Zheng Lin Edwin Yeo wrote:
>> > I didn't find any increase in indexing throughput by adding shards in
>> the
>> > same machine.
>> >
>> > However, I've managed to feed the index to Solr from more than one
>> thread
>> > at a time. It can take up to 3 threads without affecting the indexing
>> > speed. Anything more than that, the CPU will hit 100%, and the indexing
>> > speed in all the threads will be reduced.
>>
>> It is a bit surprising that the limit is 3 Threads on an 8 core machine,
>> but I am happy to hear that your findings fit the overall theory.
>>
>>
>> Thank you for the verification,
>> Toke Eskildsen, State and University Library, Denmark
>>
>>
>>
>


Re: Does more shards in core improve performance?

2015-09-21 Thread Zheng Lin Edwin Yeo
I'm not sure if that is because my machine is currently a normal PC and not
a server, but the CPU specification for each of the cores is Intel(R)
Core(TM) i7-4910MQ CPU @ 2.90GHz.

It should probably get better once the real server, which has a much better
specification, arrives, and I should be able to do the indexing in less
time using the knowledge that I've gained here.


Regards,
Edwin



On 21 September 2015 at 16:00, Toke Eskildsen 
wrote:

> On Mon, 2015-09-21 at 10:13 +0800, Zheng Lin Edwin Yeo wrote:
> > I didn't find any increase in indexing throughput by adding shards in the
> > same machine.
> >
> > However, I've managed to feed the index to Solr from more than one thread
> > at a time. It can take up to 3 threads without affecting the indexing
> > speed. Anything more than that, the CPU will hit 100%, and the indexing
> > speed in all the threads will be reduced.
>
> It is a bit surprising that the limit is 3 Threads on an 8 core machine,
> but I am happy to hear that your findings fit the overall theory.
>
>
> Thank you for the verification,
> Toke Eskildsen, State and University Library, Denmark
>
>
>


Re: Does more shards in core improve performance?

2015-09-21 Thread Toke Eskildsen
On Mon, 2015-09-21 at 10:13 +0800, Zheng Lin Edwin Yeo wrote:
> I didn't find any increase in indexing throughput by adding shards in the
> same machine.
> 
> However, I've managed to feed the index to Solr from more than one thread
> at a time. It can take up to 3 threads without affecting the indexing
> speed. Anything more than that, the CPU will hit 100%, and the indexing
> speed in all the threads will be reduced.

It is a bit surprising that the limit is 3 Threads on an 8 core machine,
but I am happy to hear that your findings fit the overall theory.


Thank you for the verification,
Toke Eskildsen, State and University Library, Denmark




Re: Does more shards in core improve performance?

2015-09-20 Thread Zheng Lin Edwin Yeo
I didn't find any increase in indexing throughput from adding shards on the
same machine.

However, I've managed to feed Solr from more than one thread
at a time. It can take up to 3 threads without affecting the indexing
speed. With any more than that, the CPU hits 100% and the indexing
speed of all the threads drops.

Regards,
Edwin


On 18 September 2015 at 19:38, Gili Nachum  wrote:

> If cpu is just 50% and adding a shard does increase indexing throughput
> then check for disk bottleneck.
> On Sep 17, 2015 18:19, "Zheng Lin Edwin Yeo"  wrote:
>
> > Thank you everyone for your reply.
> >
> > > How many CPUs on that machine? How many other requests using the
> server?
> >
> > A) There's 8 CPU on the machine, and there is no other requests that's
> > using the server. Only the indexing script is running.
> >
> > > A simple metric is to look at CPU usage on the machine: If it is near
> > 100% when you index, you will need extra hardware to get more speed.
> > If it is substantially less than 100%, then feed Solr from more than one
> > thread at a time.
> >
> > A) So far from what I observe, the CPU usage is usually around 50% to
> 70%.
> > It haven't go up to 100% yet. But I'll probably try to do sharing on a
> > different machine, as that is probably the case for the real production
> > server.
> >
> >
> > Regards,
> > Edwin
> >
> >
> > On 17 September 2015 at 19:55, Toke Eskildsen 
> > wrote:
> >
> > > On Thu, 2015-09-17 at 16:58 +0800, Zheng Lin Edwin Yeo wrote:
> > >
> > > > I was trying with 2 shards and 4 shards but all on the same machine,
> > > > and they have the same performance (no improvement in performance) as
> > > > the one with 1 shard. My machine has a 32GB RAM.
> > >
> > > As you are testing indexing speed, Shalin's post is spot-on: Sharding
> on
> > > the same machine won't help you. I just added my comment on search to
> > > help build a complete picture.
> > >
> > > A simple metric is to look at CPU usage on the machine: If it is near
> > > 100% when you index, you will need extra hardware to get more speed.
> > > If it is substantially less than 100%, then feed Solr from more than
> one
> > > thread at a time.
> > >
> > > - Toke Eskildsen, State and University Library, Denmark
> > >
> > >
> > >
> > >
> >
>


Re: Does more shards in core improve performance?

2015-09-18 Thread Gili Nachum
If cpu is just 50% and adding a shard does increase indexing throughput
then check for disk bottleneck.
On Sep 17, 2015 18:19, "Zheng Lin Edwin Yeo"  wrote:

> Thank you everyone for your reply.
>
> > How many CPUs on that machine? How many other requests using the server?
>
> A) There's 8 CPU on the machine, and there is no other requests that's
> using the server. Only the indexing script is running.
>
> > A simple metric is to look at CPU usage on the machine: If it is near
> 100% when you index, you will need extra hardware to get more speed.
> If it is substantially less than 100%, then feed Solr from more than one
> thread at a time.
>
> A) So far from what I observe, the CPU usage is usually around 50% to 70%.
> It haven't go up to 100% yet. But I'll probably try to do sharing on a
> different machine, as that is probably the case for the real production
> server.
>
>
> Regards,
> Edwin
>
>
> On 17 September 2015 at 19:55, Toke Eskildsen 
> wrote:
>
> > On Thu, 2015-09-17 at 16:58 +0800, Zheng Lin Edwin Yeo wrote:
> >
> > > I was trying with 2 shards and 4 shards but all on the same machine,
> > > and they have the same performance (no improvement in performance) as
> > > the one with 1 shard. My machine has a 32GB RAM.
> >
> > As you are testing indexing speed, Shalin's post is spot-on: Sharding on
> > the same machine won't help you. I just added my comment on search to
> > help build a complete picture.
> >
> > A simple metric is to look at CPU usage on the machine: If it is near
> > 100% when you index, you will need extra hardware to get more speed.
> > If it is substantially less than 100%, then feed Solr from more than one
> > thread at a time.
> >
> > - Toke Eskildsen, State and University Library, Denmark
> >
> >
> >
> >
>


Re: Does more shards in core improve performance?

2015-09-17 Thread Zheng Lin Edwin Yeo
Thank you everyone for your reply.

> How many CPUs on that machine? How many other requests using the server?

A) There are 8 CPUs on the machine, and no other requests are
using the server. Only the indexing script is running.

> A simple metric is to look at CPU usage on the machine: If it is near
> 100% when you index, you will need extra hardware to get more speed.
> If it is substantially less than 100%, then feed Solr from more than one
> thread at a time.

A) From what I've observed so far, the CPU usage is usually around 50% to 70%.
It hasn't gone up to 100% yet. But I'll probably try sharding onto a
different machine, as that will probably be the case for the real production
server.


Regards,
Edwin


On 17 September 2015 at 19:55, Toke Eskildsen 
wrote:

> On Thu, 2015-09-17 at 16:58 +0800, Zheng Lin Edwin Yeo wrote:
>
> > I was trying with 2 shards and 4 shards but all on the same machine,
> > and they have the same performance (no improvement in performance) as
> > the one with 1 shard. My machine has a 32GB RAM.
>
> As you are testing indexing speed, Shalin's post is spot-on: Sharding on
> the same machine won't help you. I just added my comment on search to
> help build a complete picture.
>
> A simple metric is to look at CPU usage on the machine: If it is near
> 100% when you index, you will need extra hardware to get more speed.
> If it is substantially less than 100%, then feed Solr from more than one
> thread at a time.
>
> - Toke Eskildsen, State and University Library, Denmark
>
>
>
>


Re: Does more shards in core improve performance?

2015-09-17 Thread Toke Eskildsen
On Thu, 2015-09-17 at 16:58 +0800, Zheng Lin Edwin Yeo wrote:

> I was trying with 2 shards and 4 shards but all on the same machine,
> and they have the same performance (no improvement in performance) as
> the one with 1 shard. My machine has a 32GB RAM.

As you are testing indexing speed, Shalin's post is spot-on: Sharding on
the same machine won't help you. I just added my comment on search to
help build a complete picture.

A simple metric is to look at CPU usage on the machine: If it is near
100% when you index, you will need extra hardware to get more speed.
If it is substantially less than 100%, then feed Solr from more than one
thread at a time.

- Toke Eskildsen, State and University Library, Denmark
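
Feeding Solr from more than one thread can be sketched as follows. This is a
minimal illustration under stated assumptions, not code from the thread:
index_in_parallel and batches are hypothetical helpers, and fake_send is a
stand-in for whatever actually posts a batch of documents to Solr's update
handler. The default of 3 workers mirrors the thread count reported elsewhere
in this thread; the batch size of 500 is arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, Iterator, List


def batches(docs: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Group documents into lists of at most `size` items."""
    batch: List[dict] = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final, possibly smaller, batch
        yield batch


def index_in_parallel(
    docs: Iterable[dict],
    send_batch: Callable[[List[dict]], int],
    workers: int = 3,
    batch_size: int = 500,
) -> int:
    """Send batches from `workers` threads; return the total number of documents sent."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(send_batch, batches(docs, batch_size)))


def fake_send(batch: List[dict]) -> int:
    """Stand-in for an HTTP POST of one batch to Solr; just counts the documents."""
    return len(batch)


total = index_in_parallel(({"id": str(i)} for i in range(1200)), fake_send)
print(total)
```

Raising workers one step at a time while watching CPU usage shows where the
machine saturates; beyond that point extra threads only compete with each
other.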





Re: Does more shards in core improve performance?

2015-09-17 Thread Upayavira
How many CPUs on that machine? How many other requests using the server?

On Thu, Sep 17, 2015, at 09:58 AM, Zheng Lin Edwin Yeo wrote:
> Thanks for the information.
> 
> I was trying with 2 shards and 4 shards but all on the same machine, and
> they have the same performance (no improvement in performance) as the one
> with 1 shard. My machine has a 32GB RAM.
> 
> Probably I should try one of the shard in different machine and see how
> it
> goes?
> 
> Regards,
> Edwin
> 
> 
> On 17 September 2015 at 15:37, Toke Eskildsen 
> wrote:
> 
> > On Thu, 2015-09-17 at 12:04 +0530, Shalin Shekhar Mangar wrote:
> > > Yes, of course, the only reason to have more shards is so that they
> > > can reside on different machines (or use different disks, assuming you
> > > have enough CPU/memory etc) so that you can scale your indexing
> > > throughput.
> >
> > For indexing, true. Due to Solr's 1-request-1-thread nature, sharding on
> > the same hardware can be used to lower latency for CPU-heavy searches.
> >
> > We are running 25 shards/machine, where the machines has 16HT CPU-cores.
> > Granted we also do it due to the pesky 2 billion limit, but the result
> > is that the CPU-cores are nicely utilized with our low queries/second
> > usage pattern.
> >
> > - Toke Eskildsen, State and University Library, Denmark
> >
> >
> >


Re: Does more shards in core improve performance?

2015-09-17 Thread Zheng Lin Edwin Yeo
Thanks for the information.

I tried 2 shards and 4 shards, all on the same machine, and
they perform the same (no improvement in performance) as the setup
with 1 shard. My machine has 32GB of RAM.

Perhaps I should try putting one of the shards on a different machine and see
how it goes?

Regards,
Edwin


On 17 September 2015 at 15:37, Toke Eskildsen 
wrote:

> On Thu, 2015-09-17 at 12:04 +0530, Shalin Shekhar Mangar wrote:
> > Yes, of course, the only reason to have more shards is so that they
> > can reside on different machines (or use different disks, assuming you
> > have enough CPU/memory etc) so that you can scale your indexing
> > throughput.
>
> For indexing, true. Due to Solr's 1-request-1-thread nature, sharding on
> the same hardware can be used to lower latency for CPU-heavy searches.
>
> We are running 25 shards/machine, where the machines has 16HT CPU-cores.
> Granted we also do it due to the pesky 2 billion limit, but the result
> is that the CPU-cores are nicely utilized with our low queries/second
> usage pattern.
>
> - Toke Eskildsen, State and University Library, Denmark
>
>
>


Re: Does more shards in core improve performance?

2015-09-17 Thread Toke Eskildsen
On Thu, 2015-09-17 at 12:04 +0530, Shalin Shekhar Mangar wrote:
> Yes, of course, the only reason to have more shards is so that they
> can reside on different machines (or use different disks, assuming you
> have enough CPU/memory etc) so that you can scale your indexing
> throughput.

For indexing, true. Due to Solr's 1-request-1-thread nature, sharding on
the same hardware can be used to lower latency for CPU-heavy searches.

We are running 25 shards/machine, where each machine has 16 HT CPU cores.
Granted, we also do it because of the pesky limit of 2 billion documents
per shard, but the result is that the CPU cores are nicely utilized with
our low queries/second usage pattern.

- Toke Eskildsen, State and University Library, Denmark
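
The document limit mentioned above puts a floor on the shard count, which can
be estimated with a little arithmetic. A minimal sketch: the constant is
Lucene's per-index document cap (IndexWriter.MAX_DOCS, a little under 2^31),
while min_shards, the headroom factor and the 20 billion document corpus are
assumptions made up for the example.

```python
import math

# Lucene's hard cap on documents in a single index (one Solr shard).
LUCENE_MAX_DOCS = 2**31 - 1 - 128


def min_shards(total_docs: int, headroom: float = 0.5) -> int:
    """Smallest shard count that keeps each shard below `headroom` of the cap.

    `headroom` is an assumed safety factor, leaving room for growth and for
    deleted documents that still count until segments are merged.
    """
    per_shard = int(LUCENE_MAX_DOCS * headroom)
    return max(1, math.ceil(total_docs / per_shard))


# Hypothetical corpus of 20 billion documents.
print(min_shards(20_000_000_000))
```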




Re: Does more shards in core improve performance?

2015-09-16 Thread Shalin Shekhar Mangar
Yes, of course, the only reason to have more shards is so that they
can reside on different machines (or use different disks, assuming you
have enough CPU/memory etc) so that you can scale your indexing
throughput. Move one of them to a different machine and measure the
performance.
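
Assuming Solr is running in SolrCloud mode, one way to get one shard per
machine is to create the collection through the Collections API and name the
target nodes explicitly. The sketch below only builds the request URL; the
host names, the collection name and the create_collection_url helper are
hypothetical, while the parameters are those of the Collections API CREATE
action in Solr 5.x.

```python
from typing import List
from urllib.parse import urlencode


def create_collection_url(base_url: str, name: str, nodes: List[str], num_shards: int = 2) -> str:
    """Build a Collections API CREATE request that spreads shards over `nodes`."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": 1,
        "maxShardsPerNode": 1,             # at most one shard on each machine
        "createNodeSet": ",".join(nodes),  # node names as SolrCloud reports them
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Hypothetical two-node cluster.
url = create_collection_url(
    "http://host1:8983/solr",
    "testcollection",
    ["host1:8983_solr", "host2:8983_solr"],
)
print(url)
```

Indexing the same data into this collection and into a single-machine one
gives the like-for-like measurement suggested above.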

On Thu, Sep 17, 2015 at 11:32 AM, Zheng Lin Edwin Yeo
 wrote:
> Hi,
>
> Would like to check, does creating more shards for the core improve the
> overall performance? I'm using Solr 5.3.0.
>
> I tried the indexing for a core with 1 shard and another core with 2
> shards, but both are taking the same amount of time to do the indexing.
>
> Currently, both my shards are in the same machine. Will the performance be
> improved if the shards are located in different machine?
>
>
> Regards,
> Edwin



-- 
Regards,
Shalin Shekhar Mangar.