Re: Poll: Master-Slave or SolrCloud?

2017-04-30 Thread Ganesh M
We use ZooKeeper for Hadoop / HBase, so we use the same ensemble for Solr
too. We are using SolrCloud on EC2 instances with 6 collections containing
4 shards and 2 replicas.
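
For reference, a layout like this is normally requested through the Collections API. A minimal sketch, assuming "4 shards and 2 replicas" means numShards=4 and replicationFactor=2; the host, port and collection name are placeholders, and the function only builds the URL without sending anything:

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, num_shards, replication_factor):
    """Build a Collections API CREATE URL (no request is made)."""
    params = {
        "action": "CREATE",
        "name": name,                        # hypothetical collection name
        "numShards": num_shards,             # assumed: 4 shards
        "replicationFactor": replication_factor,  # assumed: 2 copies per shard
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Placeholder host and collection name, for illustration only.
url = create_collection_url("http://localhost:8983/solr", "products", 4, 2)
print(url)
```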

We followed one of the blogs on the internet for our setup and it works
fine. Though that setup is on Tomcat, it can also be used with the latest
Solr version on Jetty with little change.

Hope this is useful.

Regards,




On Sun, Apr 30, 2017 at 9:06 PM Shawn Heisey  wrote:

> On 4/25/2017 3:13 PM, Otis Gospodnetić wrote:
> > Could one run *only* embedded ZK on some SolrCloud nodes, sans any data?
> > It would be equivalent of dedicated Elasticsearch nodes, which is the
> > current ES best practice/recommendation.  I've never heard of anyone
> being
> > scared of running 3 dedicated master ES nodes, so if SolrCloud offered
> the
> > same, perhaps even completely hiding ZK from users, that would present
> the
> > same level of complexity (err, simplicity) ES users love about ES.  Don't
> > want to talk about SolrCloud vs. ES here at all, just trying to share
> > observations since we work a lot with both Elasticsearch and Solr(Cloud)
> at
> > Sematext.
>
> Yes, you could do that ... but I don't see any real value right now.
> You have to learn how to configure a redundant ZK ensemble and apply
> that configuration to the embedded servers manually.  Since that's not
> any different from what you'd do with an external ensemble, I think it's
> better to just use the external install.  As I understand it, elastic
> wrote their cluster code themselves ... it's part of ES, not provided by
> a separate software package, so their recommendation makes sense for ES.
>
> Using embedded ZK as you have described, there will be at least three
> extra Solr nodes that are not intended to host collections.  To keep it
> running this way, it will be important to explicitly avoid putting new
> collections on those nodes, because that won't happen by default.  With
> dedicated external ZK processes, there's no Solr node to worry about,
> and no need to create a "master node" capability.
>
> I'm not opposed to automated scripts included with Solr to configure and
> start standalone ZK processes, including a way to create an init
> script.  That would be very useful and go a long way towards extremely
> easy instructions for setting up a fault tolerant SolrCloud installation
> on multiple servers.
>
> In situations where ZK is installed on dedicated hardware, a native ZK
> will require less heap memory than one embedded in Solr, and probably
> will have slightly lower CPU requirements.
>
> SOLR-9386 does make your idea more viable because it brings the full
> capability of recent zookeeper configuration options to the embedded
> server.  It will be available in version 6.6.
>
> Thanks,
> Shawn
>
>


Re: Poll: Master-Slave or SolrCloud?

2017-04-30 Thread Shawn Heisey
On 4/25/2017 3:13 PM, Otis Gospodnetić wrote:
> Could one run *only* embedded ZK on some SolrCloud nodes, sans any data?
> It would be equivalent of dedicated Elasticsearch nodes, which is the
> current ES best practice/recommendation.  I've never heard of anyone being
> scared of running 3 dedicated master ES nodes, so if SolrCloud offered the
> same, perhaps even completely hiding ZK from users, that would present the
> same level of complexity (err, simplicity) ES users love about ES.  Don't
> want to talk about SolrCloud vs. ES here at all, just trying to share
> observations since we work a lot with both Elasticsearch and Solr(Cloud) at
> Sematext.

Yes, you could do that ... but I don't see any real value right now. 
You have to learn how to configure a redundant ZK ensemble and apply
that configuration to the embedded servers manually.  Since that's not
any different from what you'd do with an external ensemble, I think it's
better to just use the external install.  As I understand it, elastic
wrote their cluster code themselves ... it's part of ES, not provided by
a separate software package, so their recommendation makes sense for ES.
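
To give a rough idea of what "configure a redundant ZK ensemble" involves, here is a sketch that renders a zoo.cfg for a three-node ensemble and the matching zkHost string for Solr. The hostnames, data directory and /solr chroot are made up, and the timing values are only the ones commonly seen in ZooKeeper's sample config, not a recommendation. Each server also needs a myid file in dataDir holding its own number.

```python
def render_zoo_cfg(hosts, data_dir="/var/lib/zookeeper", client_port=2181):
    """Return zoo.cfg text for an ensemble made of the given hosts."""
    lines = [
        "tickTime=2000",   # sample values, tune for your environment
        "initLimit=10",
        "syncLimit=5",
        f"dataDir={data_dir}",
        f"clientPort={client_port}",
    ]
    # One server.N line per ensemble member; N must match that host's myid file.
    for server_id, host in enumerate(hosts, start=1):
        lines.append(f"server.{server_id}={host}:2888:3888")
    return "\n".join(lines) + "\n"


def zk_host_string(hosts, client_port=2181, chroot="/solr"):
    """Return the zkHost value Solr would be pointed at."""
    return ",".join(f"{h}:{client_port}" for h in hosts) + chroot


# Hypothetical hostnames.
ensemble = ["zk1.example.com", "zk2.example.com", "zk3.example.com"]
cfg = render_zoo_cfg(ensemble)
print(cfg)
print(zk_host_string(ensemble))
```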

Using embedded ZK as you have described, there will be at least three
extra Solr nodes that are not intended to host collections.  To keep it
running this way, it will be important to explicitly avoid putting new
collections on those nodes, because that won't happen by default.  With
dedicated external ZK processes, there's no Solr node to worry about,
and no need to create a "master node" capability.
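
One way to keep collections off the ZK-only nodes is the createNodeSet parameter of the Collections API CREATE call, which restricts where the new replicas are placed. A sketch with hypothetical node names (in the host:port_solr form Solr uses) and a hypothetical collection name; it only builds the URL and would have to be repeated for every new collection:

```python
from urllib.parse import urlencode


def data_nodes(all_nodes, zk_only_nodes):
    """Return the nodes that are allowed to host collections."""
    excluded = set(zk_only_nodes)
    return [node for node in all_nodes if node not in excluded]


def create_url(base_url, name, num_shards, replication_factor, nodes):
    """Build a CREATE URL limited to the given nodes via createNodeSet."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "createNodeSet": ",".join(nodes),
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Hypothetical cluster: three nodes run only embedded ZK, two hold data.
all_nodes = [f"solr{i}:8983_solr" for i in range(1, 6)]
zk_only = all_nodes[:3]
allowed = data_nodes(all_nodes, zk_only)
url = create_url("http://solr4:8983/solr", "logs", 2, 2, allowed)
print(url)
```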

I'm not opposed to automated scripts included with Solr to configure and
start standalone ZK processes, including a way to create an init
script.  That would be very useful and go a long way towards extremely
easy instructions for setting up a fault tolerant SolrCloud installation
on multiple servers.

In situations where ZK is installed on dedicated hardware, a native ZK
will require less heap memory than one embedded in Solr, and probably
will have slightly lower CPU requirements.

SOLR-9386 does make your idea more viable because it brings the full
capability of recent zookeeper configuration options to the embedded
server.  It will be available in version 6.6.

Thanks,
Shawn



Re: Poll: Master-Slave or SolrCloud?

2017-04-30 Thread Yonik Seeley
On Tue, Apr 25, 2017 at 1:33 PM, Otis Gospodnetić
 wrote:
> I think I saw mentions (maybe on user or dev MLs or JIRA) about
> potentially, in the future, there only being SolrCloud mode (and dropping
> SolrCloud name in favour of Solr).

I personally never saw this actually happening, and not because of any
complexity issues with "getting started with SolrCloud", although I
think continuing improvements there are a good thing.

Many times, I see these two things conflated:
1) how easy it is to get SolrCloud set up
2) the inherent internal complexity of a system

We can always improve #1, but that does not imply improvement in #2
(and may actually increase internal complexity).

A system where you can just fire up a node pointed at a directory and
not worry about any shared state is very easy to understand, debug,
hack around, and build very complex custom systems around.

-Yonik


RE: Poll: Master-Slave or SolrCloud?

2017-04-28 Thread Davis, Daniel (NIH/NLM) [C]
I am also very surprised.  Even though I am no longer using my 
solr-config-tool, the main thing I like about SolrCloud is how easy it is to 
bring up a new collection and set up the schema and fields that you want.   I 
also like that I don't need to manage replication in the solr configuration.

-----Original Message-----
From: Rick Leir [mailto:rl...@leirtech.com] 
Sent: Friday, April 28, 2017 12:34 PM
To: solr-user@lucene.apache.org
Subject: Re: Poll: Master-Slave or SolrCloud?

Shawn,
Would you consider writing this up in a blog?
Thanks -- Rick

On April 28, 2017 11:04:02 AM EDT, Shawn Heisey <apa...@elyograg.org> wrote:
>On 4/24/2017 8:58 AM, Otis Gospodnetić wrote:
>> I'm really really surprised here.  Back in 2013 we did a poll to see
>how
>> people were running Master-Slave (4.x back then) and SolrCloud was a
>bit
>> more popular than Master-Slave:
>> https://sematext.com/blog/2013/02/25/poll-solr-cloud-or-not/
>>
>> Here is a fresh new poll with pretty much the same question - How do
>you
>> run your Solr?
><https://twitter.com/sematext/status/854927627748036608> -
>> and guess what?  SolrCloud is *not* at all a lot more prevalent than 
>> Master-Slave.
>
>I don't use *either* for my primary Solr installs.  My indexes are 
>distributed, with sharding maintained by my indexing code.  Each copy 
>of the index is independently updated, rather than relying on Solr 
>features to replicate it.  There are things I can do with this setup 
>that would be much more difficult (and maybe impossible) with either 
>SolrCloud or master-slave replication.
>
>Thanks,
>Shawn

--
Sorry for being brief. Alternate email is rickleir at yahoo dot com 


Re: Poll: Master-Slave or SolrCloud?

2017-04-28 Thread Rick Leir
Shawn,
Would you consider writing this up in a blog?
Thanks -- Rick

On April 28, 2017 11:04:02 AM EDT, Shawn Heisey  wrote:
>On 4/24/2017 8:58 AM, Otis Gospodnetić wrote:
>> I'm really really surprised here.  Back in 2013 we did a poll to see
>how
>> people were running Master-Slave (4.x back then) and SolrCloud was a
>bit
>> more popular than Master-Slave:
>> https://sematext.com/blog/2013/02/25/poll-solr-cloud-or-not/
>>
>> Here is a fresh new poll with pretty much the same question - How do
>you
>> run your Solr?
> -
>> and guess what?  SolrCloud is *not* at all a lot more prevalent than
>> Master-Slave.
>
>I don't use *either* for my primary Solr installs.  My indexes are
>distributed, with sharding maintained by my indexing code.  Each copy
>of
>the index is independently updated, rather than relying on Solr
>features
>to replicate it.  There are things I can do with this setup that would
>be much more difficult (and maybe impossible) with either SolrCloud or
>master-slave replication.
>
>Thanks,
>Shawn

-- 
Sorry for being brief. Alternate email is rickleir at yahoo dot com 

Re: Poll: Master-Slave or SolrCloud?

2017-04-28 Thread Shawn Heisey
On 4/24/2017 8:58 AM, Otis Gospodnetić wrote:
> I'm really really surprised here.  Back in 2013 we did a poll to see how
> people were running Master-Slave (4.x back then) and SolrCloud was a bit
> more popular than Master-Slave:
> https://sematext.com/blog/2013/02/25/poll-solr-cloud-or-not/
>
> Here is a fresh new poll with pretty much the same question - How do you
> run your Solr?  -
> and guess what?  SolrCloud is *not* at all a lot more prevalent than
> Master-Slave.

I don't use *either* for my primary Solr installs.  My indexes are
distributed, with sharding maintained by my indexing code.  Each copy of
the index is independently updated, rather than relying on Solr features
to replicate it.  There are things I can do with this setup that would
be much more difficult (and maybe impossible) with either SolrCloud or
master-slave replication.
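
The post does not describe the actual code, so the following is only a generic illustration of what "sharding maintained by my indexing code" can look like, not Shawn's implementation: the indexer picks a shard from a stable hash of the document id, and queries list the cores explicitly through the shards parameter of distributed search. The core addresses are hypothetical.

```python
import hashlib

# Hypothetical shard cores, in the host:port/path form the shards parameter takes.
SHARDS = [
    "idx1.example.com:8983/solr/shard0",
    "idx2.example.com:8983/solr/shard1",
    "idx3.example.com:8983/solr/shard2",
]


def shard_for(doc_id, num_shards):
    """Map a document id to a shard index using a hash that is stable across runs."""
    digest = hashlib.md5(doc_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def route(docs, shards):
    """Group documents into one batch per shard."""
    batches = {shard: [] for shard in shards}
    for doc in docs:
        batches[shards[shard_for(doc["id"], len(shards))]].append(doc)
    return batches


def shards_param(shards):
    """Value for the shards= query parameter of a distributed search."""
    return ",".join(shards)


docs = [{"id": f"doc-{i}"} for i in range(10)]
batches = route(docs, SHARDS)
for shard, batch in batches.items():
    print(shard, [d["id"] for d in batch])
print("shards=" + shards_param(SHARDS))
```

Because each copy of the index is fed by the same routing function, a second set of cores can be updated independently without any Solr-level replication.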

Thanks,
Shawn



Re: Poll: Master-Slave or SolrCloud?

2017-04-28 Thread Charlie Hull
Like Sematext, we help clients with both ES and Solr. A particular 
difference is that ES is easier to start with (lots of sensible 
defaults) but then once you have got going (and perchance have thrown 
many millions of items at it) you can run into trouble because you don't 
really understand what's happening under the hood. Solr is still harder 
to get started with but as you're going to have to figure it out anyway 
to get anywhere, you shouldn't end up having to rethink or redo 
everything later.


Charlie

On 28/04/2017 04:51, David Lee wrote:

As someone who moved from ES to Solr, I can say that one of the things
that makes ES so much easier to configure is that the majority of things
that need to be set for a specific environment are all in pretty much
one config file. Also, I didn't have to deal with the "magic stuff" that
many people have talked about where SolrCloud is concerned.

One of the problems is also due to documentation and user blogs that
discuss how to use SolrCloud. They all tell you how to create a config
to run SolrCloud on one system using the -e cloud flag, but then that's
it. They all seem to avoid discussions of what to do from there in terms
of best practices in distributing to other nodes. It's out there, but in
many cases the guides refer to older versions of Solr so sometimes it is
hard to know what versions people are writing about until you try their
solutions and nothing works, so you finally figure out they are talking
about a much older version.

I moved away from ES to Solr because I prefer the openness of Solr and
the community participation but I really haven't been very successful in
deploying this in a production environment at this point.

I'd say the two things I find that I'm battling with the most are the
cloud configuration and the work I'm having to do to get even the most
basic JSON documents indexed correctly (specifically where I need block
joins, etc.).

I'm hopeful that the V2 Api will help with the JSON issue, but it would
be nice to have some documentation that goes more in-depth on how to set
up additional nodes. Also, even though I use ZK for other parts of my
application, I have no problem with a version running specifically for
Solr if it makes this process more straight-forward.

David



On 4/27/2017 2:51 AM, Emir Arnautovic wrote:

I think creating poll for ES ppl with question: "How do you run master
nodes? A) on some data nodes B) dedicated node C) dedicated server"
would give some insight how big issue is having ZK and if hiding ZK
behind Solr would do any good.

Emir


On 25.04.2017 23:13, Otis Gospodnetić wrote:

Hi Erick,

Could one run *only* embedded ZK on some SolrCloud nodes, sans any data?
It would be equivalent of dedicated Elasticsearch nodes, which is the
current ES best practice/recommendation.  I've never heard of anyone
being
scared of running 3 dedicated master ES nodes, so if SolrCloud
offered the
same, perhaps even completely hiding ZK from users, that would
present the
same level of complexity (err, simplicity) ES users love about ES.
Don't
want to talk about SolrCloud vs. ES here at all, just trying to share
observations since we work a lot with both Elasticsearch and
Solr(Cloud) at
Sematext.

Otis
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/


On Tue, Apr 25, 2017 at 4:03 PM, Erick Erickson

wrote:


bq: I read somewhere that you should run your own ZK externally, and
turn off SolrCloud

this is a bit confused. "turn off SolrCloud" has nothing to do with
running ZK internally or externally. SolrCloud requires ZK, whether
internal or external is irrelevant to the term SolrCloud.

On to running an external ZK ensemble. Mostly, that's administratively
by far the safest. If you're running the embedded ZK, then the ZK
instances are tied to your Solr instance. Now if, for any reason, your
Solr nodes hosting ZK go down, you lose ZK quorum, can't index, etc.

Now consider a cluster with, say, 100 Solr nodes. Not talking replicas
in a collection here, I'm talking 100 physical machines. BTW, this is
not even close to the largest ones I'm aware of. Which three (for
example) are running ZK? If I want to upgrade Solr I better make
really sure not to upgrade two of the Solr instances running ZK at once
if I want my cluster to keep going
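
The arithmetic behind this: ZK stays available only while a strict majority of the ensemble is up. A small sketch (the function names are just for illustration):

```python
def quorum_size(ensemble_size):
    """Smallest number of ZK servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many ZK servers can be down while quorum still holds."""
    return ensemble_size - quorum_size(ensemble_size)


def safe_to_restart_together(ensemble_size, restarting):
    """True if taking `restarting` ZK servers down at once keeps quorum."""
    return restarting <= tolerated_failures(ensemble_size)


for n in (3, 5):
    print(n, "servers: quorum", quorum_size(n), "tolerates", tolerated_failures(n))
print("restart 2 of 3 at once:", safe_to_restart_together(3, 2))
```

With three embedded ZK servers, upgrading two of the hosting Solr nodes at the same time leaves one server, which is below the quorum of two.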

And, ZK is sensitive to system resources. So putting ZK on a Solr node
then hosing, say, updates to my Solr cluster can cause ZK to be
starved for resources.

This is one of those deals where _functionally_, it's OK to run
embedded ZK, but administratively it's suspect.

Best,
Erick

On Tue, Apr 25, 2017 at 10:49 AM, Rick Leir  wrote:

All,
I read somewhere that you should run your own ZK externally, and turn

off SolrCloud. Comments please!

Rick

On April 25, 2017 1:33:31 PM EDT, "Otis Gospodnetić" <

otis.gospodne...@gmail.com> wrote:

This is interesting - that 

Re: Poll: Master-Slave or SolrCloud?

2017-04-27 Thread David Lee
As someone who moved from ES to Solr, I can say that one of the things 
that makes ES so much easier to configure is that the majority of things 
that need to be set for a specific environment are all in pretty much 
one config file. Also, I didn't have to deal with the "magic stuff" that 
many people have talked about where SolrCloud is concerned.


One of the problems is also due to documentation and user blogs that 
discuss how to use SolrCloud. They all tell you how to create a config 
to run SolrCloud on one system using the -e cloud flag, but then that's 
it. They all seem to avoid discussions of what to do from there in terms 
of best practices in distributing to other nodes. It's out there, but in 
many cases the guides refer to older versions of Solr so sometimes it is 
hard to know what versions people are writing about until you try their 
solutions and nothing works, so you finally figure out they are talking 
about a much older version.
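
As a rough sketch of the step after -e cloud: each additional node is started in cloud mode and pointed at the same ZK ensemble with bin/solr start -c -z. The hostnames, port and /solr chroot below are assumptions for illustration, and the script only prints the commands rather than running them:

```python
import shlex


def solr_start_command(zk_host, host, port=8983):
    """Return the bin/solr command that starts one SolrCloud node."""
    argv = [
        "bin/solr", "start",
        "-c",              # cloud mode
        "-z", zk_host,     # external ZK ensemble (with optional chroot)
        "-h", host,        # hostname this node registers under
        "-p", str(port),
    ]
    return shlex.join(argv)


# Hypothetical ensemble and Solr hosts.
zk_host = "zk1:2181,zk2:2181,zk3:2181/solr"
solr_hosts = ["solr1.example.com", "solr2.example.com"]
cmds = [solr_start_command(zk_host, h) for h in solr_hosts]
for cmd in cmds:
    print(cmd)
```

Every node given the same zkHost string joins the same cluster, so adding a machine amounts to running the command there with its own hostname.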


I moved away from ES to Solr because I prefer the openness of Solr and 
the community participation but I really haven't been very successful in 
deploying this in a production environment at this point.


I'd say the two things I find that I'm battling with the most are the 
cloud configuration and the work I'm having to do to get even the most 
basic JSON documents indexed correctly (specifically where I need block 
joins, etc.).


I'm hopeful that the V2 Api will help with the JSON issue, but it would 
be nice to have some documentation that goes more in-depth on how to set 
up additional nodes. Also, even though I use ZK for other parts of my 
application, I have no problem with a version running specifically for 
Solr if it makes this process more straight-forward.


David



On 4/27/2017 2:51 AM, Emir Arnautovic wrote:
I think creating poll for ES ppl with question: "How do you run master 
nodes? A) on some data nodes B) dedicated node C) dedicated server" 
would give some insight how big issue is having ZK and if hiding ZK 
behind Solr would do any good.


Emir


On 25.04.2017 23:13, Otis Gospodnetić wrote:

Hi Erick,

Could one run *only* embedded ZK on some SolrCloud nodes, sans any data?
It would be equivalent of dedicated Elasticsearch nodes, which is the
current ES best practice/recommendation.  I've never heard of anyone 
being
scared of running 3 dedicated master ES nodes, so if SolrCloud 
offered the
same, perhaps even completely hiding ZK from users, that would 
present the
same level of complexity (err, simplicity) ES users love about ES.  
Don't

want to talk about SolrCloud vs. ES here at all, just trying to share
observations since we work a lot with both Elasticsearch and 
Solr(Cloud) at

Sematext.

Otis
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/


On Tue, Apr 25, 2017 at 4:03 PM, Erick Erickson 


wrote:


bq: I read somewhere that you should run your own ZK externally, and
turn off SolrCloud

this is a bit confused. "turn off SolrCloud" has nothing to do with
running ZK internally or externally. SolrCloud requires ZK, whether
internal or external is irrelevant to the term SolrCloud.

On to running an external ZK ensemble. Mostly, that's administratively
by far the safest. If you're running the embedded ZK, then the ZK
instances are tied to your Solr instance. Now if, for any reason, your
Solr nodes hosting ZK go down, you lose ZK quorum, can't index, etc.

Now consider a cluster with, say, 100 Solr nodes. Not talking replicas
in a collection here, I'm talking 100 physical machines. BTW, this is
not even close to the largest ones I'm aware of. Which three (for
example) are running ZK? If I want to upgrade Solr I better make
really sure not to upgrade two of the Solr instances running ZK at once
if I want my cluster to keep going

And, ZK is sensitive to system resources. So putting ZK on a Solr node
then hosing, say, updates to my Solr cluster can cause ZK to be
starved for resources.

This is one of those deals where _functionally_, it's OK to run
embedded ZK, but administratively it's suspect.

Best,
Erick

On Tue, Apr 25, 2017 at 10:49 AM, Rick Leir  wrote:

All,
I read somewhere that you should run your own ZK externally, and turn

off SolrCloud. Comments please!

Rick

On April 25, 2017 1:33:31 PM EDT, "Otis Gospodnetić" <

otis.gospodne...@gmail.com> wrote:
This is interesting - that ZK is seen as adding so much complexity 
that

it
turns people off!

If you think about it, Elasticsearch users have no choice -- except
their
"ZK" is built-in, hidden, so one doesn't have to think about it, at
least
not initially.

I think I saw mentions (maybe on user or dev MLs or JIRA) about
potentially, in the future, there only being SolrCloud mode (and
dropping
SolrCloud name in favour of Solr).  If the above comment from Charlie
about
complexity is really true for Solr users, and if that's the reason 
why

Re: Poll: Master-Slave or SolrCloud?

2017-04-27 Thread Emir Arnautovic
I think creating poll for ES ppl with question: "How do you run master 
nodes? A) on some data nodes B) dedicated node C) dedicated server" 
would give some insight how big issue is having ZK and if hiding ZK 
behind Solr would do any good.


Emir


On 25.04.2017 23:13, Otis Gospodnetić wrote:

Hi Erick,

Could one run *only* embedded ZK on some SolrCloud nodes, sans any data?
It would be equivalent of dedicated Elasticsearch nodes, which is the
current ES best practice/recommendation.  I've never heard of anyone being
scared of running 3 dedicated master ES nodes, so if SolrCloud offered the
same, perhaps even completely hiding ZK from users, that would present the
same level of complexity (err, simplicity) ES users love about ES.  Don't
want to talk about SolrCloud vs. ES here at all, just trying to share
observations since we work a lot with both Elasticsearch and Solr(Cloud) at
Sematext.

Otis
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/


On Tue, Apr 25, 2017 at 4:03 PM, Erick Erickson 
wrote:


bq: I read somewhere that you should run your own ZK externally, and
turn off SolrCloud

this is a bit confused. "turn off SolrCloud" has nothing to do with
running ZK internally or externally. SolrCloud requires ZK, whether
internal or external is irrelevant to the term SolrCloud.

On to running an external ZK ensemble. Mostly, that's administratively
by far the safest. If you're running the embedded ZK, then the ZK
instances are tied to your Solr instance. Now if, for any reason, your
Solr nodes hosting ZK go down, you lose ZK quorum, can't index, etc.

Now consider a cluster with, say, 100 Solr nodes. Not talking replicas
in a collection here, I'm talking 100 physical machines. BTW, this is
not even close to the largest ones I'm aware of. Which three (for
example) are running ZK? If I want to upgrade Solr I better make
really sure not to upgrade two of the Solr instances running ZK at once
if I want my cluster to keep going

And, ZK is sensitive to system resources. So putting ZK on a Solr node
then hosing, say, updates to my Solr cluster can cause ZK to be
starved for resources.

This is one of those deals where _functionally_, it's OK to run
embedded ZK, but administratively it's suspect.

Best,
Erick

On Tue, Apr 25, 2017 at 10:49 AM, Rick Leir  wrote:

All,
I read somewhere that you should run your own ZK externally, and turn

off SolrCloud. Comments please!

Rick

On April 25, 2017 1:33:31 PM EDT, "Otis Gospodnetić" <

otis.gospodne...@gmail.com> wrote:

This is interesting - that ZK is seen as adding so much complexity that
it
turns people off!

If you think about it, Elasticsearch users have no choice -- except
their
"ZK" is built-in, hidden, so one doesn't have to think about it, at
least
not initially.

I think I saw mentions (maybe on user or dev MLs or JIRA) about
potentially, in the future, there only being SolrCloud mode (and
dropping
SolrCloud name in favour of Solr).  If the above comment from Charlie
about
complexity is really true for Solr users, and if that's the reason why
we
see so few people running SolrCloud today, perhaps that's a good signal
for
Solr development/priorities in terms of ZK
hiding/automating/embedding/something...

Otis
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/


On Tue, Apr 25, 2017 at 4:50 AM, Charlie Hull 
wrote:


On 24/04/2017 15:58, Otis Gospodnetić wrote:


Hi,

I'm really really surprised here.  Back in 2013 we did a poll to see

how

people were running Master-Slave (4.x back then) and SolrCloud was a

bit

more popular than Master-Slave:
https://sematext.com/blog/2013/02/25/poll-solr-cloud-or-not/

Here is a fresh new poll with pretty much the same question - How do

you

run your Solr?

 -

and guess what?  SolrCloud is *not* at all a lot more prevalent than
Master-Slave.

We definitely see a lot more SolrCloud used by Sematext Solr
consulting/support customers, so I'm a bit surprised by the results

of

this
poll so far.


I'm not particularly surprised. We regularly see clients either with
single nodes or elderly versions of Solr (or even Lucene). Zookeeper

is

still seen as a bit of a black art. Once you move from 'how do I run

a

search engine' to 'how do I manage a cluster of servers with scaling

for

performance/resilience/failover' you're looking at a completely new

set

of skills and challenges, which I think puts many people off.

Charlie


Is anyone else surprised by this?  See https://twitter.com/sematext/
status/854927627748036608

Thanks,
Otis
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training -

http://sematext.com/



Re: Poll: Master-Slave or SolrCloud?

2017-04-26 Thread Erick Erickson
Steve:

You might be interested in:
https://issues.apache.org/jira/browse/SOLR-10233, please comment on
whether that JIRA is along the lines you're thinking.

Best,
Erick

On Wed, Apr 26, 2017 at 6:35 AM, Stephen Weiss  wrote:
> We run both, and we are running the latest versions for both.  There are 
> different use cases for each one.  Where we are using solrcloud, it only has 
> to operate in one datacenter, and sharding is incredibly important because we 
> have billions and billions of documents.  In a separate group of servers, we 
> use master/slave in a cascade that runs through multiple datacenters, has 
> relatively small indexes that don't really need to be sharded, and we need to 
> be able to add and remove servers at a moment's notice, which really is not 
> that simple to do with SolrCloud.
>
> I wouldn't assume that people who are using replication are all stuck in the 
> past and not using the cloud version out of some luddite aversion to software 
> upgrades.  SolrCloud's feature set doesn't allow for everything you can do 
> with replication, just as replication doesn't allow for everything you can do 
> with SolrCloud.  Personally, I would love it if there were some kind of 
> hybrid model (ie, a cloud that could replicate to another cloud), but that 
> doesn't exist.  Even if it did though, I would probably still use vanilla 
> replication in certain contexts.
>
> --
> Steve
>
> On Mon, Apr 24, 2017 at 10:58 PM, Erick Erickson 
> > wrote:
> Otis:
>
> bq: But it doesn't really matter so much whether people are the same or not
>
> I'm going to gently disagree here. I regularly see questions on the
> user's list about upgrading from 4.x or 3.x (!). So if the sample of
> users responding to your poll are substantially the same users as
> responded in 2013, there's no guarantee that they've even upgraded
> Solr, much less thought it worthwhile to change their paradigm.
>
> I suppose an interesting bit of additional data would be "when did you
> start using Solr?". Would there be a greater percentage of responders
> using SolrCloud in 2014 .vs. 2013? 2015 .vs. 2014? and so on.
>
> Mind you I have zero data to support any of this, it's speculation and
> I haven't looked at the poll so maybe I'm off base
>
> Erick
>
> On Mon, Apr 24, 2017 at 7:29 PM, Otis Gospodnetić
> > wrote:
>> Hi,
>>
>> I think it's roughly the same profile of people.  The poll from 2013 was on
>> Sematext blog and the new one is on Sematext Twitter account.  But it
>> doesn't really matter so much whether people are the same or not.  What
>> amazes me is that in 2017 we don't see a lot more SolrCloud users!
>>
>> Otis
>> --
>> Monitoring - Log Management - Alerting - Anomaly Detection
>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>>
>>
>> On Mon, Apr 24, 2017 at 8:04 PM, Erick Erickson 
>> >
>> wrote:
>>
>>> Yeah, this is kind of counter to my expectations too. I guess my
>>> question is whether the same people are responding to the new survey
>>> as the old one. "If it ain't broke" and all that.
>>>
>>> Erick
>>>
>>> On Mon, Apr 24, 2017 at 7:58 AM, Otis Gospodnetić
>>> > wrote:
>>> > Hi,
>>> >
>>> > I'm really really surprised here.  Back in 2013 we did a poll to see how
>>> > people were running Master-Slave (4.x back then) and SolrCloud was a bit
>>> > more popular than Master-Slave:
>>> > https://sematext.com/blog/2013/02/25/poll-solr-cloud-or-not/
>>> >
>>> > Here is a fresh new poll with pretty much the same question - How do you
>>> > run your Solr? 
>>> -
>>> > and guess what?  SolrCloud is *not* at all a lot more prevalent than
>>> > Master-Slave.
>>> >
>>> > We definitely see a lot more SolrCloud used by Sematext Solr
>>> > consulting/support customers, so I'm a bit surprised by the results of
>>> this
>>> > poll so far.
>>> >
>>> > Is anyone else surprised by this?  See https://twitter.com/sematext/
>>> > status/854927627748036608
>>> >
>>> > Thanks,
>>> > Otis
>>> > --
>>> > Monitoring - Log Management - Alerting - Anomaly Detection
>>> > Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>>>
>
>
>

Re: Poll: Master-Slave or SolrCloud?

2017-04-26 Thread Stephen Weiss
We run both, and we are running the latest versions for both.  There are 
different use cases for each one.  Where we are using solrcloud, it only has to 
operate in one datacenter, and sharding is incredibly important because we have 
billions and billions of documents.  In a separate group of servers, we use 
master/slave in a cascade that runs through multiple datacenters, has 
relatively small indexes that don't really need to be sharded, and we need to 
be able to add and remove servers at a moment's notice, which really is not 
that simple to do with SolrCloud.

I wouldn't assume that people who are using replication are all stuck in the 
past and not using the cloud version out of some luddite aversion to software 
upgrades.  SolrCloud's feature set doesn't allow for everything you can do with 
replication, just as replication doesn't allow for everything you can do with 
SolrCloud.  Personally, I would love it if there were some kind of hybrid model 
(ie, a cloud that could replicate to another cloud), but that doesn't exist.  
Even if it did though, I would probably still use vanilla replication in 
certain contexts.

--
Steve

On Mon, Apr 24, 2017 at 10:58 PM, Erick Erickson 
> wrote:
Otis:

bq: But it doesn't really matter so much whether people are the same or not

I'm going to gently disagree here. I regularly see questions on the
user's list about upgrading from 4.x or 3.x (!). So if the sample of
users responding to your poll are substantially the same users as
responded in 2013, there's no guarantee that they've even upgraded
Solr, much less thought it worthwhile to change their paradigm.

I suppose an interesting bit of additional data would be "when did you
start using Solr?". Would there be a greater percentage of responders
using SolrCloud in 2014 .vs. 2013? 2015 .vs. 2014? and so on.

Mind you I have zero data to support any of this, it's speculation and
I haven't looked at the poll so maybe I'm off base

Erick

On Mon, Apr 24, 2017 at 7:29 PM, Otis Gospodnetić
> wrote:
> Hi,
>
> I think it's roughly the same profile of people.  The poll from 2013 was on
> Sematext blog and the new one is on Sematext Twitter account.  But it
> doesn't really matter so much whether people are the same or not.  What
> amazes me that in 2017 we don't see a lot more SolrCloud users!
>
> Otis
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
> On Mon, Apr 24, 2017 at 8:04 PM, Erick Erickson wrote:
>
>> Yeah, this is kind of counter to my expectations too. I guess my
>> question is whether the same people are responding to the new survey
>> as the old one. "If it ain't broke" and all that.
>>
>> Erick
>>
>> On Mon, Apr 24, 2017 at 7:58 AM, Otis Gospodnetić wrote:
>> > Hi,
>> >
>> > I'm really really surprised here.  Back in 2013 we did a poll to see how
>> > people were running Master-Slave (4.x back then) and SolrCloud was a bit
>> > more popular than Master-Slave:
>> > https://sematext.com/blog/2013/02/25/poll-solr-cloud-or-not/
>> >
>> > Here is a fresh new poll with pretty much the same question - How do you
>> > run your Solr? 
>> -
>> > and guess what?  SolrCloud is *not* at all a lot more prevalent than
>> > Master-Slave.
>> >
>> > We definitely see a lot more SolrCloud used by Sematext Solr
>> > consulting/support customers, so I'm a bit surprised by the results of
>> this
>> > poll so far.
>> >
>> > Is anyone else surprised by this?  See https://twitter.com/sematext/
>> > status/854927627748036608
>> >
>> > Thanks,
>> > Otis
>> > --
>> > Monitoring - Log Management - Alerting - Anomaly Detection
>> > Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>>




Re: Poll: Master-Slave or SolrCloud?

2017-04-25 Thread Walter Underwood
1. I never saw the poll.

2. It looks better than the previous poll, which was poorly worded. I couldn’t 
answer “yes” or “no”, really.

Here is what we have in production.

Solr 3: Using every threat I can think of to get the remaining clients off of 
it. It has been shut down in test for months.

Solr 4 master/slave: Main cluster for smallish (under 1M docs) collections with 
daily updates, plus one that needs to move to…

Solr 6 cloud: Hosts one small collection with strong freshness requirements and 
one large collection with very difficult queries. The second is mid-transition 
from the Solr 4 cluster.

There is no reason to go to Solr Cloud for a moderate-size collection with 
daily updates. None. The loose coupling makes scaling out trivial: just spin up 
an exact duplicate of an existing slave. No ADDREPLICA commands or trying to 
understand how core names are mapped to node names and then to host names 
(drives me nuts). Same thing for scaling back: take it out of the load balancer 
and shoot it.
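
For readers who have not used SolrCloud, here is a minimal sketch of the kind
of request that adding a replica involves. ADDREPLICA and its collection,
shard and node parameters belong to Solr's Collections API; the helper name,
base URL, collection name and node name are hypothetical, chosen only for
illustration, and are not taken from this thread.

```python
from urllib.parse import urlencode


def add_replica_url(solr_base, collection, shard, node=None):
    """Build a Collections API URL that asks SolrCloud to add one replica.

    solr_base, collection, shard and node are illustrative inputs; the node
    value follows Solr's "host:port_context" node-name convention.
    """
    params = {
        "action": "ADDREPLICA",
        "collection": collection,
        "shard": shard,
    }
    if node is not None:
        # Optional: pin the new replica to a specific node.
        params["node"] = node
    return f"{solr_base}/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical cluster and collection names.
    print(add_replica_url("http://localhost:8983/solr", "products",
                          "shard1", "host2:8983_solr"))
```

With master/slave there is no equivalent call: the new slave polls the
master on its own, and the load balancer is the only place that has to learn
about it.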

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Apr 25, 2017, at 9:23 AM, Erick Erickson  wrote:
> 
> Maybe the other thing in play here is that use-cases that "just work"
> in the master/slave environment are less likely to employ consultants
> so we get something of a skewed sense of who uses what ;)
> 
> On Tue, Apr 25, 2017 at 1:50 AM, Charlie Hull  wrote:
>> On 24/04/2017 15:58, Otis Gospodnetić wrote:
>>> 
>>> Hi,
>>> 
>>> I'm really really surprised here.  Back in 2013 we did a poll to see how
>>> people were running Master-Slave (4.x back then) and SolrCloud was a bit
>>> more popular than Master-Slave:
>>> https://sematext.com/blog/2013/02/25/poll-solr-cloud-or-not/
>>> 
>>> Here is a fresh new poll with pretty much the same question - How do you
>>> run your Solr?  -
>>> and guess what?  SolrCloud is *not* at all a lot more prevalent than
>>> Master-Slave.
>>> 
>>> We definitely see a lot more SolrCloud used by Sematext Solr
>>> consulting/support customers, so I'm a bit surprised by the results of
>>> this
>>> poll so far.
>> 
>> 
>> I'm not particularly surprised. We regularly see clients either with single
>> nodes or elderly versions of Solr (or even Lucene). Zookeeper is still seen
>> as a bit of a black art. Once you move from 'how do I run a search engine'
>> to 'how do I manage a cluster of servers with scaling for
>> performance/resilience/failover' you're looking at a completely new set of
>> skills and challenges, which I think puts many people off.
>> 
>> Charlie
>>> 
>>> 
>>> Is anyone else surprised by this?  See https://twitter.com/sematext/
>>> status/854927627748036608
>>> 
>>> Thanks,
>>> Otis
>>> --
>>> Monitoring - Log Management - Alerting - Anomaly Detection
>>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>>> 
>>> 
>>> ---
>>> This email has been checked for viruses by AVG.
>>> http://www.avg.com
>>> 
>> 
>> 
>> --
>> Charlie Hull
>> Flax - Open Source Enterprise Search
>> 
>> tel/fax: +44 (0)8700 118334
>> mobile:  +44 (0)7767 825828
>> web: www.flax.co.uk



Re: Poll: Master-Slave or SolrCloud?

2017-04-25 Thread Otis Gospodnetić
Hi Erick,

Could one run *only* embedded ZK on some SolrCloud nodes, sans any data?
It would be equivalent of dedicated Elasticsearch nodes, which is the
current ES best practice/recommendation.  I've never heard of anyone being
scared of running 3 dedicated master ES nodes, so if SolrCloud offered the
same, perhaps even completely hiding ZK from users, that would present the
same level of complexity (err, simplicity) ES users love about ES.  Don't
want to talk about SolrCloud vs. ES here at all, just trying to share
observations since we work a lot with both Elasticsearch and Solr(Cloud) at
Sematext.

Otis
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/


On Tue, Apr 25, 2017 at 4:03 PM, Erick Erickson 
wrote:

> bq: I read somewhere that you should run your own ZK externally, and
> turn off SolrCloud
>
> this is a bit confused. "turn off SolrCloud" has nothing to do with
> running ZK internally or externally. SolrCloud requires ZK, whether
> internal or external is irrelevant to the term SolrCloud.
>
> On to running an external ZK ensemble. Mostly, that's administratively
> by far the safest. If you're running the embedded ZK, then the ZK
> instances are tied to your Solr instance. Now if, for any reason, your
> Solr nodes hosting ZK go down, you lose ZK quorum, can't index.
> etc
>
> Now consider a cluster with, say, 100 Solr nodes. Not talking replicas
> in a collection here, I'm talking 100 physical machines. BTW, this is
> not even close to the largest ones I'm aware of. Which three (for
> example) are running ZK? If I want to upgrade Solr I better make
> really sure not to upgrade to of the Solr instances running ZK at once
> if I want my cluster to keep going
>
> And, ZK is sensitive to system resources. So putting ZK on a Solr node
> then hosing, say, updates to my Solr cluster can cause ZK to be
> starved for resources.
>
> This is one of those deals where _functionally_, it's OK to run
> embedded ZK, but administratively it's suspect.
>
> Best,
> Erick
>
> On Tue, Apr 25, 2017 at 10:49 AM, Rick Leir  wrote:
> > All,
> > I read somewhere that you should run your own ZK externally, and turn
> off SolrCloud. Comments please!
> > Rick
> >
> > On April 25, 2017 1:33:31 PM EDT, "Otis Gospodnetić" <
> otis.gospodne...@gmail.com> wrote:
> >>This is interesting - that ZK is seen as adding so much complexity that
> >>it
> >>turns people off!
> >>
> >>If you think about it, Elasticsearch users have no choice -- except
> >>their
> >>"ZK" is built-in, hidden, so one doesn't have to think about it, at
> >>least
> >>not initially.
> >>
> >>I think I saw mentions (maybe on user or dev MLs or JIRA) about
> >>potentially, in the future, there only being SolrCloud mode (and
> >>dropping
> >>SolrCloud name in favour of Solr).  If the above comment from Charlie
> >>about
> >>complexity is really true for Solr users, and if that's the reason why
> >>we
> >>see so few people running SolrCloud today, perhaps that's a good signal
> >>for
> >>Solr development/priorities in terms of ZK
> >>hiding/automating/embedding/something...
> >>
> >>Otis
> >>--
> >>Monitoring - Log Management - Alerting - Anomaly Detection
> >>Solr & Elasticsearch Consulting Support Training - http://sematext.com/
> >>
> >>
> >>On Tue, Apr 25, 2017 at 4:50 AM, Charlie Hull 
> >>wrote:
> >>
> >>> On 24/04/2017 15:58, Otis Gospodnetić wrote:
> >>>
>  Hi,
> 
>  I'm really really surprised here.  Back in 2013 we did a poll to see
> >>how
>  people were running Master-Slave (4.x back then) and SolrCloud was a
> >>bit
>  more popular than Master-Slave:
>  https://sematext.com/blog/2013/02/25/poll-solr-cloud-or-not/
> 
>  Here is a fresh new poll with pretty much the same question - How do
> >>you
>  run your Solr?
> >> -
>  and guess what?  SolrCloud is *not* at all a lot more prevalent than
>  Master-Slave.
> 
>  We definitely see a lot more SolrCloud used by Sematext Solr
>  consulting/support customers, so I'm a bit surprised by the results
> >>of
>  this
>  poll so far.
> 
> >>>
> >>> I'm not particularly surprised. We regularly see clients either with
> >>> single nodes or elderly versions of Solr (or even Lucene). Zookeeper
> >>is
> >>> still seen as a bit of a black art. Once you move from 'how do I run
> >>a
> >>> search engine' to 'how do I manage a cluster of servers with scaling
> >>for
> >>> performance/resilience/failover' you're looking at a completely new
> >>set
> >>> of skills and challenges, which I think puts many people off.
> >>>
> >>> Charlie
> >>>
> 
>  Is anyone else surprised by this?  See https://twitter.com/sematext/
>  status/854927627748036608
> 
>  Thanks,
>  Otis
>  --
>  Monitoring - Log Management - Alerting - 

Re: Poll: Master-Slave or SolrCloud?

2017-04-25 Thread Erick Erickson
bq: I read somewhere that you should run your own ZK externally, and
turn off SolrCloud

This is a bit confused. "Turn off SolrCloud" has nothing to do with
running ZK internally or externally. SolrCloud requires ZK; whether it's
internal or external is irrelevant to the term SolrCloud.

On to running an external ZK ensemble. Mostly, that's administratively
by far the safest. If you're running the embedded ZK, then the ZK
instances are tied to your Solr instances. Now if, for any reason, your
Solr nodes hosting ZK go down, you lose ZK quorum, can't index,
etc.

Now consider a cluster with, say, 100 Solr nodes. Not talking replicas
in a collection here, I'm talking 100 physical machines. BTW, this is
not even close to the largest ones I'm aware of. Which three (for
example) are running ZK? If I want to upgrade Solr I'd better make
really sure not to upgrade two of the Solr instances running ZK at once
if I want my cluster to keep going.
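
The arithmetic behind that warning can be sketched in a few lines. ZooKeeper
needs a strict majority of the ensemble to be up, so a three-server ensemble
survives one server being down but not two. The function names here are
made up for illustration; only the majority rule comes from ZooKeeper.

```python
def quorum_size(ensemble_size):
    """Smallest number of ZK servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many ZK servers can be down while a quorum still exists."""
    return ensemble_size - quorum_size(ensemble_size)


def has_quorum(ensemble_size, servers_down):
    """True if the servers still running form a majority of the ensemble."""
    return ensemble_size - servers_down >= quorum_size(ensemble_size)


if __name__ == "__main__":
    # Three Solr nodes with embedded ZK: restarting one at a time is fine,
    # restarting two at once loses quorum.
    for down in (1, 2):
        print(f"3-server ensemble, {down} down -> quorum: {has_quorum(3, down)}")
```

So a rolling upgrade has to touch the ZK-hosting Solr nodes one at a time,
which means somebody has to remember which nodes those are.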

And, ZK is sensitive to system resources. So putting ZK on a Solr node
then hosing, say, updates to my Solr cluster can cause ZK to be
starved for resources.

This is one of those deals where _functionally_, it's OK to run
embedded ZK, but administratively it's suspect.

Best,
Erick

On Tue, Apr 25, 2017 at 10:49 AM, Rick Leir  wrote:
> All,
> I read somewhere that you should run your own ZK externally, and turn off 
> SolrCloud. Comments please!
> Rick
>
> On April 25, 2017 1:33:31 PM EDT, "Otis Gospodnetić" wrote:
>>This is interesting - that ZK is seen as adding so much complexity that
>>it
>>turns people off!
>>
>>If you think about it, Elasticsearch users have no choice -- except
>>their
>>"ZK" is built-in, hidden, so one doesn't have to think about it, at
>>least
>>not initially.
>>
>>I think I saw mentions (maybe on user or dev MLs or JIRA) about
>>potentially, in the future, there only being SolrCloud mode (and
>>dropping
>>SolrCloud name in favour of Solr).  If the above comment from Charlie
>>about
>>complexity is really true for Solr users, and if that's the reason why
>>we
>>see so few people running SolrCloud today, perhaps that's a good signal
>>for
>>Solr development/priorities in terms of ZK
>>hiding/automating/embedding/something...
>>
>>Otis
>>--
>>Monitoring - Log Management - Alerting - Anomaly Detection
>>Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>>
>>
>>On Tue, Apr 25, 2017 at 4:50 AM, Charlie Hull 
>>wrote:
>>
>>> On 24/04/2017 15:58, Otis Gospodnetić wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm really really surprised here.  Back in 2013 we did a poll to see how
>>>> people were running Master-Slave (4.x back then) and SolrCloud was a bit
>>>> more popular than Master-Slave:
>>>> https://sematext.com/blog/2013/02/25/poll-solr-cloud-or-not/
>>>>
>>>> Here is a fresh new poll with pretty much the same question - How do you
>>>> run your Solr? -
>>>> and guess what?  SolrCloud is *not* at all a lot more prevalent than
>>>> Master-Slave.
>>>>
>>>> We definitely see a lot more SolrCloud used by Sematext Solr
>>>> consulting/support customers, so I'm a bit surprised by the results of
>>>> this
>>>> poll so far.
>>>>
>>>
>>> I'm not particularly surprised. We regularly see clients either with
>>> single nodes or elderly versions of Solr (or even Lucene). Zookeeper is
>>> still seen as a bit of a black art. Once you move from 'how do I run a
>>> search engine' to 'how do I manage a cluster of servers with scaling for
>>> performance/resilience/failover' you're looking at a completely new set
>>> of skills and challenges, which I think puts many people off.
>>>
>>> Charlie
>>>
>>>>
>>>> Is anyone else surprised by this?  See https://twitter.com/sematext/
>>>> status/854927627748036608
>>>>
>>>> Thanks,
>>>> Otis
>>>> --
>>>> Monitoring - Log Management - Alerting - Anomaly Detection
>>>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>>>>
>>>>
>>>> ---
>>>> This email has been checked for viruses by AVG.
>>>> http://www.avg.com
>>>>
>>>>
>>>
>>> --
>>> Charlie Hull
>>> Flax - Open Source Enterprise Search
>>>
>>> tel/fax: +44 (0)8700 118334
>>> mobile:  +44 (0)7767 825828
>>> web: www.flax.co.uk
>>>
>
> --
> Sorry for being brief. Alternate email is rickleir at yahoo dot com


Re: Poll: Master-Slave or SolrCloud?

2017-04-25 Thread David Hastings
I can definitely attest to this.  The really nice thing about the standard
Solr/Jetty configuration is that it's all there, Lucene+Solr+Jetty, and you
just turn it on and run, and after only minor tweaks to JVM and memory
settings, it's effectively production ready with a reliable master-slave
configuration.  The servers I run do about 30,000+ searches a day, and 95% are
sub-second on massive indexes.  With SolrCloud and ZK, it does work out of
the box, but it is explicitly stated everywhere that it's not supposed to be in
production until you configure and maintain your own external ZK ensemble.
If it was simplified somewhat, I think a lot more people would migrate
over to SolrCloud, but for now I can say we are not going in that direction.

On Tue, Apr 25, 2017 at 1:33 PM, Otis Gospodnetić <
otis.gospodne...@gmail.com> wrote:

> This is interesting - that ZK is seen as adding so much complexity that it
> turns people off!
>
> If you think about it, Elasticsearch users have no choice -- except their
> "ZK" is built-in, hidden, so one doesn't have to think about it, at least
> not initially.
>
> I think I saw mentions (maybe on user or dev MLs or JIRA) about
> potentially, in the future, there only being SolrCloud mode (and dropping
> SolrCloud name in favour of Solr).  If the above comment from Charlie about
> complexity is really true for Solr users, and if that's the reason why we
> see so few people running SolrCloud today, perhaps that's a good signal for
> Solr development/priorities in terms of ZK
> hiding/automating/embedding/something...
>
> Otis
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
> On Tue, Apr 25, 2017 at 4:50 AM, Charlie Hull  wrote:
>
> > On 24/04/2017 15:58, Otis Gospodnetić wrote:
> >
> >> Hi,
> >>
> >> I'm really really surprised here.  Back in 2013 we did a poll to see how
> >> people were running Master-Slave (4.x back then) and SolrCloud was a bit
> >> more popular than Master-Slave:
> >> https://sematext.com/blog/2013/02/25/poll-solr-cloud-or-not/
> >>
> >> Here is a fresh new poll with pretty much the same question - How do you
> >> run your Solr? 
> -
> >> and guess what?  SolrCloud is *not* at all a lot more prevalent than
> >> Master-Slave.
> >>
> >> We definitely see a lot more SolrCloud used by Sematext Solr
> >> consulting/support customers, so I'm a bit surprised by the results of
> >> this
> >> poll so far.
> >>
> >
> > I'm not particularly surprised. We regularly see clients either with
> > single nodes or elderly versions of Solr (or even Lucene). Zookeeper is
> > still seen as a bit of a black art. Once you move from 'how do I run a
> > search engine' to 'how do I manage a cluster of servers with scaling for
> > performance/resilience/failover' you're looking at a completely new set
> > of skills and challenges, which I think puts many people off.
> >
> > Charlie
> >
> >>
> >> Is anyone else surprised by this?  See https://twitter.com/sematext/
> >> status/854927627748036608
> >>
> >> Thanks,
> >> Otis
> >> --
> >> Monitoring - Log Management - Alerting - Anomaly Detection
> >> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
> >>
> >>
> >> ---
> >> This email has been checked for viruses by AVG.
> >> http://www.avg.com
> >>
> >>
> >
> > --
> > Charlie Hull
> > Flax - Open Source Enterprise Search
> >
> > tel/fax: +44 (0)8700 118334
> > mobile:  +44 (0)7767 825828
> > web: www.flax.co.uk
> >
>


Re: Poll: Master-Slave or SolrCloud?

2017-04-25 Thread Rick Leir
All,
I read somewhere that you should run your own ZK externally, and turn off 
SolrCloud. Comments please!
Rick

On April 25, 2017 1:33:31 PM EDT, "Otis Gospodnetić" wrote:
>This is interesting - that ZK is seen as adding so much complexity that
>it
>turns people off!
>
>If you think about it, Elasticsearch users have no choice -- except
>their
>"ZK" is built-in, hidden, so one doesn't have to think about it, at
>least
>not initially.
>
>I think I saw mentions (maybe on user or dev MLs or JIRA) about
>potentially, in the future, there only being SolrCloud mode (and
>dropping
>SolrCloud name in favour of Solr).  If the above comment from Charlie
>about
>complexity is really true for Solr users, and if that's the reason why
>we
>see so few people running SolrCloud today, perhaps that's a good signal
>for
>Solr development/priorities in terms of ZK
>hiding/automating/embedding/something...
>
>Otis
>--
>Monitoring - Log Management - Alerting - Anomaly Detection
>Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
>On Tue, Apr 25, 2017 at 4:50 AM, Charlie Hull 
>wrote:
>
>> On 24/04/2017 15:58, Otis Gospodnetić wrote:
>>
>>> Hi,
>>>
>>> I'm really really surprised here.  Back in 2013 we did a poll to see
>how
>>> people were running Master-Slave (4.x back then) and SolrCloud was a
>bit
>>> more popular than Master-Slave:
>>> https://sematext.com/blog/2013/02/25/poll-solr-cloud-or-not/
>>>
>>> Here is a fresh new poll with pretty much the same question - How do
>you
>>> run your Solr?
> -
>>> and guess what?  SolrCloud is *not* at all a lot more prevalent than
>>> Master-Slave.
>>>
>>> We definitely see a lot more SolrCloud used by Sematext Solr
>>> consulting/support customers, so I'm a bit surprised by the results
>of
>>> this
>>> poll so far.
>>>
>>
>> I'm not particularly surprised. We regularly see clients either with
>> single nodes or elderly versions of Solr (or even Lucene). Zookeeper
>is
>> still seen as a bit of a black art. Once you move from 'how do I run
>a
>> search engine' to 'how do I manage a cluster of servers with scaling
>for
>> performance/resilience/failover' you're looking at a completely new
>set
>> of skills and challenges, which I think puts many people off.
>>
>> Charlie
>>
>>>
>>> Is anyone else surprised by this?  See https://twitter.com/sematext/
>>> status/854927627748036608
>>>
>>> Thanks,
>>> Otis
>>> --
>>> Monitoring - Log Management - Alerting - Anomaly Detection
>>> Solr & Elasticsearch Consulting Support Training -
>http://sematext.com/
>>>
>>>
>>> ---
>>> This email has been checked for viruses by AVG.
>>> http://www.avg.com
>>>
>>>
>>
>> --
>> Charlie Hull
>> Flax - Open Source Enterprise Search
>>
>> tel/fax: +44 (0)8700 118334
>> mobile:  +44 (0)7767 825828
>> web: www.flax.co.uk
>>

-- 
Sorry for being brief. Alternate email is rickleir at yahoo dot com 

Re: Poll: Master-Slave or SolrCloud?

2017-04-25 Thread Otis Gospodnetić
This is interesting - that ZK is seen as adding so much complexity that it
turns people off!

If you think about it, Elasticsearch users have no choice -- except their
"ZK" is built-in, hidden, so one doesn't have to think about it, at least
not initially.

I think I saw mentions (maybe on user or dev MLs or JIRA) about
potentially, in the future, there only being SolrCloud mode (and dropping
SolrCloud name in favour of Solr).  If the above comment from Charlie about
complexity is really true for Solr users, and if that's the reason why we
see so few people running SolrCloud today, perhaps that's a good signal for
Solr development/priorities in terms of ZK
hiding/automating/embedding/something...

Otis
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/


On Tue, Apr 25, 2017 at 4:50 AM, Charlie Hull  wrote:

> On 24/04/2017 15:58, Otis Gospodnetić wrote:
>
>> Hi,
>>
>> I'm really really surprised here.  Back in 2013 we did a poll to see how
>> people were running Master-Slave (4.x back then) and SolrCloud was a bit
>> more popular than Master-Slave:
>> https://sematext.com/blog/2013/02/25/poll-solr-cloud-or-not/
>>
>> Here is a fresh new poll with pretty much the same question - How do you
>> run your Solr?  -
>> and guess what?  SolrCloud is *not* at all a lot more prevalent than
>> Master-Slave.
>>
>> We definitely see a lot more SolrCloud used by Sematext Solr
>> consulting/support customers, so I'm a bit surprised by the results of
>> this
>> poll so far.
>>
>
> I'm not particularly surprised. We regularly see clients either with
> single nodes or elderly versions of Solr (or even Lucene). Zookeeper is
> still seen as a bit of a black art. Once you move from 'how do I run a
> search engine' to 'how do I manage a cluster of servers with scaling for
> performance/resilience/failover' you're looking at a completely new set
> of skills and challenges, which I think puts many people off.
>
> Charlie
>
>>
>> Is anyone else surprised by this?  See https://twitter.com/sematext/
>> status/854927627748036608
>>
>> Thanks,
>> Otis
>> --
>> Monitoring - Log Management - Alerting - Anomaly Detection
>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>>
>>
>> ---
>> This email has been checked for viruses by AVG.
>> http://www.avg.com
>>
>>
>
> --
> Charlie Hull
> Flax - Open Source Enterprise Search
>
> tel/fax: +44 (0)8700 118334
> mobile:  +44 (0)7767 825828
> web: www.flax.co.uk
>


Re: Poll: Master-Slave or SolrCloud?

2017-04-25 Thread Sales

> On Apr 25, 2017, at 11:23 AM, Erick Erickson  wrote:
> 
> Maybe the other thing in play here is that use-cases that "just work"
> in the master/slave environment are less likely to employ consultants
> so we get something of a skewed sense of who uses what ;)
> 

So, there’s a new poll you just started there! That exactly describes us. 

Steve



Re: Poll: Master-Slave or SolrCloud?

2017-04-25 Thread Erick Erickson
Maybe the other thing in play here is that use-cases that "just work"
in the master/slave environment are less likely to employ consultants
so we get something of a skewed sense of who uses what ;)

On Tue, Apr 25, 2017 at 1:50 AM, Charlie Hull  wrote:
> On 24/04/2017 15:58, Otis Gospodnetić wrote:
>>
>> Hi,
>>
>> I'm really really surprised here.  Back in 2013 we did a poll to see how
>> people were running Master-Slave (4.x back then) and SolrCloud was a bit
>> more popular than Master-Slave:
>> https://sematext.com/blog/2013/02/25/poll-solr-cloud-or-not/
>>
>> Here is a fresh new poll with pretty much the same question - How do you
>> run your Solr?  -
>> and guess what?  SolrCloud is *not* at all a lot more prevalent than
>> Master-Slave.
>>
>> We definitely see a lot more SolrCloud used by Sematext Solr
>> consulting/support customers, so I'm a bit surprised by the results of
>> this
>> poll so far.
>
>
> I'm not particularly surprised. We regularly see clients either with single
> nodes or elderly versions of Solr (or even Lucene). Zookeeper is still seen
> as a bit of a black art. Once you move from 'how do I run a search engine'
> to 'how do I manage a cluster of servers with scaling for
> performance/resilience/failover' you're looking at a completely new set of
> skills and challenges, which I think puts many people off.
>
> Charlie
>>
>>
>> Is anyone else surprised by this?  See https://twitter.com/sematext/
>> status/854927627748036608
>>
>> Thanks,
>> Otis
>> --
>> Monitoring - Log Management - Alerting - Anomaly Detection
>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>>
>>
>> ---
>> This email has been checked for viruses by AVG.
>> http://www.avg.com
>>
>
>
> --
> Charlie Hull
> Flax - Open Source Enterprise Search
>
> tel/fax: +44 (0)8700 118334
> mobile:  +44 (0)7767 825828
> web: www.flax.co.uk


Re: Poll: Master-Slave or SolrCloud?

2017-04-25 Thread Charlie Hull

On 24/04/2017 15:58, Otis Gospodnetić wrote:
> Hi,
>
> I'm really really surprised here.  Back in 2013 we did a poll to see how
> people were running Master-Slave (4.x back then) and SolrCloud was a bit
> more popular than Master-Slave:
> https://sematext.com/blog/2013/02/25/poll-solr-cloud-or-not/
>
> Here is a fresh new poll with pretty much the same question - How do you
> run your Solr?  -
> and guess what?  SolrCloud is *not* at all a lot more prevalent than
> Master-Slave.
>
> We definitely see a lot more SolrCloud used by Sematext Solr
> consulting/support customers, so I'm a bit surprised by the results of this
> poll so far.

I'm not particularly surprised. We regularly see clients either with 
single nodes or elderly versions of Solr (or even Lucene). Zookeeper is 
still seen as a bit of a black art. Once you move from 'how do I run a 
search engine' to 'how do I manage a cluster of servers with scaling for 
performance/resilience/failover' you're looking at a completely new set 
of skills and challenges, which I think puts many people off.

Charlie

> Is anyone else surprised by this?  See https://twitter.com/sematext/
> status/854927627748036608
>
> Thanks,
> Otis
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
> ---
> This email has been checked for viruses by AVG.
> http://www.avg.com


--
Charlie Hull
Flax - Open Source Enterprise Search

tel/fax: +44 (0)8700 118334
mobile:  +44 (0)7767 825828
web: www.flax.co.uk


Re: Poll: Master-Slave or SolrCloud?

2017-04-25 Thread Bernd Fehling
Hi,

bq: What amazes me that in 2017 we don't see a lot more SolrCloud users!

Really? SolrCloud is much more complex. All of a sudden you have to
deal with ZooKeeper, which brings a new level of complexity into play
when you only want to have some data stored and searchable.
The ease of a single index with master-slave is gone.

Regards
Bernd


On 25.04.2017 at 04:29, Otis Gospodnetić wrote:
> Hi,
> 
> I think it's roughly the same profile of people.  The poll from 2013 was on
> Sematext blog and the new one is on Sematext Twitter account.  But it
> doesn't really matter so much whether people are the same or not.  What
> amazes me that in 2017 we don't see a lot more SolrCloud users!
> 
> Otis
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
> 
> 
> On Mon, Apr 24, 2017 at 8:04 PM, Erick Erickson 
> wrote:
> 
>> Yeah, this is kind of counter to my expectations too. I guess my
>> question is whether the same people are responding to the new survey
>> as the old one. "If it ain't broke" and all that.
>>
>> Erick
>>
>> On Mon, Apr 24, 2017 at 7:58 AM, Otis Gospodnetić
>>  wrote:
>>> Hi,
>>>
>>> I'm really really surprised here.  Back in 2013 we did a poll to see how
>>> people were running Master-Slave (4.x back then) and SolrCloud was a bit
>>> more popular than Master-Slave:
>>> https://sematext.com/blog/2013/02/25/poll-solr-cloud-or-not/
>>>
>>> Here is a fresh new poll with pretty much the same question - How do you
>>> run your Solr? 
>> -
>>> and guess what?  SolrCloud is *not* at all a lot more prevalent than
>>> Master-Slave.
>>>
>>> We definitely see a lot more SolrCloud used by Sematext Solr
>>> consulting/support customers, so I'm a bit surprised by the results of
>> this
>>> poll so far.
>>>
>>> Is anyone else surprised by this?  See https://twitter.com/sematext/
>>> status/854927627748036608
>>>
>>> Thanks,
>>> Otis
>>> --
>>> Monitoring - Log Management - Alerting - Anomaly Detection
>>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>>
> 


Re: Poll: Master-Slave or SolrCloud?

2017-04-24 Thread Erick Erickson
Otis:

bq: But it doesn't really matter so much whether people are the same or not

I'm going to gently disagree here. I regularly see questions on the
user's list about upgrading from 4.x or 3.x (!). So if the sample of
users responding to your poll are substantially the same users as
responded in 2013, there's no guarantee that they've even upgraded
Solr, much less thought it worthwhile to change their paradigm.

I suppose an interesting bit of additional data would be "when did you
start using Solr?". Would there be a greater percentage of responders
using SolrCloud in 2014 vs. 2013? 2015 vs. 2014? And so on.

Mind you, I have zero data to support any of this; it's speculation, and
I haven't looked at the poll, so maybe I'm off base.

Erick

On Mon, Apr 24, 2017 at 7:29 PM, Otis Gospodnetić wrote:
> Hi,
>
> I think it's roughly the same profile of people.  The poll from 2013 was on
> Sematext blog and the new one is on Sematext Twitter account.  But it
> doesn't really matter so much whether people are the same or not.  What
> amazes me that in 2017 we don't see a lot more SolrCloud users!
>
> Otis
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
> On Mon, Apr 24, 2017 at 8:04 PM, Erick Erickson 
> wrote:
>
>> Yeah, this is kind of counter to my expectations too. I guess my
>> question is whether the same people are responding to the new survey
>> as the old one. "If it ain't broke" and all that.
>>
>> Erick
>>
>> On Mon, Apr 24, 2017 at 7:58 AM, Otis Gospodnetić
>>  wrote:
>> > Hi,
>> >
>> > I'm really really surprised here.  Back in 2013 we did a poll to see how
>> > people were running Master-Slave (4.x back then) and SolrCloud was a bit
>> > more popular than Master-Slave:
>> > https://sematext.com/blog/2013/02/25/poll-solr-cloud-or-not/
>> >
>> > Here is a fresh new poll with pretty much the same question - How do you
>> > run your Solr? 
>> -
>> > and guess what?  SolrCloud is *not* at all a lot more prevalent than
>> > Master-Slave.
>> >
>> > We definitely see a lot more SolrCloud used by Sematext Solr
>> > consulting/support customers, so I'm a bit surprised by the results of this
>> > poll so far.
>> >
>> > Is anyone else surprised by this?  See
>> > https://twitter.com/sematext/status/854927627748036608
>> >
>> > Thanks,
>> > Otis
>> > --
>> > Monitoring - Log Management - Alerting - Anomaly Detection
>> > Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>>


Re: Poll: Master-Slave or SolrCloud?

2017-04-24 Thread Otis Gospodnetić
Hi,

I think it's roughly the same profile of people.  The poll from 2013 was on
the Sematext blog and the new one is on the Sematext Twitter account.  But it
doesn't really matter so much whether people are the same or not.  What
amazes me is that in 2017 we don't see a lot more SolrCloud users!

Otis
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/


On Mon, Apr 24, 2017 at 8:04 PM, Erick Erickson 
wrote:

> Yeah, this is kind of counter to my expectations too. I guess my
> question is whether the same people are responding to the new survey
> as the old one. "If it ain't broke" and all that.
>
> Erick
>
> On Mon, Apr 24, 2017 at 7:58 AM, Otis Gospodnetić
>  wrote:
> > Hi,
> >
> > I'm really really surprised here.  Back in 2013 we did a poll to see how
> > people were running Master-Slave (4.x back then) and SolrCloud was a bit
> > more popular than Master-Slave:
> > https://sematext.com/blog/2013/02/25/poll-solr-cloud-or-not/
> >
> > Here is a fresh new poll with pretty much the same question - How do you
> > run your Solr?  -
> > and guess what?  SolrCloud is *not* at all a lot more prevalent than
> > Master-Slave.
> >
> > We definitely see a lot more SolrCloud used by Sematext Solr
> > consulting/support customers, so I'm a bit surprised by the results of this
> > poll so far.
> >
> > Is anyone else surprised by this?  See
> > https://twitter.com/sematext/status/854927627748036608
> >
> > Thanks,
> > Otis
> > --
> > Monitoring - Log Management - Alerting - Anomaly Detection
> > Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>


Re: Poll: Master-Slave or SolrCloud?

2017-04-24 Thread Erick Erickson
Yeah, this is kind of counter to my expectations too. I guess my
question is whether the same people are responding to the new survey
as the old one. "If it ain't broke" and all that.

Erick

On Mon, Apr 24, 2017 at 7:58 AM, Otis Gospodnetić
 wrote:
> Hi,
>
> I'm really really surprised here.  Back in 2013 we did a poll to see how
> people were running Master-Slave (4.x back then) and SolrCloud was a bit
> more popular than Master-Slave:
> https://sematext.com/blog/2013/02/25/poll-solr-cloud-or-not/
>
> Here is a fresh new poll with pretty much the same question - How do you
> run your Solr?  -
> and guess what?  SolrCloud is *not* at all a lot more prevalent than
> Master-Slave.
>
> We definitely see a lot more SolrCloud used by Sematext Solr
> consulting/support customers, so I'm a bit surprised by the results of this
> poll so far.
>
> Is anyone else surprised by this?  See
> https://twitter.com/sematext/status/854927627748036608
>
> Thanks,
> Otis
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/


Poll: Master-Slave or SolrCloud?

2017-04-24 Thread Otis Gospodnetić
Hi,

I'm really, really surprised here.  Back in 2013 we did a poll to see how
people were running Solr (4.x back then), and SolrCloud was a bit
more popular than Master-Slave:
https://sematext.com/blog/2013/02/25/poll-solr-cloud-or-not/

Here is a fresh new poll with pretty much the same question - How do you
run your Solr?  -
and guess what?  SolrCloud is *not* at all a lot more prevalent than
Master-Slave.

We definitely see a lot more SolrCloud used by Sematext Solr
consulting/support customers, so I'm a bit surprised by the results of this
poll so far.

Is anyone else surprised by this?  See
https://twitter.com/sematext/status/854927627748036608

Thanks,
Otis
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/