Solr 8.7.0 in Cloud mode with Zookeeper 3.4.5 cdh 5.16

2021-01-12 Thread Subhajit Das

Hi,

We are planning to deploy SolrCloud 8.7.0, running in a Kubernetes cluster,
with an external ZooKeeper 3.4.5 (CDH 5.16).
Solr 8.7.0 appears to be built against ZooKeeper 3.6.2. Is there any issue using
ZooKeeper 3.4.5 (CDH 5.16)?

Thanks in advance.

Regards,
Subhajit



Re: [CAUTION] SSL + Solr 8.5.1 in cloud mode + Java 8

2020-12-09 Thread Ritvik Sharma
That code is there, but it does not show up in the running Solr command.

On Wed, 9 Dec 2020 at 23:28, rkrish84  wrote:

> Commented out the SOLR_SSL_CLIENT_KEY_STORE-related code section in the solr.sh
> file to resolve the issue and enable SSL.
>
>
>
> --
> Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>


Re: [CAUTION] SSL + Solr 8.5.1 in cloud mode + Java 8

2020-12-09 Thread rkrish84
Commented out the SOLR_SSL_CLIENT_KEY_STORE-related code section in the solr.sh
file to resolve the issue and enable SSL.
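For context, the section being commented out corresponds to the `SOLR_SSL_CLIENT_*` variables in Solr's startup configuration. A minimal sketch of a `solr.in.sh` SSL block, with placeholder paths and passwords (not taken from this thread), and the client-side overrides left commented out so the internal clients fall back to the server stores:

```shell
# Hypothetical solr.in.sh SSL block; paths and passwords are placeholders.
# The server-side stores are used by Jetty and, while the client-side
# overrides below stay commented out, also by Solr's internal HTTP clients.
SOLR_SSL_ENABLED=true
SOLR_SSL_KEY_STORE=/etc/solr/ssl/solr-keystore.jks
SOLR_SSL_KEY_STORE_PASSWORD=secret
SOLR_SSL_TRUST_STORE=/etc/solr/ssl/solr-truststore.jks
SOLR_SSL_TRUST_STORE_PASSWORD=secret
# Client-side overrides; leaving these unset makes the internal clients
# fall back to the server keystore/truststore above.
#SOLR_SSL_CLIENT_KEY_STORE=/etc/solr/ssl/client-keystore.jks
#SOLR_SSL_CLIENT_KEY_STORE_PASSWORD=secret
#SOLR_SSL_CLIENT_TRUST_STORE=/etc/solr/ssl/client-truststore.jks
#SOLR_SSL_CLIENT_TRUST_STORE_PASSWORD=secret
```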





SSL + Solr 8.5.1 in cloud mode + Java 8

2020-07-15 Thread Natarajan, Rajeswari

Re: [CAUTION] SSL + Solr 8.5.1 in cloud mode + Java 8

2020-07-15 Thread Kevin Risden

SSL + Solr 8.5.1 in cloud mode + Java 8

2020-07-15 Thread Natarajan, Rajeswari

SSL + Solr 8.5.1 in cloud mode + Java 8

2020-07-15 Thread Natarajan, Rajeswari

Re: [CAUTION] Re: [CAUTION] SSL + Solr 8.5.1 in cloud mode + Java 8

2020-07-15 Thread Natarajan, Rajeswari

Re: [CAUTION] SSL + Solr 8.5.1 in cloud mode + Java 8

2020-07-15 Thread Natarajan, Rajeswari
#8: ObjectId: 2.5.29.14 Criticality=false
SubjectKeyIdentifier [
KeyIdentifier [
0000: 3F 9D 3D 24 48 1E 61 3C   BD C0 A4 07 8B 64 51 0D  ?.=$H.a<.dQ.
0010: A2 B2 FE 89
]
]

Certificate[2]:
Owner: CN=SAP Ariba Cobalt Sidecar Intermediate CA, OU=COBALT, O=SAP Ariba, ST=CA, C=US
Issuer: CN=SAP Ariba Cobalt CA, OU=ES, O=SAP Ariba, L=Palo Alto, ST=CA, C=US
Serial number: 1001
Valid from: Thu Apr 16 07:18:55 GMT 2020 until: Sun Apr 14 07:18:55 GMT 2030
Certificate fingerprints:
 MD5:  FA:70:2F:DB:63:36:66:71:A6:7B:0F:46:F3:52:0B:3C
 SHA1: 4F:27:D3:E3:12:24:64:18:B5:97:D0:BF:94:37:2D:5C:33:EA:1E:40
 SHA256: 15:28:F4:DB:B3:D5:2E:21:6A:2E:56:47:E3:6B:D3:16:96:18:06:96:DA:5D:28:6B:34:CB:6D:FA:E8:FA:85:13
Signature algorithm name: SHA256withRSA
Subject Public Key Algorithm: 4096-bit RSA key
Version: 3

Extensions:

#1: ObjectId: 2.5.29.35 Criticality=false
AuthorityKeyIdentifier [
KeyIdentifier [
0000: D8 A1 D1 11 50 8C 1C 2A   67 69 82 40 DF B5 68 6A  P..*g...@..hj
0010: E4 97 6E 32                                        ..n2
]
]

#2: ObjectId: 2.5.29.19 Criticality=true
BasicConstraints:[
  CA:true
  PathLen:0
]

#3: ObjectId: 2.5.29.15 Criticality=true
KeyUsage [
  DigitalSignature
  Key_CertSign
  Crl_Sign
]

#4: ObjectId: 2.5.29.14 Criticality=false
SubjectKeyIdentifier [
KeyIdentifier [
0000: E9 5C 42 72 5E 70 D9 02   05 AA 11 BA 0D 4D 8D 0D  .\Br^p...M..
0010: F3 37 2C 95                                        .7,.
]
]

Thanks,
Rajeswari

Re: [CAUTION] SSL + Solr 8.5.1 in cloud mode + Java 8

2020-07-14 Thread Kevin Risden

Re: [CAUTION] SSL + Solr 8.5.1 in cloud mode + Java 8

2020-07-13 Thread Natarajan, Rajeswari
Re-sending to see if anyone has had this combination and encountered this issue. In local testing, with just one certificate and one domain name, the SSL communication worked. With multiple DNS names and two certificates, SSL fails with the below exception. The JIRA below says it is fixed for Http2SolrClient; wondering if this is also fixed for the http1 Solr client, as we pass -Dsolr.http1=true.

Thanks,
Rajeswari

https://issues.apache.org/jira/browse/SOLR-14105

On 7/6/20, 10:02 PM, "Natarajan, Rajeswari" <rajeswari.natara...@sap.com> wrote:

Hi,

We are using Solr 8.5.1 in cloud mode with Java 8. We are enabling TLS with http1 (as we get a warning that Java 8 + Solr 8.5 SSL can't be enabled) and we get the below exception:

2020-07-07 03:58:53.078 ERROR (main) [   ] o.a.s.c.SolrCore null:org.apache.solr.common.SolrException: Error instantiating shardHandlerFactory class [HttpShardHandlerFactory]: java.lang.UnsupportedOperationException: X509ExtendedKeyManager only supported on Server
  at org.apache.solr.handler.component.ShardHandlerFactory.newInstance(ShardHandlerFactory.java:56)
  at org.apache.solr.core.CoreContainer.load(CoreContainer.java:647)
  at org.apache.solr.servlet.SolrDispatchFilter.createCoreContainer(SolrDispatchFilter.java:263)
  at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:183)
  at org.eclipse.jetty.servlet.FilterHolder.initialize(FilterHolder.java:134)
  at org.eclipse.jetty.servlet.ServletHandler.lambda$initialize$0(ServletHandler.java:751)
  at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
  at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:742)
  at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:742)
  at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
  at org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:744)
  at org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:360)
  at org.eclipse.jetty.webapp.WebAppContext.startWebapp(WebAppContext.java:1445)
  at org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1409)
  at org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:822)
  at org.eclipse.jetty.servlet.ServletContextHandler.doStart(ServletContextHandler.java:275)
  at org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:524)
  at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
  at org.eclipse.jetty.deploy.bindings.StandardStarter.processBinding(StandardStarter.java:46)
  at org.eclipse.jetty.deploy.AppLifeCycle.runBindings(AppLifeCycle.java:188)
  at org.eclipse.jetty.deploy.DeploymentManager.requestAppGoal(DeploymentManager.java:513)
  at org.eclipse.jetty.deploy.DeploymentManager.addApp(DeploymentManager.java:154)
  at org.eclipse.jetty.deploy.providers.ScanningAppProvider.fileAdded(ScanningAppProvider.java:173)
  at org.eclipse.jetty.deploy.providers.WebAppProvider.fileAdded(WebAppProvider.java:447)
  at org.eclipse.jetty.deploy.providers.ScanningAppProvider$1.fileAdded(ScanningAppProvider.java:66)
  at org.eclipse.jetty.util.Scanner.reportAddition(Scanner.java:784)
  at org.eclipse.jetty.util.Scanner.reportDifferences(Scanner.java:753)
  at org.eclipse.jetty.util.Scanner.scan(Scanner.java:641)
  at org.eclipse.jetty.util.Scanner.doStart(Scanner.java:540)
  at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
  at org.eclipse.jetty.deploy.providers.ScanningAppProvider.doStart(ScanningAppProvider.java:146)
  at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
  at org.eclipse.jetty.deploy.DeploymentManager.startAppProvider(DeploymentManager.java:599)
  at org.eclipse.jetty.deploy.DeploymentManager.doStart(DeploymentManag
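For reference, the -Dsolr.http1=true flag mentioned above is typically passed through SOLR_OPTS in solr.in.sh. A minimal sketch (the exact layout of a given deployment's config may differ):

```shell
# Append the system property to SOLR_OPTS so the node's internal Solr
# clients use HTTP/1.1 instead of HTTP/2 (relevant on Java 8, where Solr
# warns that SSL over HTTP/2 is problematic).
SOLR_OPTS="$SOLR_OPTS -Dsolr.http1=true"
echo "$SOLR_OPTS"
```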

Re: [CAUTION] SSL + Solr 8.5.1 in cloud mode + Java 8

2020-07-13 Thread Kevin Risden
>
> In local, with just one certificate and one domain name, the SSL communication
> worked. With multiple DNS names and two certificates, SSL fails with the below exception.
>

A client keystore by definition can only have a single certificate; a server
keystore can have multiple certificates. The reason is that a client can only
be identified by a single certificate.

Can you share more details about what your solr.in.sh configs look like
related to the keystore/truststore, and which files are involved? Specifically,
highlight which files have multiple certificates in them.

It looks like the client keystore for the Solr internal HTTP client has more
than one certificate in it, and the error is correct. This check is stricter in
recent versions of Jetty 9.4.x. Previously this would silently fail but was
still incorrect; now the error is bubbled up so that there are no silent
misconfigurations.

Kevin Risden



Re: [CAUTION] SSL + Solr 8.5.1 in cloud mode + Java 8

2020-07-13 Thread Natarajan, Rajeswari
I looked at the patch mentioned in the JIRA below (
https://issues.apache.org/jira/browse/SOLR-14105), which reports this issue. I 
looked at the Solr 8.5.1 code base and see the patch is applied, but we are still 
seeing the same exception with a different stack trace. The initial exception 
stack trace was at 

at 
org.eclipse.jetty.util.ssl.SslContextFactory.doStart(SslContextFactory.java:245)


Now the exception we encounter is at HttpSolrClient creation:


Caused by: java.lang.RuntimeException: 
java.lang.UnsupportedOperationException: X509ExtendedKeyManager only supported 
on Server
  at 
org.apache.solr.client.solrj.impl.Http2SolrClient.createHttpClient(Http2SolrClient.java:223)

I commented on the JIRA as well. Let me know if this is still an issue.

Thanks,
Rajeswari

On 7/13/20, 2:03 AM, "Natarajan, Rajeswari"  
wrote:

Re-sending to see if anyone has had this combination and encountered this 
issue. Locally, with just one certificate and one domain name, the SSL 
communication worked. With multiple DNS names and two certificates, SSL fails 
with the exception below. The JIRA below says it is fixed for Http2SolrClient; 
wondering if this is also fixed for the http1 Solr client, as we pass -Dsolr.http1=true.

Thanks,
Rajeswari

https://issues.apache.org/jira/browse/SOLR-14105

On 7/6/20, 10:02 PM, "Natarajan, Rajeswari"  
wrote:

Hi,

    We are using Solr 8.5.1 in cloud mode with Java 8. We are enabling TLS 
with http1 (as we get a warning that SSL over HTTP/2 can't be enabled with 
Java 8 + Solr 8.5), and we get the exception below:



2020-07-07 03:58:53.078 ERROR (main) [   ] o.a.s.c.SolrCore 
null:org.apache.solr.common.SolrException: Error instantiating 
shardHandlerFactory class [HttpShardHandlerFactory]: 
java.lang.UnsupportedOperationException: X509ExtendedKeyManager only supported 
on Server
  at 
org.apache.solr.handler.component.ShardHandlerFactory.newInstance(ShardHandlerFactory.java:56)
  at org.apache.solr.core.CoreContainer.load(CoreContainer.java:647)
  at 
org.apache.solr.servlet.SolrDispatchFilter.createCoreContainer(SolrDispatchFilter.java:263)
  at 
org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:183)
  at 
org.eclipse.jetty.servlet.FilterHolder.initialize(FilterHolder.java:134)
  at 
org.eclipse.jetty.servlet.ServletHandler.lambda$initialize$0(ServletHandler.java:751)
  at 
java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
  at 
java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:742)
  at 
java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:742)
  at 
java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
  at 
org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:744)
  at 
org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:360)
  at 
org.eclipse.jetty.webapp.WebAppContext.startWebapp(WebAppContext.java:1445)
  at 
org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1409)
  at 
org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:822)
  at 
org.eclipse.jetty.servlet.ServletContextHandler.doStart(ServletContextHandler.java:275)
  at 
org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:524)
  at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
  at 
org.eclipse.jetty.deploy.bindings.StandardStarter.processBinding(StandardStarter.java:46)
  at 
org.eclipse.jetty.deploy.AppLifeCycle.runBindings(AppLifeCycle.java:188)
  at 
org.eclipse.jetty.deploy.DeploymentManager.requestAppGoal(DeploymentManager.java:513)
  at 
org.eclipse.jetty.deploy.DeploymentManager.addApp(DeploymentManager.java:154)
  at 
org.eclipse.jetty.deploy.providers.ScanningAppProvider.fileAdded(ScanningAppProvider.java:173)
  at 
org.eclipse.jetty.deploy.providers.WebAppProvider.fileAdded(WebAppProvider.java:447)
  at 
org.eclipse.jetty.deploy.providers.ScanningAppProvider$1.fileAdded(ScanningAppProvider.java:66)
  at org.eclipse.jetty.util.Scanner.reportAddition(Scanner.java:784)
  at 
org.eclipse.jetty.util.Scanner.reportDifferences(Scanner.java:753)
  at org.eclipse.jetty.util.Scanner.scan(Scanner.java:641)
  at org.eclipse.jetty.util.Scanner.doStart(Scanner.java:540)
  at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
  at 
org.eclipse.jetty.deploy.providers.ScanningAppProvider.doStart(ScanningAppProvider.java:146)
  at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(Abstr

SSL + Solr 8.5.1 in cloud mode + Java 8

2020-07-13 Thread Natarajan, Rajeswari
Re-sending to see if anyone has had this combination and encountered this 
issue. Locally, with just one certificate and one domain name, the SSL 
communication worked. With multiple DNS names and two certificates, SSL fails 
with the exception below. The JIRA below says it is fixed for Http2SolrClient; 
wondering if this is also fixed for the http1 Solr client, as we pass -Dsolr.http1=true.

Thanks,
Rajeswari

https://issues.apache.org/jira/browse/SOLR-14105

On 7/6/20, 10:02 PM, "Natarajan, Rajeswari"  
wrote:

Hi,

We are using Solr 8.5.1 in cloud mode with Java 8. We are enabling TLS 
with http1 (as we get a warning that SSL over HTTP/2 can't be enabled with 
Java 8 + Solr 8.5), and we get the exception below:



2020-07-07 03:58:53.078 ERROR (main) [   ] o.a.s.c.SolrCore 
null:org.apache.solr.common.SolrException: Error instantiating 
shardHandlerFactory class [HttpShardHandlerFactory]: 
java.lang.UnsupportedOperationException: X509ExtendedKeyManager only supported 
on Server
  at 
org.apache.solr.handler.component.ShardHandlerFactory.newInstance(ShardHandlerFactory.java:56)
  at org.apache.solr.core.CoreContainer.load(CoreContainer.java:647)
  at 
org.apache.solr.servlet.SolrDispatchFilter.createCoreContainer(SolrDispatchFilter.java:263)
  at 
org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:183)
  at 
org.eclipse.jetty.servlet.FilterHolder.initialize(FilterHolder.java:134)
  at 
org.eclipse.jetty.servlet.ServletHandler.lambda$initialize$0(ServletHandler.java:751)
  at 
java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
  at 
java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:742)
  at 
java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:742)
  at 
java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
  at 
org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:744)
  at 
org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:360)
  at 
org.eclipse.jetty.webapp.WebAppContext.startWebapp(WebAppContext.java:1445)
  at 
org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1409)
  at 
org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:822)
  at 
org.eclipse.jetty.servlet.ServletContextHandler.doStart(ServletContextHandler.java:275)
  at 
org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:524)
  at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
  at 
org.eclipse.jetty.deploy.bindings.StandardStarter.processBinding(StandardStarter.java:46)
  at 
org.eclipse.jetty.deploy.AppLifeCycle.runBindings(AppLifeCycle.java:188)
  at 
org.eclipse.jetty.deploy.DeploymentManager.requestAppGoal(DeploymentManager.java:513)
  at 
org.eclipse.jetty.deploy.DeploymentManager.addApp(DeploymentManager.java:154)
  at 
org.eclipse.jetty.deploy.providers.ScanningAppProvider.fileAdded(ScanningAppProvider.java:173)
  at 
org.eclipse.jetty.deploy.providers.WebAppProvider.fileAdded(WebAppProvider.java:447)
  at 
org.eclipse.jetty.deploy.providers.ScanningAppProvider$1.fileAdded(ScanningAppProvider.java:66)
  at org.eclipse.jetty.util.Scanner.reportAddition(Scanner.java:784)
  at org.eclipse.jetty.util.Scanner.reportDifferences(Scanner.java:753)
  at org.eclipse.jetty.util.Scanner.scan(Scanner.java:641)
  at org.eclipse.jetty.util.Scanner.doStart(Scanner.java:540)
  at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
  at 
org.eclipse.jetty.deploy.providers.ScanningAppProvider.doStart(ScanningAppProvider.java:146)
  at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
  at 
org.eclipse.jetty.deploy.DeploymentManager.startAppProvider(DeploymentManager.java:599)
  at 
org.eclipse.jetty.deploy.DeploymentManager.doStart(DeploymentManager.java:249)
  at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
  at 
org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169)
  at org.eclipse.jetty.server.Server.start(Server.java:407)
  at 
org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:117)
  at 
org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:100)
  at org.eclipse.jetty.server.Server.doStart(Server.java:371)
  at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
  at 
org.eclipse.jetty.xml.XmlConfiguration.lambda$main$0(XmlConfiguration.java:1888)
  at java.security.AccessController.doPrivileged(Nat

SSL + Solr 8.5.1 in cloud mode + Java 8

2020-07-06 Thread Natarajan, Rajeswari
Hi,

We are using Solr 8.5.1 in cloud mode with Java 8. We are enabling TLS with 
http1 (as we get a warning that SSL over HTTP/2 can't be enabled with Java 8 + 
Solr 8.5), and we get the exception below:



2020-07-07 03:58:53.078 ERROR (main) [   ] o.a.s.c.SolrCore 
null:org.apache.solr.common.SolrException: Error instantiating 
shardHandlerFactory class [HttpShardHandlerFactory]: 
java.lang.UnsupportedOperationException: X509ExtendedKeyManager only supported 
on Server
  at 
org.apache.solr.handler.component.ShardHandlerFactory.newInstance(ShardHandlerFactory.java:56)
  at org.apache.solr.core.CoreContainer.load(CoreContainer.java:647)
  at 
org.apache.solr.servlet.SolrDispatchFilter.createCoreContainer(SolrDispatchFilter.java:263)
  at 
org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:183)
  at 
org.eclipse.jetty.servlet.FilterHolder.initialize(FilterHolder.java:134)
  at 
org.eclipse.jetty.servlet.ServletHandler.lambda$initialize$0(ServletHandler.java:751)
  at 
java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
  at 
java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:742)
  at 
java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:742)
  at 
java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
  at 
org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:744)
  at 
org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:360)
  at 
org.eclipse.jetty.webapp.WebAppContext.startWebapp(WebAppContext.java:1445)
  at 
org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1409)
  at 
org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:822)
  at 
org.eclipse.jetty.servlet.ServletContextHandler.doStart(ServletContextHandler.java:275)
  at org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:524)
  at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
  at 
org.eclipse.jetty.deploy.bindings.StandardStarter.processBinding(StandardStarter.java:46)
  at 
org.eclipse.jetty.deploy.AppLifeCycle.runBindings(AppLifeCycle.java:188)
  at 
org.eclipse.jetty.deploy.DeploymentManager.requestAppGoal(DeploymentManager.java:513)
  at 
org.eclipse.jetty.deploy.DeploymentManager.addApp(DeploymentManager.java:154)
  at 
org.eclipse.jetty.deploy.providers.ScanningAppProvider.fileAdded(ScanningAppProvider.java:173)
  at 
org.eclipse.jetty.deploy.providers.WebAppProvider.fileAdded(WebAppProvider.java:447)
  at 
org.eclipse.jetty.deploy.providers.ScanningAppProvider$1.fileAdded(ScanningAppProvider.java:66)
  at org.eclipse.jetty.util.Scanner.reportAddition(Scanner.java:784)
  at org.eclipse.jetty.util.Scanner.reportDifferences(Scanner.java:753)
  at org.eclipse.jetty.util.Scanner.scan(Scanner.java:641)
  at org.eclipse.jetty.util.Scanner.doStart(Scanner.java:540)
  at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
  at 
org.eclipse.jetty.deploy.providers.ScanningAppProvider.doStart(ScanningAppProvider.java:146)
  at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
  at 
org.eclipse.jetty.deploy.DeploymentManager.startAppProvider(DeploymentManager.java:599)
  at 
org.eclipse.jetty.deploy.DeploymentManager.doStart(DeploymentManager.java:249)
  at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
  at 
org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169)
  at org.eclipse.jetty.server.Server.start(Server.java:407)
  at 
org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:117)
  at 
org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:100)
  at org.eclipse.jetty.server.Server.doStart(Server.java:371)
  at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
  at 
org.eclipse.jetty.xml.XmlConfiguration.lambda$main$0(XmlConfiguration.java:1888)
  at java.security.AccessController.doPrivileged(Native Method)
  at org.eclipse.jetty.xml.XmlConfiguration.main(XmlConfiguration.java:1837)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at org.eclipse.jetty.start.Main.invokeMain(Main.java:218)
  at org.eclipse.jetty.start.Main.start(Main.java:491)
  at org.eclipse.jetty.start.Main.main(Main.java:77)
Caused by: java.lang.RuntimeException: java.lang.UnsupportedOperationException: 
X509ExtendedKeyManager only supported on Server
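
A hedged sketch of the SSL settings at play here, using the standard `solr.in.sh` variable names; the paths and passwords are placeholders, not the poster's actual config. As discussed in this thread, recent Jetty 9.4.x versions enforce that a *client* keystore holds exactly one certificate, and one workaround mentioned above is commenting out the client keystore lines:

```shell
# Hypothetical solr.in.sh fragment (paths/passwords are placeholders).
SOLR_SSL_ENABLED=true
SOLR_SSL_KEY_STORE=/etc/solr/ssl/solr-keystore.p12      # server keystore: may hold multiple certs
SOLR_SSL_KEY_STORE_PASSWORD=secret
SOLR_SSL_TRUST_STORE=/etc/solr/ssl/solr-truststore.p12
SOLR_SSL_TRUST_STORE_PASSWORD=secret

# The client keystore must contain exactly ONE certificate; with more,
# Jetty raises "X509ExtendedKeyManager only supported on Server".
# The workaround reported in this thread is to comment these out:
#SOLR_SSL_CLIENT_KEY_STORE=/etc/solr/ssl/solr-client-keystore.p12
#SOLR_SSL_CLIENT_KEY_STORE_PASSWORD=secret

# Check how many key entries a keystore holds (keytool ships with the JDK):
keytool -list -keystore /etc/solr/ssl/solr-client-keystore.p12 \
  -storetype PKCS12 -storepass secret | grep -c PrivateKeyEntry
```

If the `grep -c` count is greater than 1, that client keystore would trip the strict check described above.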

Re: From solr to solr cloud

2019-12-06 Thread Erick Erickson
Because you use individual collections, you really don’t have to care 
about getting it all right up front.

Each collection can be created on a specified set of nodes; see the “createNodeSet”
parameter of the Collections API “CREATE” command. 

And let’s say you figure out later that you need more hardware and want to move
some of your existing collections to new hardware. Use the MOVEREPLICA API
command.

So say you start out with 1 machine hosting 500 collections.
You get more and more and more clients and your machine gets overloaded. Or
one of your collections grows disproportionately to the others. You spin up a
new machine and use MOVEREPLICA to move some number of replicas from your
original machine to the new hardware.

Also consider that at some point, it may be desirable to have multiple “pods”.
By that I mean it can get awkward to have thousands of collections hosted on
a single Zookeeper ensemble. Again, because you have individual collections
you can declare one “pod” (Zookeeper + Solr nodes) full and spin up
another one, i.e. totally separate hardware, separate ZK ensembles. The pods
don’t know about each other at all.

Best,
Erick

> On Dec 6, 2019, at 3:12 AM, Vignan Malyala  wrote:
> 
> Hi Shawn,
> 
> Thanks for your response!
> 
> Yes! 500 collections.
> Each collection/core has around 50k to 50L documents/jsons (depending upon
> the client). We made one core for each client. Each json has 15 fields.
> It already in production as as Solr stand alone server.
> We want to use SolrCloud for it now, so as to make it scalable for future
> safety. How do I make it possible?
> 
> As per your response, I understood that, I have to create 3 zookeeper
> instances and some machines that house 1 solr node each.
> Is that the optimized solution? *And how many machines do I need to build
> to house solr nodes keeping in mind 500 collections?*
> 
> Thanks in advance!
> 
> On Fri, Dec 6, 2019 at 11:44 AM Shawn Heisey  wrote:
> 
>> On 12/5/2019 12:28 PM, Vignan Malyala wrote:
>>> I currently have 500 collections in my stand alone solr. Bcoz of day by
>> day
>>> increase in Data, I want to convert it into solr cloud.
>>> Can you suggest me how to do it successfully.
>>> How many shards should be there?
>>> How many nodes should be there?
>>> Are so called nodes different machines i should take?
>>> How many zoo keeper nodes should be there?
>>> Are so called zoo keeper nodes different machines i should take?
>>> Total how many machines i have to take to implement scalable solr cloud?
>> 
>> 500 collections is large enough that running it in SolrCloud is likely
>> to encounter scalability issues.  SolrCloud's design does not do well
>> with that many collections in the cluster, even if there are a lot of
>> machines.
>> 
>> There's a lot of comment history on this issue:
>> 
>> https://issues.apache.org/jira/browse/SOLR-7191
>> 
>> Generally speaking, each machine should only house one Solr node,
>> whether you're running cloud or not.  If each one requires a really huge
>> heap, it might be worthwhile to split it, but that's the only time I
>> would do so.  And I would generally prefer to add more machines than to
>> run multiple Solr nodes on one machine.
>> 
>> One thing you might do, if the way your data is divided will permit it,
>> is to run multiple SolrCloud clusters.  Multiple clusters can all use
>> one ZooKeeper ensemble.
>> 
>> ZooKeeper requires a minimum of three machines for fault tolerance.
>> With 3 or 4 machines in the ensemble, you can survive one machine
>> failure.  To survive two failures requires at least 5 machines.
>> 
>> Thanks,
>> Shawn
>> 
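
Erick's createNodeSet / MOVEREPLICA suggestions above map onto Collections API calls roughly like these. This is a sketch: the host names, collection name, and config name are made up for illustration.

```shell
# Pin a new one-shard collection to a specific node via createNodeSet:
curl "http://host1:8983/solr/admin/collections?action=CREATE&name=client42&numShards=1&replicationFactor=1&createNodeSet=host1:8983_solr&collection.configName=_default"

# Later, when host1 gets overloaded, shift a replica to new hardware:
curl "http://host1:8983/solr/admin/collections?action=MOVEREPLICA&collection=client42&shard=shard1&sourceNode=host1:8983_solr&targetNode=host2:8983_solr"
```

MOVEREPLICA accepts either a specific `replica` id or a `shard` plus `sourceNode` pair, as used here.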



Re: From solr to solr cloud

2019-12-06 Thread Vignan Malyala
Hi Shawn,

Thanks for your response!

Yes! 500 collections.
Each collection/core has around 50k to 50L (lakh, i.e. 5 million) documents/JSONs
(depending upon the client). We made one core for each client. Each JSON has 15
fields. It is already in production as a standalone Solr server.
We want to use SolrCloud for it now, so as to make it scalable for the future.
How do I make it possible?

As per your response, I understand that I have to create 3 ZooKeeper
instances and some machines that each house one Solr node.
Is that the optimal solution? *And how many machines do I need
to house Solr nodes, keeping in mind 500 collections?*

Thanks in advance!

On Fri, Dec 6, 2019 at 11:44 AM Shawn Heisey  wrote:

> On 12/5/2019 12:28 PM, Vignan Malyala wrote:
> > I currently have 500 collections in my stand alone solr. Bcoz of day by
> day
> > increase in Data, I want to convert it into solr cloud.
> > Can you suggest me how to do it successfully.
> > How many shards should be there?
> > How many nodes should be there?
> > Are so called nodes different machines i should take?
> > How many zoo keeper nodes should be there?
> > Are so called zoo keeper nodes different machines i should take?
> > Total how many machines i have to take to implement scalable solr cloud?
>
> 500 collections is large enough that running it in SolrCloud is likely
> to encounter scalability issues.  SolrCloud's design does not do well
> with that many collections in the cluster, even if there are a lot of
> machines.
>
> There's a lot of comment history on this issue:
>
> https://issues.apache.org/jira/browse/SOLR-7191
>
> Generally speaking, each machine should only house one Solr node,
> whether you're running cloud or not.  If each one requires a really huge
> heap, it might be worthwhile to split it, but that's the only time I
> would do so.  And I would generally prefer to add more machines than to
> run multiple Solr nodes on one machine.
>
> One thing you might do, if the way your data is divided will permit it,
> is to run multiple SolrCloud clusters.  Multiple clusters can all use
> one ZooKeeper ensemble.
>
> ZooKeeper requires a minimum of three machines for fault tolerance.
> With 3 or 4 machines in the ensemble, you can survive one machine
> failure.  To survive two failures requires at least 5 machines.
>
> Thanks,
> Shawn
>


Re: From solr to solr cloud

2019-12-06 Thread Vignan Malyala
Yes! 500 collections.
Each collection/core has around 50k to 50L (lakh, i.e. 5 million) documents/JSONs
(depending upon the client). We made one core for each client. Each JSON has 15
fields. It is already in production as a standalone Solr server.

We want to use SolrCloud for it now, so as to make it scalable for the future.
How do I make it possible (obviously with minimum cost)?

On Fri, Dec 6, 2019 at 11:14 AM Paras Lehana 
wrote:

> Do you mean 500 cores? Tell us about the data more. How many documents per
> core do you have or what performance issues are you facing?
>
> On Fri, 6 Dec 2019 at 01:01, David Hastings 
> wrote:
>
> > are you noticing performance decreases in stand alone solr as of now?
> >
> > On Thu, Dec 5, 2019 at 2:29 PM Vignan Malyala 
> > wrote:
> >
> > > Hi
> > > I currently have 500 collections in my stand alone solr. Bcoz of day by
> > day
> > > increase in Data, I want to convert it into solr cloud.
> > > Can you suggest me how to do it successfully.
> > > How many shards should be there?
> > > How many nodes should be there?
> > > Are so called nodes different machines i should take?
> > > How many zoo keeper nodes should be there?
> > > Are so called zoo keeper nodes different machines i should take?
> > > Total how many machines i have to take to implement scalable solr
> cloud?
> > >
> > > Plz detail these questions. Any of documents on web aren't clear for
> > > production environments.
> > > Thanks in advance.
> > >
> >
>
>
> --
> --
> Regards,
>
> *Paras Lehana* [65871]
> Development Engineer, Auto-Suggest,
> IndiaMART Intermesh Ltd.
>
> 8th Floor, Tower A, Advant-Navis Business Park, Sector 142,
> Noida, UP, IN - 201303
>
> Mob.: +91-9560911996
> Work: 01203916600 | Extn:  *8173*
>
> --
> *
> *
>
>  <https://www.facebook.com/IndiaMART/videos/578196442936091/>
>


Re: From solr to solr cloud

2019-12-05 Thread Shawn Heisey

On 12/5/2019 12:28 PM, Vignan Malyala wrote:

I currently have 500 collections in my standalone Solr. Because of the
day-by-day increase in data, I want to convert it to SolrCloud.
Can you suggest how to do it successfully?
How many shards should there be?
How many nodes should there be?
Are the so-called nodes different machines I should use?
How many ZooKeeper nodes should there be?
Are the so-called ZooKeeper nodes different machines I should use?
In total, how many machines do I have to use to implement a scalable SolrCloud?


500 collections is large enough that running it in SolrCloud is likely 
to encounter scalability issues.  SolrCloud's design does not do well 
with that many collections in the cluster, even if there are a lot of 
machines.


There's a lot of comment history on this issue:

https://issues.apache.org/jira/browse/SOLR-7191

Generally speaking, each machine should only house one Solr node, 
whether you're running cloud or not.  If each one requires a really huge 
heap, it might be worthwhile to split it, but that's the only time I 
would do so.  And I would generally prefer to add more machines than to 
run multiple Solr nodes on one machine.


One thing you might do, if the way your data is divided will permit it, 
is to run multiple SolrCloud clusters.  Multiple clusters can all use 
one ZooKeeper ensemble.


ZooKeeper requires a minimum of three machines for fault tolerance. 
With 3 or 4 machines in the ensemble, you can survive one machine 
failure.  To survive two failures requires at least 5 machines.


Thanks,
Shawn
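
The fault-tolerance numbers Shawn quotes follow from ZooKeeper's majority-quorum rule: an ensemble of n servers tolerates floor((n - 1) / 2) failures. A quick sketch:

```shell
# Majority quorum: an n-node ensemble stays up while a strict majority
# remains, so it tolerates (n - 1) / 2 failures (integer division).
for n in 3 4 5; do
  echo "$n nodes tolerate $(( (n - 1) / 2 )) failure(s)"
done
```

This reproduces the guidance above: 3 or 4 nodes survive one failure; surviving two failures needs at least 5 nodes (which is also why even ensemble sizes buy no extra tolerance).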


Re: From solr to solr cloud

2019-12-05 Thread Paras Lehana
Do you mean 500 cores? Tell us more about the data. How many documents per
core do you have, and what performance issues are you facing?

On Fri, 6 Dec 2019 at 01:01, David Hastings 
wrote:

> are you noticing performance decreases in stand alone solr as of now?
>
> On Thu, Dec 5, 2019 at 2:29 PM Vignan Malyala 
> wrote:
>
> > Hi
> > I currently have 500 collections in my stand alone solr. Bcoz of day by
> day
> > increase in Data, I want to convert it into solr cloud.
> > Can you suggest me how to do it successfully.
> > How many shards should be there?
> > How many nodes should be there?
> > Are so called nodes different machines i should take?
> > How many zoo keeper nodes should be there?
> > Are so called zoo keeper nodes different machines i should take?
> > Total how many machines i have to take to implement scalable solr cloud?
> >
> > Plz detail these questions. Any of documents on web aren't clear for
> > production environments.
> > Thanks in advance.
> >
>


-- 
-- 
Regards,

*Paras Lehana* [65871]
Development Engineer, Auto-Suggest,
IndiaMART Intermesh Ltd.

8th Floor, Tower A, Advant-Navis Business Park, Sector 142,
Noida, UP, IN - 201303

Mob.: +91-9560911996
Work: 01203916600 | Extn:  *8173*



Re: From solr to solr cloud

2019-12-05 Thread David Hastings
Are you noticing performance decreases in standalone Solr as of now?

On Thu, Dec 5, 2019 at 2:29 PM Vignan Malyala  wrote:

> Hi
> I currently have 500 collections in my stand alone solr. Bcoz of day by day
> increase in Data, I want to convert it into solr cloud.
> Can you suggest me how to do it successfully.
> How many shards should be there?
> How many nodes should be there?
> Are so called nodes different machines i should take?
> How many zoo keeper nodes should be there?
> Are so called zoo keeper nodes different machines i should take?
> Total how many machines i have to take to implement scalable solr cloud?
>
> Plz detail these questions. Any of documents on web aren't clear for
> production environments.
> Thanks in advance.
>


From solr to solr cloud

2019-12-05 Thread Vignan Malyala
Hi
I currently have 500 collections in my standalone Solr. Because of the
day-by-day increase in data, I want to convert it to SolrCloud.
Can you suggest how to do it successfully?
How many shards should there be?
How many nodes should there be?
Are the so-called nodes different machines I should use?
How many ZooKeeper nodes should there be?
Are the so-called ZooKeeper nodes different machines I should use?
In total, how many machines do I have to use to implement a scalable SolrCloud?

Please go into detail on these questions. None of the documents on the web are
clear about production environments.
Thanks in advance.


Re: Migrate Solr Master To Cloud 7.5

2019-03-21 Thread Erick Erickson
Yeah, the link you referenced will work. It is _very important_ that you create 
your collection with exactly one shard and then do the copy.

After that you can use SPLITSHARD to sub-divide it. This is a costly operation, 
but probably not as costly as re-indexing.

That said, it might be easier to just create a new collection with shards and 
re-index; it depends, of course, on how painful reindexing is…

Best,
Erick

> On Mar 21, 2019, at 8:55 AM, IZaBEE_Keeper  wrote:
> 
> Hi..
> 
> I have a large Solr 7.5 index over 150M docs and 800GB in a master slave
> setup.. I need to migrate the core to a Solr Cloud instance with pull
> replicas as the index will be exceeding the 2.2B doc limit for a single
> core.. 
> 
> I found this..
> http://lucene.472066.n3.nabble.com/Copy-existing-index-from-standalone-Solr-to-Solr-cloud-td4149920.html
> <http://lucene.472066.n3.nabble.com/Copy-existing-index-from-standalone-Solr-to-Solr-cloud-td4149920.html>
>   
> It's a bit out dated but sounds like it might work..
> 
> Does anyone have any other advice/links for this type of migration? 
> 
> Right now I just need to convert the master to cloud before it gets much
> bigger.. Re-indexing is an option but I would rather convert which is likely
> much faster..
> 
> Thanks.. 
> 
> 
> 
> -
> Bee Keeper at IZaBEE.com
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
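
The copy-then-split path Erick describes could look roughly like this. This is a sketch under assumptions: the collection name and host are placeholders, and the parameters are from the Solr 7 Collections API.

```shell
# 1. After creating the target collection with exactly ONE shard and
#    copying the standalone index into that shard's data directory,
#    sub-divide it:
curl "http://localhost:8983/solr/admin/collections?action=SPLITSHARD&collection=bigindex&shard=shard1"

# 2. Once the split completes, the parent shard1 goes inactive and can
#    be cleaned up:
curl "http://localhost:8983/solr/admin/collections?action=DELETESHARD&collection=bigindex&shard=shard1"
```

As Erick notes, SPLITSHARD is a costly operation; each split roughly doubles the disk needed for the shard while it runs.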



Migrate Solr Master To Cloud 7.5

2019-03-21 Thread IZaBEE_Keeper
Hi..

I have a large Solr 7.5 index, over 150M docs and 800GB, in a master-slave
setup. I need to migrate the core to a SolrCloud instance with PULL replicas,
as the index will be exceeding the ~2.1B document limit for a single core.

I found this:
http://lucene.472066.n3.nabble.com/Copy-existing-index-from-standalone-Solr-to-Solr-cloud-td4149920.html
It's a bit outdated, but it sounds like it might work.

Does anyone have any other advice/links for this type of migration? 

Right now I just need to convert the master to cloud before it gets much
bigger. Re-indexing is an option, but I would rather convert, which is likely
much faster.

Thanks.. 



-
Bee Keeper at IZaBEE.com


Re: Cannot Find My New Solr Configset (Solr Cloud 7.3.0)

2018-04-26 Thread Shawn Heisey

On 4/26/2018 11:02 AM, THADC wrote:

OK, I am creating myConfigset from the _default configset. So, I am copying
ALL the files under _default/conf to myConfigset/conf. I am only modifying
the schema.xml, and then I will re-upload to ZK.

Question:  Do I need to modify any of the other copied files in the conf dir
(e.g, solrconfig.xml, etc)?


Uploading files to ZK when using the methods provided with Solr does not 
erase anything, so you only need to actually upload what you have changed.


Having said that ... in general I would recommend dealing with the 
entire configuration as a unit and uploading all of it, but that is not 
a strict requirement.


I would also put the configurations into some kind of source control 
(git, svn, etc).  That way you have a record of all changes.


Thanks,
Shawn



Re: Cannot Find My New Solr Configset (Solr Cloud 7.3.0)

2018-04-26 Thread THADC
OK, I am creating myConfigset from the _default configset. So, I am copying
ALL the files under _default/conf to myConfigset/conf. I am only modifying
the schema.xml, and then I will re-upload to ZK.

Question:  Do I need to modify any of the other copied files in the conf dir
(e.g, solrconfig.xml, etc)?

thank you





Re: Cannot Find My New Solr Configset (Solr Cloud 7.3.0)

2018-04-26 Thread Shawn Heisey

On 4/26/2018 10:30 AM, THADC wrote:

Shawn, thanks for the reply. The issue is that I need to modify the
schema.xml file to add my customizations. Are you saying I cannot manually
access the config set to modify the schema file? If not, how do I modify it?


Make the change in a copy of the config, then re-upload the changed 
config to ZK.  Then reload the collection or restart Solr instances.


Thanks,
Shawn
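
The edit-upload-reload cycle Shawn describes can be sketched as below. Assumptions: the configset name, local path, collection name, and ZK address are placeholders.

```shell
# 1. Edit the local copy of the config (e.g. the schema):
vi /path/to/myConfigset/conf/schema.xml

# 2. Re-upload the changed configset to ZooKeeper:
bin/solr zk upconfig -z localhost:2181 -n mySolrCloudConfigSet -d /path/to/myConfigset/conf

# 3. Reload each collection that uses the configset so the change takes effect:
curl "http://localhost:8983/solr/admin/collections?action=RELOAD&name=myCollection"
```

Keeping `/path/to/myConfigset` in version control, as Shawn suggests, makes step 1 auditable.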



Re: Cannot Find My New Solr Configset (Solr Cloud 7.3.0)

2018-04-26 Thread THADC
Shawn, thanks for the reply. The issue is that I need to modify the
schema.xml file to add my customizations. Are you saying I cannot manually
access the config set to modify the schema file? If not, how do I modify it?

Thanks, Tim





Re: Cannot Find My New Solr Configset (Solr Cloud 7.3.0)

2018-04-26 Thread Shawn Heisey

On 4/26/2018 10:16 AM, THADC wrote:

I am pretty certain I created it because when I execute the request to list
all configsets:

http://localhost:8983/solr/admin/configs?action=LIST

, I see my configset in the response:

{
"responseHeader":{
"status":0,
"QTime":1},
"configSets":["_default",
"mySolrCloudConfigSet"]
}

However, I need to be able to find the configset on the filesystem so I can
update the schema.xml file with my system's customizations.

I would be grateful for any ideas. Thanks


In SolrCloud mode, active configs are in zookeeper, not on the disk.  
This is not new behavior.  It has always worked this way.


When you use the CREATE action on the configset API, you are creating a 
new configset in zookeeper.  You will not find it on the disk.


https://lucene.apache.org/solr/guide/7_0/configsets-api.html#configsets-create

Thanks,
Shawn
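
For reference, the two API calls from this thread with their full parameter names (host and names as posted; baseConfigSet is the parameter that selects the configset to copy):

```shell
# Create a new configset in ZooKeeper by copying _default (Solr 7.x Configsets API).
curl "http://localhost:8983/solr/admin/configs?action=CREATE&name=mySolrCloudConfigSet&baseConfigSet=_default"

# List the configsets now stored in ZooKeeper.
curl "http://localhost:8983/solr/admin/configs?action=LIST"
```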



Cannot Find My New Solr Configset (Solr Cloud 7.3.0)

2018-04-26 Thread THADC
Hello,
I am migrating from solr 4.7 to solr cloud 7.3.0. I am trying to create a
new custom configset set based on a default (_default) that came with the
installation. I followed the instructions and used the following call:

http://localhost:8983/solr/admin/configs?action=CREATE&name=mySolrCloudConfigSet&baseConfigSet=_default

,getting the following response:

{
"responseHeader":{
"status":0,
"QTime":202}
}

, which I believe indicates success. However, I cannot find the new
configset where I would have expected to find it (i.e.,
~solr-7.3.0\server\solr\configsets). This is where the _default configset is
located. More specifically, I cannot find my created configset anywhere
under solr-7.3.0.

I am pretty certain I created it because when I execute the request to list
all configsets:

http://localhost:8983/solr/admin/configs?action=LIST

, I see my configset in the response:

{
"responseHeader":{
"status":0,
"QTime":1},
"configSets":["_default",
"mySolrCloudConfigSet"]
}

However, I need to be able to find the configset on the filesystem so I can
update the schema.xml file with my system's customizations.

I would be grateful for any ideas. Thanks





Re: How to start solr in solr cloud mode using external zookeeper ?

2015-03-07 Thread Aman Tandon
Yes, it might, but I don't want to depend on a servlet container, because
after a few releases Solr will not support any servlet container. So I am
sticking with the command line utilities.

With Regards
Aman Tandon

On Fri, Mar 6, 2015 at 3:58 PM, Rajesh Hazari rajeshhaz...@gmail.com
wrote:

 zkhost=hostnames,
 port=some port
 variables in your solr.xml should work?
 I have tested this with tomcat not with jetty, this stays with your config.

 Rajesh.
 On Mar 5, 2015 9:20 PM, Aman Tandon amantandon...@gmail.com wrote:

  Thanks shamik :)
 
  With Regards
  Aman Tandon
 
  On Fri, Mar 6, 2015 at 3:30 AM, shamik sham...@gmail.com wrote:
 
   The other way you can do that is to specify the startup parameters in
   solr.in.sh.
  
   Example :
  
   SOLR_MODE=solrcloud
  
   ZK_HOST=zoohost1:2181,zoohost2:2181,zoohost3:2181
  
   SOLR_PORT=4567
  
   You can simply start solr by running ./solr start
  
  
  
  
 



Re: How to start solr in solr cloud mode using external zookeeper ?

2015-03-06 Thread Rajesh Hazari
Setting the variables
zkhost=hostnames,
port=some port
in your solr.xml should work?
I have tested this with Tomcat, not with Jetty; this stays with your config.

Rajesh.
On Mar 5, 2015 9:20 PM, Aman Tandon amantandon...@gmail.com wrote:

 Thanks shamik :)

 With Regards
 Aman Tandon

 On Fri, Mar 6, 2015 at 3:30 AM, shamik sham...@gmail.com wrote:

  The other way you can do that is to specify the startup parameters in
  solr.in.sh.
 
  Example :
 
  SOLR_MODE=solrcloud
 
  ZK_HOST=zoohost1:2181,zoohost2:2181,zoohost3:2181
 
  SOLR_PORT=4567
 
  You can simply start solr by running ./solr start
 
 
 
 



Re: How to start solr in solr cloud mode using external zookeeper ?

2015-03-05 Thread shamik
The other way you can do that is to specify the startup parameters in
solr.in.sh. 

Example :

SOLR_MODE=solrcloud

ZK_HOST=zoohost1:2181,zoohost2:2181,zoohost3:2181

SOLR_PORT=4567

You can simply start solr by running ./solr start





Re: How to start solr in solr cloud mode using external zookeeper ?

2015-03-05 Thread Aman Tandon
Thanks shamik :)

With Regards
Aman Tandon

On Fri, Mar 6, 2015 at 3:30 AM, shamik sham...@gmail.com wrote:

 The other way you can do that is to specify the startup parameters in
 solr.in.sh.

 Example :

 SOLR_MODE=solrcloud

 ZK_HOST=zoohost1:2181,zoohost2:2181,zoohost3:2181

 SOLR_PORT=4567

 You can simply start solr by running ./solr start






Re: How to start solr in solr cloud mode using external zookeeper ?

2015-03-05 Thread Aman Tandon
Thanks Erick.

For anyone else who gets stuck in the same situation, here is the
solution.

If you can run a remote or local zookeeper ensemble, then you can
create the Solr cluster by the following method.

Suppose you have a zookeeper ensemble of 3 servers running on
three different machines with the IP addresses 192.168.11.12,
192.168.101.12, and 192.168.101.92, each using 2181 as the zookeeper
client port (as mentioned in zoo.cfg). In my case I am using the
solr-5.0.0 version.

Now go to the bin directory of your extracted solr tar/zip file and run
this command for each solr server of your SolrCloud cluster.

./solr start -c -z 192.168.11.12:2181,192.168.101.12:2181,
192.168.101.92:2181 -p 4567

-p - to specify a port other than the default 8983; in my case it
is 4567
-c - to start the server in cloud mode
-z - to specify the zookeeper host addresses
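
The flag assembly above can be sketched as a small script; the IPs and the port are the examples from this thread, not requirements:

```shell
# Build the ZooKeeper connect string passed to -z from the ensemble members.
H1=192.168.11.12
H2=192.168.101.12
H3=192.168.101.92
ZK_HOST="$H1:2181,$H2:2181,$H3:2181"
echo "$ZK_HOST"

# Each Solr node in the cluster then starts with:
#   ./solr start -c -z "$ZK_HOST" -p 4567
```

The same connect string is used on every node; only -p changes if several nodes share one machine.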

With Regards
Aman Tandon

On Wed, Mar 4, 2015 at 5:18 AM, Erick Erickson erickerick...@gmail.com
wrote:

 Have you seen this page?:

 https://cwiki.apache.org/confluence/display/solr/Solr+Start+Script+Reference

 This is really the new way

 Best,
 Erick

 On Tue, Mar 3, 2015 at 7:18 AM, Aman Tandon amantandon...@gmail.com
 wrote:
  Thanks Shawn, also thanks for sharing info about chroot.
 
  I am trying to implement the solr cloud with solr-5.0.0.
 
  I also checked the documentations https://wiki.apache.org/solr/SolrCloud
 ,
  the method shown there is using start.jar. But after few update start.jar
  (jetty) will not work. So I want to go through the way which will work as
  it is even after upgrade.
 
  So how could i start it from bin directory with all these parameters of
  external zookeeper or any other best way which you can suggest.
 
  With Regards
  Aman Tandon
 
  On Tue, Mar 3, 2015 at 8:09 PM, Shawn Heisey apa...@elyograg.org
 wrote:
 
  On 3/3/2015 4:21 AM, Aman Tandon wrote:
   I am new to solr-cloud, i have connected the zookeepers located on 3
  remote
   servers. All the configs are uploaded and linked successfully.
  
   Now i am stuck to how to start solr in cloud mode using these external
   zookeeper which are remotely located.
  
   Zookeeper is installed at 3 servers and using the 2181 as client
 port. ON
   all three server, solr server along with external zookeeper is
 present.
  
   solrcloud1.com (solr + zookeper is present)
   solrcloud2.com
   solrcloud3.com
  
   Now i have to start the solr by telling the solr to use the external
   zookeeper. So how should I do that.
 
  You simply tell Solr about all your zookeeper servers on startup, using
  the zkHost property.  Here's the format of that property:
 
  server1:port,server2:port,server3:port/solr1
 
  The /solr1 part (the ZK chroot) is optional, but I recommend it ... it
  can be just about any text you like, starting with a forward slash.
  What this does is put all of SolrCloud's information inside a path in
  zookeeper, sort of like a filesystem.  With no chroot, that information
  is placed at the root of zookeeper.  If you want to use a zookeeper
  ensemble for multiple applications, you're going to need a chroot.  Even
  when multiple applications are not required, I recommend it to keep the
  zookeeper root clean.
 
  You can see some examples of zkHost values in the javadoc for SolrJ:
 
 
 
 http://lucene.apache.org/solr/5_0_0/solr-solrj/org/apache/solr/client/solrj/impl/CloudSolrClient.html#CloudSolrClient%28java.lang.String%29
 
  Thanks,
  Shawn
 
 



Re: How to start solr in solr cloud mode using external zookeeper ?

2015-03-03 Thread Tomoko Uchida
Hi,

Did you check SolrCloud section in Ref Guides? You can download PDFs from
here.
http://archive.apache.org/dist/lucene/solr/ref-guide/

Or this. (It's already marked out of date, but still provides basic,
helpful information.)
https://wiki.apache.org/solr/SolrCloud

Regards,
Tomoko



2015-03-03 20:21 GMT+09:00 Aman Tandon amantandon...@gmail.com:

 Hi,

 I am new to solr-cloud, i have connected the zookeepers located on 3 remote
 servers. All the configs are uploaded and linked successfully.

 Now i am stuck to how to start solr in cloud mode using these external
 zookeeper which are remotely located.

 Zookeeper is installed at 3 servers and using the 2181 as client port. ON
 all three server, solr server along with external zookeeper is present.

 solrcloud1.com (solr + zookeper is present)
 solrcloud2.com
 solrcloud3.com

 Now i have to start the solr by telling the solr to use the external
 zookeeper. So how should I do that.

 Thanks in advance.

 With Regards
 Aman Tandon



Re: How to start solr in solr cloud mode using external zookeeper ?

2015-03-03 Thread Aman Tandon
Thanks Shawn, also thanks for sharing the info about chroot.

I am trying to implement the solr cloud with solr-5.0.0.

I also checked the documentation https://wiki.apache.org/solr/SolrCloud;
the method shown there uses start.jar. But after a few more releases
start.jar (jetty) will no longer work, so I want to use a way that will
keep working as it is even after an upgrade.

So how could I start it from the bin directory with all these parameters
for the external zookeeper, or is there any other best way you can suggest?

With Regards
Aman Tandon

On Tue, Mar 3, 2015 at 8:09 PM, Shawn Heisey apa...@elyograg.org wrote:

 On 3/3/2015 4:21 AM, Aman Tandon wrote:
  I am new to solr-cloud, i have connected the zookeepers located on 3
 remote
  servers. All the configs are uploaded and linked successfully.
 
  Now i am stuck to how to start solr in cloud mode using these external
  zookeeper which are remotely located.
 
  Zookeeper is installed at 3 servers and using the 2181 as client port. ON
  all three server, solr server along with external zookeeper is present.
 
  solrcloud1.com (solr + zookeper is present)
  solrcloud2.com
  solrcloud3.com
 
  Now i have to start the solr by telling the solr to use the external
  zookeeper. So how should I do that.

 You simply tell Solr about all your zookeeper servers on startup, using
 the zkHost property.  Here's the format of that property:

 server1:port,server2:port,server3:port/solr1

 The /solr1 part (the ZK chroot) is optional, but I recommend it ... it
 can be just about any text you like, starting with a forward slash.
 What this does is put all of SolrCloud's information inside a path in
 zookeeper, sort of like a filesystem.  With no chroot, that information
 is placed at the root of zookeeper.  If you want to use a zookeeper
 ensemble for multiple applications, you're going to need a chroot.  Even
 when multiple applications are not required, I recommend it to keep the
 zookeeper root clean.

 You can see some examples of zkHost values in the javadoc for SolrJ:


 http://lucene.apache.org/solr/5_0_0/solr-solrj/org/apache/solr/client/solrj/impl/CloudSolrClient.html#CloudSolrClient%28java.lang.String%29

 Thanks,
 Shawn




Re: How to start solr in solr cloud mode using external zookeeper ?

2015-03-03 Thread Erick Erickson
Have you seen this page?:
https://cwiki.apache.org/confluence/display/solr/Solr+Start+Script+Reference

This is really the new way

Best,
Erick

On Tue, Mar 3, 2015 at 7:18 AM, Aman Tandon amantandon...@gmail.com wrote:
 Thanks Shawn, also thanks for sharing info about chroot.

 I am trying to implement the solr cloud with solr-5.0.0.

 I also checked the documentations https://wiki.apache.org/solr/SolrCloud,
 the method shown there is using start.jar. But after few update start.jar
 (jetty) will not work. So I want to go through the way which will work as
 it is even after upgrade.

 So how could i start it from bin directory with all these parameters of
 external zookeeper or any other best way which you can suggest.

 With Regards
 Aman Tandon

 On Tue, Mar 3, 2015 at 8:09 PM, Shawn Heisey apa...@elyograg.org wrote:

 On 3/3/2015 4:21 AM, Aman Tandon wrote:
  I am new to solr-cloud, i have connected the zookeepers located on 3
 remote
  servers. All the configs are uploaded and linked successfully.
 
  Now i am stuck to how to start solr in cloud mode using these external
  zookeeper which are remotely located.
 
  Zookeeper is installed at 3 servers and using the 2181 as client port. ON
  all three server, solr server along with external zookeeper is present.
 
  solrcloud1.com (solr + zookeper is present)
  solrcloud2.com
  solrcloud3.com
 
  Now i have to start the solr by telling the solr to use the external
  zookeeper. So how should I do that.

 You simply tell Solr about all your zookeeper servers on startup, using
 the zkHost property.  Here's the format of that property:

 server1:port,server2:port,server3:port/solr1

 The /solr1 part (the ZK chroot) is optional, but I recommend it ... it
 can be just about any text you like, starting with a forward slash.
 What this does is put all of SolrCloud's information inside a path in
 zookeeper, sort of like a filesystem.  With no chroot, that information
 is placed at the root of zookeeper.  If you want to use a zookeeper
 ensemble for multiple applications, you're going to need a chroot.  Even
 when multiple applications are not required, I recommend it to keep the
 zookeeper root clean.

 You can see some examples of zkHost values in the javadoc for SolrJ:


 http://lucene.apache.org/solr/5_0_0/solr-solrj/org/apache/solr/client/solrj/impl/CloudSolrClient.html#CloudSolrClient%28java.lang.String%29

 Thanks,
 Shawn




How to start solr in solr cloud mode using external zookeeper ?

2015-03-03 Thread Aman Tandon
Hi,

I am new to solr-cloud; I have connected the zookeepers located on 3 remote
servers. All the configs are uploaded and linked successfully.

Now I am stuck on how to start solr in cloud mode using these external
zookeepers, which are remotely located.

Zookeeper is installed on 3 servers, using 2181 as the client port. On
all three servers, a solr server is present along with the external zookeeper.

solrcloud1.com (solr + zookeeper is present)
solrcloud2.com
solrcloud3.com

Now I have to start solr by telling it to use the external
zookeeper. So how should I do that?

Thanks in advance.

With Regards
Aman Tandon


Re: How to start solr in solr cloud mode using external zookeeper ?

2015-03-03 Thread Shawn Heisey
On 3/3/2015 4:21 AM, Aman Tandon wrote:
 I am new to solr-cloud, i have connected the zookeepers located on 3 remote
 servers. All the configs are uploaded and linked successfully.
 
 Now i am stuck to how to start solr in cloud mode using these external
 zookeeper which are remotely located.
 
 Zookeeper is installed at 3 servers and using the 2181 as client port. ON
 all three server, solr server along with external zookeeper is present.
 
 solrcloud1.com (solr + zookeper is present)
 solrcloud2.com
 solrcloud3.com
 
 Now i have to start the solr by telling the solr to use the external
 zookeeper. So how should I do that.

You simply tell Solr about all your zookeeper servers on startup, using
the zkHost property.  Here's the format of that property:

server1:port,server2:port,server3:port/solr1

The /solr1 part (the ZK chroot) is optional, but I recommend it ... it
can be just about any text you like, starting with a forward slash.
What this does is put all of SolrCloud's information inside a path in
zookeeper, sort of like a filesystem.  With no chroot, that information
is placed at the root of zookeeper.  If you want to use a zookeeper
ensemble for multiple applications, you're going to need a chroot.  Even
when multiple applications are not required, I recommend it to keep the
zookeeper root clean.

You can see some examples of zkHost values in the javadoc for SolrJ:

http://lucene.apache.org/solr/5_0_0/solr-solrj/org/apache/solr/client/solrj/impl/CloudSolrClient.html#CloudSolrClient%28java.lang.String%29

Thanks,
Shawn
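
If you do use a chroot, newer Solr releases can create the ZooKeeper path before the first startup; the hostnames and the /solr1 path below are illustrative, and on older versions zkcli.sh with its makepath command does the same job:

```shell
# Create the chroot node once in the ensemble (bin/solr zk mkroot, Solr 6.6+).
bin/solr zk mkroot /solr1 -z server1:2181,server2:2181,server3:2181

# Then point every node at the chrooted connect string.
bin/solr start -c -z "server1:2181,server2:2181,server3:2181/solr1"
```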



Re: Copy existing index from standalone Solr to Solr cloud

2014-07-30 Thread avgxm
Used admin/collections?action=SPLITSHARD to create shard1_0 and shard1_1,
and then followed this thread
http://lucene.472066.n3.nabble.com/How-can-you-move-a-shard-from-one-SolrCloud-node-to-another-td4106815.html
to move the shards to the right nodes.  Problem solved.





Copy existing index from standalone Solr to Solr cloud

2014-07-29 Thread avgxm
Is there a correct way to take an existing Solr index (the core.properties,
conf/, and data/ directories from a standalone Solr instance) and copy it over
to a Solr cloud, with shards, without having to use import or re-indexing?
Does anyone know the proper steps to accomplish this type of a move?  The
target system is zookeeper 3.4.6, tomcat 7.0.54, and solr 4.8.1.  I have
been able to copy the data and load the core by executing upconfig and
linkconfig to zookeeper, and then copying over the core.properties,
conf/, and data/ directories and bouncing tomcat.  The core comes up and is
searchable.  The cloud picture looks like corename > shard1 >
ip_addr:8080.  Then, I have tried to use split core, split shard, and create
core, without success, to try to add shard2 and shard3, either on the same
host or on different hosts.  Not sure what I'm missing or if this way of
reusing the existing data is even an option.





Re: Copy existing index from standalone Solr to Solr cloud

2014-07-29 Thread Shawn Heisey
On 7/29/2014 2:23 PM, avgxm wrote:
 Is there a correct way to take an existing Solr index (core.properties,
 conf/, data/ directories from a standalone Solr instance) and copy it over
 to a Solr cloud, with shards, without having to use import or re-indexing? 
 Does anyone know the proper steps to accomplish this type of a move?  The
 target system is zookeeper 3.4.6, tomcat 7.0.54, and solr 4.8.1.  I have
 been able to copy the data and load the core by executing upconfig,
 linkconfig to zookeeper, and then copying over the core.properties, and
 conf/ and data/ directories, bouncing tomcat.  The core comes up and is
 searchable.  The cloud pic looks like corename  shard1 
 ip_addr:8080.  Then, I have tried to use split core, split shard, create
 core, without success to try and add shard2 and shard3, either on the same
 or different hosts.  Not sure what I'm missing or if this way of reusing the
 existing data is even an option.

You'll need to create a collection with the Collections API (which also
creates the cores) before you try copying anything, and then you'll want
to copy *only* the data directory -- the config is in zookeeper and
the core.properties file should already exist.

When you create the collection, you'll likely want numShards on the
CREATE call to be 1, and replicationFactor should be whatever you want
-- if it's 2, you'll end up with two copies of your index on different
servers.  If the collection is named test then the core on the first
server will be named test_shard1_replica1, the core on the second server
will be named test_shard1_replica2, and so on.  Your zookeeper ensemble
should be separate from Solr, don't use the -DzkRun option.
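
A sketch of that create call; the collection name comes from Shawn's example, the port matches the Tomcat setup in this thread, and collection.configName must match the name used with upconfig/linkconfig:

```shell
# One shard, two replicas spread across the cluster (Solr 4.8 Collections API).
curl "http://localhost:8080/solr/admin/collections?action=CREATE&name=test&numShards=1&replicationFactor=2&collection.configName=myconf"
```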

To put the existing data into the collection:  1) Shut down all the Solr
servers. 2) Delete the data and tlog directories from
<collection>_shard1_replicaN on all the servers.  3) Copy the data directory
from the source to the first server, then start Solr on that server.  4)
Wait a few minutes for everything to stabilize.  5) Start Solr on any
other servers.
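
The same steps as commands, assuming a collection named test and an install where cores live under a solr/ home directory; the paths and the way you stop and start Solr will differ per install, so treat this as a sketch rather than a recipe:

```shell
# 1) Stop Solr on every server first.

# 2) On every server, clear the replica's index and transaction log.
rm -rf solr/test_shard1_replica1/data solr/test_shard1_replica1/tlog

# 3) On the first server only, copy in the standalone index.
cp -r /path/to/standalone/core/data solr/test_shard1_replica1/

# 4) Start Solr on the first server, wait a few minutes for it to stabilize,
#    then start the remaining servers; their replicas resync from the leader.
```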

Thanks,
Shawn



Re: Copy existing index from standalone Solr to Solr cloud

2014-07-29 Thread avgxm
Thanks a lot, Shawn.  I have gotten as far as having the core come up per
your instructions.  Since numShards was set to 1, what is the next step to
add more shards?  Is it /admin/collections?action=CREATESHARD... or
something else?  Ultimately, I'd like to have shard1, shard2, shard3, with
"router":{"name":"compositeId"}, where each core is "leader":true.





Re: Copy existing index from standalone Solr to Solr cloud

2014-07-29 Thread Anshum Gupta
Use the Split shard API to split an existing shard. It splits an
existing shard into 2.
https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api3

You can then split the sub-shards. One thing to note though is that
the Admin UI still doesn't comprehend the difference between active
and inactive shards. You can look at the cluster state to get a better
picture of what shards are active.

Another thing to note would be to use the async mode while splitting a
rather big shard. That would help you overcome the internal timeout
issues of a long running task.
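
A hedged example of both suggestions, splitting shard1 of a collection named test asynchronously and then polling the request; the names and the request id are placeholders:

```shell
# Kick off the split without holding the HTTP connection open.
curl "http://localhost:8983/solr/admin/collections?action=SPLITSHARD&collection=test&shard=shard1&async=split-1"

# Poll until the status reported for the request id is "completed".
curl "http://localhost:8983/solr/admin/collections?action=REQUESTSTATUS&requestid=split-1"
```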

On Tue, Jul 29, 2014 at 2:10 PM, avgxm gilel...@hotmail.com wrote:
 Thanks a lot, Shawn.  I have gotten as far as having the core come up per
 your instructions.  Since numShards was set to 1, what is the next step to
 add more shards?  Is it /admin/collections?action=CREATESHARD... or
 something else?  Ultimately, I'd like to have shard1, shard2, shard3, with
 router:{name:compositeId}, where each core is leader:true.






-- 

Anshum Gupta
http://www.anshumgupta.net


Re: Solr 4.5 - Solr Cloud is creating new cores on random nodes

2013-12-19 Thread Mark Miller
Sounds pretty weird. I would use 4.5.1. Don’t know that it will address this, 
but it’s a very good idea.

This doesn’t sound like a feature to me. I’d file a JIRA issue if it seems like 
a real problem.

Are you using the old style solr.xml with cores defined in it or the new core 
discovery mode (cores are not defined in solr.xml)?

- Mark

On Dec 18, 2013, at 6:30 PM, Ryan Wilson rpwils...@gmail.com wrote:

 Hello all,
 
 I am currently in the process of building out a solr cloud with solr 4.5 on
 4 nodes with some pretty hefty hardware. When we create the collection we
 have a replication factor of 2 and store 2 replicas per node.
 
 While we have been experimenting, which has involved bringing nodes up and
 down as well as tanking them with OOM errors while messing with jvm
 settings, we have observed a disturbing trend where we will bring nodes
 back up and suddenly shard x has 6 replicas spread across the nodes. These
 replicas will have been created with no action on our part and we would
 much rather they not be created at all.
 
 I have not been able to determine whether this is a bug or a feature. If
 its a bug, I will happily provide what I can to track it down. If it is a
 feature, I would very much like to turn it off.
 
 Any Information is appreciated.
 
 Regards,
 Ryan Wilson
 rpwils...@gmail.com



Solr 4.5 - Solr Cloud is creating new cores on random nodes

2013-12-18 Thread Ryan Wilson
Hello all,

I am currently in the process of building out a solr cloud with solr 4.5 on
4 nodes with some pretty hefty hardware. When we create the collection we
have a replication factor of 2 and store 2 replicas per node.

While we have been experimenting, which has involved bringing nodes up and
down as well as tanking them with OOM errors while messing with jvm
settings, we have observed a disturbing trend where we will bring nodes
back up and suddenly shard x has 6 replicas spread across the nodes. These
replicas will have been created with no action on our part and we would
much rather they not be created at all.

I have not been able to determine whether this is a bug or a feature. If
its a bug, I will happily provide what I can to track it down. If it is a
feature, I would very much like to turn it off.

Any Information is appreciated.

Regards,
Ryan Wilson
rpwils...@gmail.com


Re: Solr 4.1 Solr Cloud Shard Structure

2013-03-01 Thread Otis Gospodnetic
Hi Chris,

I started a discussion on this topic on the ElasticSearch mailing list the
other day.  As soon as SolrCloud get index alias functionality (JIRA for it
exists) I believe the same approach to cluster expansion will be applicable
to SolrCloud as what can be done with ES today:

http://search-lucene.com/m/RZYhi2ydnXD1subj=Alternatives+to+oversharding+to+handle+index+cluster+growth+

Otis
--
Solr  ElasticSearch Support
http://sematext.com/





On Thu, Feb 28, 2013 at 7:10 PM, Chris Simpson chrissimpson1...@outlook.com
 wrote:

 Dear Lucene / Solr Community-

 I recently posted this question on Stackoverflow, but it doesn't seem to be
 going too far. Then I found this mailing list and was hoping perhaps to
 have more luck:

 Question-

 If I plan on holding 7TB of data in a Solr Cloud, is it bad practice to
 begin with 1 server holding 100 shards and then begin populating the
 collection where once the size grew, each shard ultimately will be peeled
 off into its own dedicated server (holding ~70GB ea with its own dedicated
 resources and replicas)?

 That is, I would start the collection with 100 shards locally, then as
 data grew, I could peel off one shard at a time and give it its own server
 -- dedicated w/plenty of resources.

 Is this okay to do -- or would I somehow incur a massive bottleneck
 internally by putting that many shards in 1 server to start with while data
 was low?

 Thank you.
 Chris




Re: Solr 4.1 Solr Cloud Shard Structure

2013-02-28 Thread Mark Miller
You will pay some in performance, but it's certainly not bad practice. It's a 
good choice for setting up so that you can scale later. You just have to do 
some testing to make sure it fits your requirments. The Collections API even 
has built in support for this - you can specify more shards than nodes and it 
will overload a node. See the documentation. Later you can start up a new 
replica on another machine and kill/remove the original.
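
A sketch of the oversharded create call Mark describes; the numbers come from this thread, and maxShardsPerNode is the Collections API parameter that lifts the default one-shard-per-node limit:

```shell
# 100 shards packed onto however many nodes are live at creation time.
curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=bigindex&numShards=100&replicationFactor=1&maxShardsPerNode=100"
```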

- Mark

On Feb 28, 2013, at 7:10 PM, Chris Simpson chrissimpson1...@outlook.com wrote:

 Dear Lucene / Solr Community-
 
 I recently posted this question on Stackoverflow, but it doesn't seem to be 
 going too far. Then I found this mailing list and was hoping perhaps to have 
 more luck:
 
 Question-
 
 If I plan on holding 7TB of data in a Solr Cloud, is it bad practice to begin 
 with 1 server holding 100 shards and then begin populating the collection 
 where once the size grew, each shard ultimately will be peeled off into its 
 own dedicated server (holding ~70GB ea with its own dedicated resources and 
 replicas)?
 
 That is, I would start the collection with 100 shards locally, then as data 
 grew, I could peel off one shard at a time and give it its own server -- 
 dedicated w/plenty of resources.
 
 Is this okay to do -- or would I somehow incur a massive bottleneck 
 internally by putting that many shards in 1 server to start with while data 
 was low?
 
 Thank you.
 Chris
 
 



Re: Solr 4.1 Solr Cloud Shard Structure

2013-02-28 Thread Walter Underwood
100 shards on a node will almost certainly be slow, but at least it would be 
scalable. 7TB of data on one node is going to be slow regardless of how you 
shard it.

I might choose a number with more useful divisors than 100, perhaps 96 or 144.

wunder

On Feb 28, 2013, at 4:25 PM, Mark Miller wrote:

 You will pay some in performance, but it's certainly not bad practice. It's a 
 good choice for setting up so that you can scale later. You just have to do 
 some testing to make sure it fits your requirments. The Collections API even 
 has built in support for this - you can specify more shards than nodes and it 
 will overload a node. See the documentation. Later you can start up a new 
 replica on another machine and kill/remove the original.
 
 - Mark
 
 On Feb 28, 2013, at 7:10 PM, Chris Simpson chrissimpson1...@outlook.com 
 wrote:
 
 Dear Lucene / Solr Community-
 
  I recently posted this question on Stackoverflow, but it doesn't seem to be 
 going too far. Then I found this mailing list and was hoping perhaps to have 
 more luck:
 
 Question-
 
 If I plan on holding 7TB of data in a Solr Cloud, is it bad practice to 
 begin with 1 server holding 100 shards and then begin populating the 
 collection where once the size grew, each shard ultimately will be peeled 
 off into its own dedicated server (holding ~70GB ea with its own dedicated 
 resources and replicas)?
 
 That is, I would start the collection with 100 shards locally, then as data 
 grew, I could peel off one shard at a time and give it its own server -- 
 dedicated w/plenty of resources.
 
 Is this okay to do -- or would I somehow incur a massive bottleneck 
 internally by putting that many shards in 1 server to start with while data 
 was low?
 
 Thank you.
 Chris
 






Re: Solr 4.1 Solr Cloud Shard Structure

2013-02-28 Thread Mark Miller

On Feb 28, 2013, at 7:55 PM, Walter Underwood wun...@wunderwood.org wrote:

 100 shards on a node will almost certainly be slow

I think it depends on some things - with one of the largest of those things 
being your hardware. Many have found that you can get much better performance 
out of super concurrent, beefy hardware using more cores on a single node. So 
there will be some give and take that it is tough to jump to conclusions about. 
Slower at 100 shards? I would assume yes. How slow? It depends.

One thing that will happen is that you will require a lot more threads…

You would want some pretty beefy hardware.

But you don't have to do 100 either. That should just be a rough starting 
number. At some point you have to reindex into a new cluster if you keep 
growing. Or consider shard splitting if its feasible (and becomes available). 
You can only over shard so much.

So perhaps you do 50 or whatever. It will be faster than you think, I imagine. 
My main concern is the number of threads - you might want to tune -Xss to 
minimize their RAM usage at least.

- Mark

Michigan Information Retrieval Enthusiasts Group Quarterly Meetup - August 17th 2011 - Solr in the Cloud, Erick Erickson

2011-08-09 Thread Provalov, Ivan
Next IR Meetup will be held at Farmington Hills Community Library on August 17, 
2011.  Please RSVP here: 
http://www.meetup.com/Michigan-Information-Retrieval-Enthusiasts-Group

Thank you,

Ivan Provalov


Re: Solr and Tag Cloud

2011-06-19 Thread Alexey Serba
Suppose you have a multivalued field _tag_ attached to every document in
your corpus. Then you can build a tag cloud relevant to the whole data set, or
to a specific query, by retrieving facets for field _tag_ for *:* or any
other query. You'll get a list of popular _tag_ values relevant to
that query, with occurrence counts.

If you want to build a tag cloud over general analyzed text fields, you
can still do that the same way, but note that you can hit
performance/memory problems if you have a significant data set and
huge text fields. You should probably use stop words to filter out popular
general terms.
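A minimal sketch of that approach (the host, core, and `tag` field name are assumptions for illustration): request facet counts for the tag field with `rows=0`, then scale the counts into font sizes for rendering the cloud.

```python
from urllib.parse import urlencode

# Facet request for the top tags matching a query. The field name "tag"
# and the URL are illustrative assumptions.
params = urlencode({
    "q": "*:*",            # or any user query
    "rows": 0,             # only facet counts are needed, not documents
    "facet": "true",
    "facet.field": "tag",
    "facet.limit": 50,
    "facet.mincount": 1,
})
request_url = "http://localhost:8983/solr/select?" + params

def tag_cloud_sizes(counts, min_px=10, max_px=40):
    """Scale raw facet counts into font sizes for the tag cloud."""
    lo, hi = min(counts.values()), max(counts.values())
    span = (hi - lo) or 1  # avoid division by zero when all counts match
    return {tag: min_px + (n - lo) * (max_px - min_px) // span
            for tag, n in counts.items()}

sizes = tag_cloud_sizes({"solr": 100, "lucene": 55, "facet": 10})
```

The most frequent tag gets the largest font and the rarest the smallest; everything else is interpolated linearly between the two.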

On Sat, Jun 18, 2011 at 8:12 AM, Jamie Johnson jej2...@gmail.com wrote:
 Does anyone have details of how to generate a tag cloud of popular terms
 across an entire data set and then also across a query?



Re: Solr and Tag Cloud

2011-06-18 Thread Mohammad Shariq
I am also looking for the same. Is there any way to build the tag cloud of
all the documents matching a specific query?


On 18 June 2011 09:42, Jamie Johnson jej2...@gmail.com wrote:

 Does anyone have details of how to generate a tag cloud of popular terms
 across an entire data set and then also across a query?




-- 
Thanks and Regards
Mohammad Shariq


Re: Solr and Tag Cloud

2011-06-18 Thread Dmitry Kan
One option would be to load each term into a shingles field and then facet on
them for the user query.
Another is to use http://wiki.apache.org/solr/TermsComponent.

With the first one you can load not only separate terms, but also their
sequences and then experiment with the optimal shingle sequence (ngram)
length.
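For the TermsComponent route, the request is just a few parameters (the handler path and field name below are assumptions; the component has to be wired up in solrconfig.xml):

```python
from urllib.parse import urlencode

# TermsComponent request for the most frequent terms in a field.
# "text" is an assumed field name, /terms an assumed handler path.
params = urlencode({
    "terms": "true",
    "terms.fl": "text",
    "terms.limit": 50,       # top 50 terms for the cloud
    "terms.sort": "count",   # most frequent first
})
terms_url = "http://localhost:8983/solr/terms?" + params
```

Note that TermsComponent reads raw index term statistics across the whole index, so unlike faceting it cannot be restricted to the documents matching a user query.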

On Sat, Jun 18, 2011 at 7:12 AM, Jamie Johnson jej2...@gmail.com wrote:

 Does anyone have details of how to generate a tag cloud of popular terms
 across an entire data set and then also across a query?




-- 
Regards,

Dmitry Kan


Solr and Tag Cloud

2011-06-17 Thread Jamie Johnson
Does anyone have details of how to generate a tag cloud of popular terms
across an entire data set and then also across a query?


Re: solr on the cloud

2011-03-26 Thread Dmitry Kan
Thanks, Jason, this looks very relevant!

On Fri, Mar 25, 2011 at 11:26 PM, Jason Rutherglen 
jason.rutherg...@gmail.com wrote:

 Dmitry,

 If you're planning on using HBase you can take a look at
 https://issues.apache.org/jira/browse/HBASE-3529  I think we may even
 have a reasonable solution for reading the index [randomly] out of
 HDFS.  Benchmarking'll be implemented next.  It's not production
 ready, suggestions are welcome.

 Jason

 On Fri, Mar 25, 2011 at 2:03 PM, Dmitry Kan dmitry@gmail.com wrote:
  Hi Otis,
 
  Thanks for elaborating on this and the link (funny!).
 
  I have quite a big dataset growing all the time. The problems that I
 start
  facing are pretty much predictable:
  1. Scalability: this includes indexing time (now some days!, better hours
 or
  even minutes, if that's possible) along with handling the rapid growth
  2. Robustness: the entire system (distributed or single server or
 anything
  else) should be fault-tolerant, e.g. if one shard goes down, other
 catches
  up (master-slave scheme)
  3. Some apps that we run on SOLR are pretty computationally demanding,
 like
  faceting over one+bi+trigrams of hundreds of millions of documents (index
  size of half a TB) --- single server with a shard of data does not seem
 to
  be enough for realtime search.
 
  This is just for a bit of a background. I agree with you on that hadoop
 and
  cloud probably best suit massive batch processes rather than realtime
  search. I'm sure, if anyone out there made SOLR shine through the cloud
 for
  realtime search over large datasets.
 
  By SOLR on the cloud (e.g. HDFS + MR +  cloud of
  commodity machines) I mean what you've done for your customers using
 EC2.
  Any chance, the guidelines/articles for/on setting indices on HDFS are
  available in some open / paid area?
 
  To sum this up, I didn't mean to create a buzz on the cloud solutions in
  this thread, just was wondering what is practically available / going on
 in
  SOLR development in this regard.
 
  Thanks,
 
  Dmitry
 
 
  On Fri, Mar 25, 2011 at 10:28 PM, Otis Gospodnetic 
  otis_gospodne...@yahoo.com wrote:
 
  Hi Dan,
 
  This feels a bit like a buzzword soup with mushrooms. :)
 
  MR jobs, at least the ones in Hadoopland, are very batch oriented, so
 that
  wouldn't be very suitable for most search applications.  There are some
  technologies like Riak that combine MR and search.  Let me use this
 funny
  little
  link: http://lmgtfy.com/?q=riak%20mapreduce%20search
 
 
  Sure, you can put indices on HDFS (but don't expect searches to be
 fast).
   Sure
  you can create indices using MapReduce, we've done that successfully for
  customers bringing long indexing jobs from many hours to minutes by
 using,
  yes,
  a cluster of machines (actually EC2 instances).
  But when you say more into SOLR on the cloud (e.g. HDFS + MR +  cloud
 of
  commodity machines), I can't actually picture what precisely you
 mean...
 
 
  Otis
  ---
  Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
  Lucene ecosystem search :: http://search-lucene.com/
 
 
 
  - Original Message 
   From: Dmitry Kan dmitry@gmail.com
   To: solr-user@lucene.apache.org
   Cc: Upayavira u...@odoko.co.uk
   Sent: Fri, March 25, 2011 8:26:33 AM
   Subject: Re: solr on the cloud
  
   Hi, Upayavira
  
   Probably I'm confusing the terms here. When I say  distributed
 faceting
  I'm
   more into SOLR on the cloud (e.g. HDFS + MR +  cloud of commodity
  machines)
   rather than into traditional multicore/sharded  SOLR on a single or
  multiple
   servers with non-distributed file systems (is  that what you mean when
  you
   refer to distribution of facet requests across  hosts?)
  
   On Fri, Mar 25, 2011 at 1:57 PM, Upayavira u...@odoko.co.uk  wrote:
  
   
   
On Fri, 25 Mar 2011 13:44 +0200, Dmitry Kan  
 dmitry@gmail.com
 wrote:
 Hi Yonik,

 Oh, this is great. Is  distributed faceting available in the
 trunk?
  What
 is
  the basic server setup needed for trying this out, is it cloud
 with
  HDFS
  and
 SOLR with zookepers?
 Any chance to see the  related documentation? :)
   
Distributed faceting has been  available for a long time, and is
available in the 1.4.1  release.
   
The distribution of facet requests across hosts happens  in the
background. There's no real difference (in query syntax) between  a
standard facet query and a distributed one.
   
i.e. you  don't need SolrCloud nor Zookeeper for it. (they may
 provide
other  benefits, but you don't need them for distributed faceting).
   
 Upayavira
   
 On Fri, Mar 25, 2011 at 1:35 PM, Yonik  Seeley
 yo...@lucidimagination.comwrote:
 
  On Tue, Mar 22, 2011 at 7:51 AM, Dmitry Kan 
 dmitry@gmail.com
 wrote:
   Basically, of high interest is checking out the  Map-Reduce
 for
  distributed
   faceting, is  it even possible with the trunk?
 
  Solr  already has

Re: solr on the cloud

2011-03-25 Thread Dmitry Kan
Hi Otis,

Ok, thanks.

No, the question about distributed faceting was in a 'guess' mode as
faceting seems to be a good fit to MR. I probably need to follow the jira
tickets more closely for a follow-up, but was initially wondering if I had
missed some documentation on the topic, which apparently I hadn't.

On Fri, Mar 25, 2011 at 5:34 AM, Otis Gospodnetic 
otis_gospodne...@yahoo.com wrote:

 Hi,


  I have tried running the sharded solr with zoo keeper on a  single
 machine.

  The SOLR code is from current trunk. It runs nicely. Can you  please
 point me
  to a page, where I can check the status of the solr on the  cloud
 development
  and available features, apart from http://wiki.apache.org/solr/SolrCloud?

 I'm afraid that's the most comprehensive documentation so far.

  Basically, of high interest  is checking out the Map-Reduce for
 distributed
  faceting, is it even possible  with the trunk?

 Hm, MR for distributed faceting?  Maybe I missed this... can you point to a
 place that mentions this?

 Otis
 
 Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
 Lucene ecosystem search :: http://search-lucene.com/




-- 
Regards,

Dmitry Kan


Re: solr on the cloud

2011-03-25 Thread Yonik Seeley
On Tue, Mar 22, 2011 at 7:51 AM, Dmitry Kan dmitry@gmail.com wrote:
 Basically, of high interest is checking out the Map-Reduce for distributed
 faceting, is it even possible with the trunk?

Solr already has distributed faceting, and it's much more performant
than a map-reduce implementation would be.

I've also seen a product use the term map reduce incorrectly... as in,
we map the request to each shard, and then reduce the results to a
single list (of course, that's not actually map-reduce at all ;-)

-Yonik
http://www.lucenerevolution.org -- Lucene/Solr User Conference, May
25-26, San Francisco


Re: solr on the cloud

2011-03-25 Thread Dmitry Kan
Hi Yonik,

Oh, this is great. Is distributed faceting available in the trunk? What is
the basic server setup needed for trying this out: is it a cloud with HDFS and
SOLR with ZooKeeper?
Any chance to see the related documentation? :)

On Fri, Mar 25, 2011 at 1:35 PM, Yonik Seeley yo...@lucidimagination.comwrote:

 On Tue, Mar 22, 2011 at 7:51 AM, Dmitry Kan dmitry@gmail.com wrote:
  Basically, of high interest is checking out the Map-Reduce for
 distributed
  faceting, is it even possible with the trunk?

 Solr already has distributed faceting, and it's much more performant
 than a map-reduce implementation would be.

 I've also seen a product use the term map reduce incorrectly... as in,
 we map the request to each shard, and then reduce the results to a
 single list (of course, that's not actually map-reduce at all ;-)


:) this sounds pretty strange to me as well. It was only my guess that if
you have MR as the computational model and a cloud beneath it, you could
naturally map facet fields to their counts inside individual documents (no
matter where they are, be it shards or a single index) and pass them on to
reducers.


 -Yonik
 http://www.lucenerevolution.org -- Lucene/Solr User Conference, May
 25-26, San Francisco




-- 
Regards,

Dmitry Kan


Re: solr on the cloud

2011-03-25 Thread Upayavira


On Fri, 25 Mar 2011 13:44 +0200, Dmitry Kan dmitry@gmail.com
wrote:
 Hi Yonik,
 
 Oh, this is great. Is distributed faceting available in the trunk? What
 is
 the basic server setup needed for trying this out, is it cloud with HDFS
 and
 SOLR with zookepers?
 Any chance to see the related documentation? :)

Distributed faceting has been available for a long time, and is
available in the 1.4.1 release.

The distribution of facet requests across hosts happens in the
background. There's no real difference (in query syntax) between a
standard facet query and a distributed one.

i.e. you don't need SolrCloud nor Zookeeper for it. (they may provide
other benefits, but you don't need them for distributed faceting).
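A small sketch of that point (host names and the facet field are hypothetical): the only syntactic difference for a distributed facet request is the `shards` parameter; Solr fans the query out to each listed core and merges the facet counts in the background.

```python
from urllib.parse import urlencode

shards = [
    "host1:8983/solr",   # hypothetical shard locations
    "host2:8983/solr",
]
params = urlencode({
    "q": "*:*",
    "rows": 0,
    "facet": "true",
    "facet.field": "category",   # assumed facet field
    "shards": ",".join(shards),  # this parameter alone distributes the request
})
url = "http://host1:8983/solr/select?" + params
```

The node receiving the request acts as the aggregator; any of the listed hosts can play that role.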

Upayavira

 On Fri, Mar 25, 2011 at 1:35 PM, Yonik Seeley
 yo...@lucidimagination.comwrote:
 
  On Tue, Mar 22, 2011 at 7:51 AM, Dmitry Kan dmitry@gmail.com wrote:
   Basically, of high interest is checking out the Map-Reduce for
  distributed
   faceting, is it even possible with the trunk?
 
  Solr already has distributed faceting, and it's much more performant
  than a map-reduce implementation would be.
 
  I've also seen a product use the term map reduce incorrectly... as in,
  we map the request to each shard, and then reduce the results to a
  single list (of course, that's not actually map-reduce at all ;-)
 
 
 :) this sounds pretty strange to me as well. It was only my guess, that
 if
 you have MR as computational model and a cloud beneath it, you could
 naturally map facet fields to their counts inside single documents (no
 matter, where they are, be it shards or single index) and pass them
 onto
 reducers.
 
 
  -Yonik
  http://www.lucenerevolution.org -- Lucene/Solr User Conference, May
  25-26, San Francisco
 
 
 
 
 -- 
 Regards,
 
 Dmitry Kan
 
--- 
Enterprise Search Consultant at Sourcesense UK, 
Making Sense of Open Source



Re: solr on the cloud

2011-03-25 Thread Dmitry Kan
Hi, Upayavira

Probably I'm confusing the terms here. When I say distributed faceting I'm
more into SOLR on the cloud (e.g. HDFS + MR + cloud of commodity machines)
rather than into traditional multicore/sharded SOLR on a single or multiple
servers with non-distributed file systems (is that what you mean when you
refer to distribution of facet requests across hosts?)

On Fri, Mar 25, 2011 at 1:57 PM, Upayavira u...@odoko.co.uk wrote:



 On Fri, 25 Mar 2011 13:44 +0200, Dmitry Kan dmitry@gmail.com
 wrote:
  Hi Yonik,
 
  Oh, this is great. Is distributed faceting available in the trunk? What
  is
  the basic server setup needed for trying this out, is it cloud with HDFS
  and
  SOLR with zookepers?
  Any chance to see the related documentation? :)

 Distributed faceting has been available for a long time, and is
 available in the 1.4.1 release.

 The distribution of facet requests across hosts happens in the
 background. There's no real difference (in query syntax) between a
 standard facet query and a distributed one.

 i.e. you don't need SolrCloud nor Zookeeper for it. (they may provide
 other benefits, but you don't need them for distributed faceting).

 Upayavira

  On Fri, Mar 25, 2011 at 1:35 PM, Yonik Seeley
  yo...@lucidimagination.comwrote:
 
   On Tue, Mar 22, 2011 at 7:51 AM, Dmitry Kan dmitry@gmail.com
 wrote:
Basically, of high interest is checking out the Map-Reduce for
   distributed
faceting, is it even possible with the trunk?
  
   Solr already has distributed faceting, and it's much more performant
   than a map-reduce implementation would be.
  
   I've also seen a product use the term map reduce incorrectly... as
 in,
   we map the request to each shard, and then reduce the results to a
   single list (of course, that's not actually map-reduce at all ;-)
  
  
  :) this sounds pretty strange to me as well. It was only my guess, that
  if
  you have MR as computational model and a cloud beneath it, you could
  naturally map facet fields to their counts inside single documents (no
  matter, where they are, be it shards or single index) and pass them
  onto
  reducers.
 
 
   -Yonik
   http://www.lucenerevolution.org -- Lucene/Solr User Conference, May
   25-26, San Francisco
  
 
 
 
  --
  Regards,
 
  Dmitry Kan
 
 ---
 Enterprise Search Consultant at Sourcesense UK,
 Making Sense of Open Source




-- 
Regards,

Dmitry Kan


Re: solr on the cloud

2011-03-25 Thread Upayavira


On Fri, 25 Mar 2011 14:26 +0200, Dmitry Kan dmitry@gmail.com
wrote:
 Hi, Upayavira
 
 Probably I'm confusing the terms here. When I say distributed faceting
 I'm
 more into SOLR on the cloud (e.g. HDFS + MR + cloud of commodity
 machines)
 rather than into traditional multicore/sharded SOLR on a single or
 multiple
 servers with non-distributed file systems (is that what you mean when you
 refer to distribution of facet requests across hosts?)

I mean the latter I am afraid. I'm very interested in how the former
might be implemented, but as far as I understand it, Zookeeper does not
take you all the way there. It co-ordinates nodes (e.g. telling a slave
where its master is), but if you have to distribute an index over
multiple hosts, it will be sharded between multiple solr hosts, with
each of those hosts having a local index.

You are presumably talking about a scenario where you effectively have
one index, spanning multiple hosts (we have code to distribute queries
across multiple segments, why can't we do it across multiple hosts?).
I've heard of work to do this with Infinispan underneath, but not within
the core Lucene/Solr.

Upayavira

 On Fri, Mar 25, 2011 at 1:57 PM, Upayavira u...@odoko.co.uk wrote:
 
 
 
  On Fri, 25 Mar 2011 13:44 +0200, Dmitry Kan dmitry@gmail.com
  wrote:
   Hi Yonik,
  
   Oh, this is great. Is distributed faceting available in the trunk? What
   is
   the basic server setup needed for trying this out, is it cloud with HDFS
   and
   SOLR with zookepers?
   Any chance to see the related documentation? :)
 
  Distributed faceting has been available for a long time, and is
  available in the 1.4.1 release.
 
  The distribution of facet requests across hosts happens in the
  background. There's no real difference (in query syntax) between a
  standard facet query and a distributed one.
 
  i.e. you don't need SolrCloud nor Zookeeper for it. (they may provide
  other benefits, but you don't need them for distributed faceting).
 
  Upayavira
 
   On Fri, Mar 25, 2011 at 1:35 PM, Yonik Seeley
   yo...@lucidimagination.comwrote:
  
On Tue, Mar 22, 2011 at 7:51 AM, Dmitry Kan dmitry@gmail.com
  wrote:
 Basically, of high interest is checking out the Map-Reduce for
distributed
 faceting, is it even possible with the trunk?
   
Solr already has distributed faceting, and it's much more performant
than a map-reduce implementation would be.
   
I've also seen a product use the term map reduce incorrectly... as
  in,
we map the request to each shard, and then reduce the results to a
single list (of course, that's not actually map-reduce at all ;-)
   
   
   :) this sounds pretty strange to me as well. It was only my guess, that
   if
   you have MR as computational model and a cloud beneath it, you could
   naturally map facet fields to their counts inside single documents (no
   matter, where they are, be it shards or single index) and pass them
   onto
   reducers.
  
  
-Yonik
http://www.lucenerevolution.org -- Lucene/Solr User Conference, May
25-26, San Francisco
   
  
  
  
   --
   Regards,
  
   Dmitry Kan
  
  ---
  Enterprise Search Consultant at Sourcesense UK,
  Making Sense of Open Source
 
 
 
 
 -- 
 Regards,
 
 Dmitry Kan
 
--- 
Enterprise Search Consultant at Sourcesense UK, 
Making Sense of Open Source



Re: solr on the cloud

2011-03-25 Thread Otis Gospodnetic
Hi Dan,

This feels a bit like a buzzword soup with mushrooms. :)

MR jobs, at least the ones in Hadoopland, are very batch oriented, so that 
wouldn't be very suitable for most search applications.  There are some 
technologies like Riak that combine MR and search.  Let me use this funny 
little 
link: http://lmgtfy.com/?q=riak%20mapreduce%20search


Sure, you can put indices on HDFS (but don't expect searches to be fast).  Sure 
you can create indices using MapReduce, we've done that successfully for 
customers bringing long indexing jobs from many hours to minutes by using, yes, 
a cluster of machines (actually EC2 instances).
But when you say more into SOLR on the cloud (e.g. HDFS + MR +  cloud of 
commodity machines), I can't actually picture what precisely you mean...  


Otis
---
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
 From: Dmitry Kan dmitry@gmail.com
 To: solr-user@lucene.apache.org
 Cc: Upayavira u...@odoko.co.uk
 Sent: Fri, March 25, 2011 8:26:33 AM
 Subject: Re: solr on the cloud
 
 Hi, Upayavira
 
 Probably I'm confusing the terms here. When I say  distributed faceting I'm
 more into SOLR on the cloud (e.g. HDFS + MR +  cloud of commodity machines)
 rather than into traditional multicore/sharded  SOLR on a single or multiple
 servers with non-distributed file systems (is  that what you mean when you
 refer to distribution of facet requests across  hosts?)
 
 On Fri, Mar 25, 2011 at 1:57 PM, Upayavira u...@odoko.co.uk  wrote:
 
 
 
  On Fri, 25 Mar 2011 13:44 +0200, Dmitry Kan  dmitry@gmail.com
   wrote:
   Hi Yonik,
  
   Oh, this is great. Is  distributed faceting available in the trunk? What
   is
the basic server setup needed for trying this out, is it cloud with HDFS
and
   SOLR with zookepers?
   Any chance to see the  related documentation? :)
 
  Distributed faceting has been  available for a long time, and is
  available in the 1.4.1  release.
 
  The distribution of facet requests across hosts happens  in the
  background. There's no real difference (in query syntax) between  a
  standard facet query and a distributed one.
 
  i.e. you  don't need SolrCloud nor Zookeeper for it. (they may provide
  other  benefits, but you don't need them for distributed faceting).
 
   Upayavira
 
   On Fri, Mar 25, 2011 at 1:35 PM, Yonik  Seeley
   yo...@lucidimagination.comwrote:
   
On Tue, Mar 22, 2011 at 7:51 AM, Dmitry Kan dmitry@gmail.com
   wrote:
 Basically, of high interest is checking out the  Map-Reduce for
distributed
 faceting, is  it even possible with the trunk?
   
Solr  already has distributed faceting, and it's much more performant
 than a map-reduce implementation would be.
   
 I've also seen a product use the term map reduce incorrectly...  as
  in,
we map the request to each shard, and then  reduce the results to a
single list (of course, that's not  actually map-reduce at all ;-)
   
   
:) this sounds pretty strange to me as well. It was only my guess, that
if
   you have MR as computational model and a cloud beneath it,  you could
   naturally map facet fields to their counts inside single  documents (no
   matter, where they are, be it shards or single  index) and pass them
   onto
   reducers.
   
  
-Yonik
http://www.lucenerevolution.org -- Lucene/Solr User Conference, May
 25-26, San Francisco
   
  
   
  
   --
   Regards,
  
Dmitry Kan
  
  ---
  Enterprise Search Consultant at  Sourcesense UK,
  Making Sense of Open  Source
 
 
 
 
 -- 
 Regards,
 
 Dmitry Kan
 


Re: solr on the cloud

2011-03-25 Thread Dmitry Kan
Hi Otis,

Thanks for elaborating on this and the link (funny!).

I have quite a big dataset growing all the time. The problems that I start
facing are pretty much predictable:
1. Scalability: this includes indexing time (now some days! better hours or
even minutes, if that's possible) along with handling the rapid growth
2. Robustness: the entire system (distributed or single server or anything
else) should be fault-tolerant, e.g. if one shard goes down, another catches
up (master-slave scheme)
3. Some apps that we run on SOLR are pretty computationally demanding, like
faceting over one+bi+trigrams of hundreds of millions of documents (index
size of half a TB) --- single server with a shard of data does not seem to
be enough for realtime search.

This is just for a bit of a background. I agree with you on that hadoop and
cloud probably best suit massive batch processes rather than realtime
search. I'm curious if anyone out there has made SOLR shine through the cloud for
realtime search over large datasets.

By SOLR on the cloud (e.g. HDFS + MR +  cloud of
commodity machines) I mean what you've done for your customers using EC2.
Any chance the guidelines/articles on setting indices on HDFS are
available in some open / paid area?

To sum this up, I didn't mean to create a buzz on the cloud solutions in
this thread, just was wondering what is practically available / going on in
SOLR development in this regard.

Thanks,

Dmitry


On Fri, Mar 25, 2011 at 10:28 PM, Otis Gospodnetic 
otis_gospodne...@yahoo.com wrote:

 Hi Dan,

 This feels a bit like a buzzword soup with mushrooms. :)

 MR jobs, at least the ones in Hadoopland, are very batch oriented, so that
 wouldn't be very suitable for most search applications.  There are some
 technologies like Riak that combine MR and search.  Let me use this funny
 little
 link: http://lmgtfy.com/?q=riak%20mapreduce%20search


 Sure, you can put indices on HDFS (but don't expect searches to be fast).
  Sure
 you can create indices using MapReduce, we've done that successfully for
 customers bringing long indexing jobs from many hours to minutes by using,
 yes,
 a cluster of machines (actually EC2 instances).
 But when you say more into SOLR on the cloud (e.g. HDFS + MR +  cloud of
 commodity machines), I can't actually picture what precisely you mean...


 Otis
 ---
 Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
 Lucene ecosystem search :: http://search-lucene.com/



 - Original Message 
  From: Dmitry Kan dmitry@gmail.com
  To: solr-user@lucene.apache.org
  Cc: Upayavira u...@odoko.co.uk
  Sent: Fri, March 25, 2011 8:26:33 AM
  Subject: Re: solr on the cloud
 
  Hi, Upayavira
 
  Probably I'm confusing the terms here. When I say  distributed faceting
 I'm
  more into SOLR on the cloud (e.g. HDFS + MR +  cloud of commodity
 machines)
  rather than into traditional multicore/sharded  SOLR on a single or
 multiple
  servers with non-distributed file systems (is  that what you mean when
 you
  refer to distribution of facet requests across  hosts?)
 
  On Fri, Mar 25, 2011 at 1:57 PM, Upayavira u...@odoko.co.uk  wrote:
 
  
  
   On Fri, 25 Mar 2011 13:44 +0200, Dmitry Kan  dmitry@gmail.com
wrote:
Hi Yonik,
   
Oh, this is great. Is  distributed faceting available in the trunk?
 What
is
 the basic server setup needed for trying this out, is it cloud with
 HDFS
 and
SOLR with zookepers?
Any chance to see the  related documentation? :)
  
   Distributed faceting has been  available for a long time, and is
   available in the 1.4.1  release.
  
   The distribution of facet requests across hosts happens  in the
   background. There's no real difference (in query syntax) between  a
   standard facet query and a distributed one.
  
   i.e. you  don't need SolrCloud nor Zookeeper for it. (they may provide
   other  benefits, but you don't need them for distributed faceting).
  
Upayavira
  
On Fri, Mar 25, 2011 at 1:35 PM, Yonik  Seeley
yo...@lucidimagination.comwrote:

 On Tue, Mar 22, 2011 at 7:51 AM, Dmitry Kan dmitry@gmail.com
wrote:
  Basically, of high interest is checking out the  Map-Reduce for
 distributed
  faceting, is  it even possible with the trunk?

 Solr  already has distributed faceting, and it's much more
 performant
  than a map-reduce implementation would be.

  I've also seen a product use the term map reduce incorrectly...
  as
   in,
 we map the request to each shard, and then  reduce the results
 to a
 single list (of course, that's not  actually map-reduce at all ;-)


 :) this sounds pretty strange to me as well. It was only my guess,
 that
 if
you have MR as computational model and a cloud beneath it,  you could
naturally map facet fields to their counts inside single  documents
 (no
matter, where they are, be it shards or single  index) and pass
 them
onto
reducers.

   
 -Yonik

Re: solr on the cloud

2011-03-25 Thread Jason Rutherglen
Dmitry,

If you're planning on using HBase you can take a look at
https://issues.apache.org/jira/browse/HBASE-3529  I think we may even
have a reasonable solution for reading the index [randomly] out of
HDFS.  Benchmarking'll be implemented next.  It's not production
ready, suggestions are welcome.

Jason

On Fri, Mar 25, 2011 at 2:03 PM, Dmitry Kan dmitry@gmail.com wrote:
 Hi Otis,

 Thanks for elaborating on this and the link (funny!).

 I have quite a big dataset growing all the time. The problems that I start
 facing are pretty much predictable:
 1. Scalability: this includes indexing time (now some days!, better hours or
 even minutes, if that's possible) along with handling the rapid growth
 2. Robustness: the entire system (distributed or single server or anything
 else) should be fault-tolerant, e.g. if one shard goes down, other catches
 up (master-slave scheme)
 3. Some apps that we run on SOLR are pretty computationally demanding, like
 faceting over one+bi+trigrams of hundreds of millions of documents (index
 size of half a TB) --- single server with a shard of data does not seem to
 be enough for realtime search.

 This is just for a bit of a background. I agree with you on that hadoop and
 cloud probably best suit massive batch processes rather than realtime
 search. I'm sure, if anyone out there made SOLR shine through the cloud for
 realtime search over large datasets.

 By SOLR on the cloud (e.g. HDFS + MR +  cloud of
 commodity machines) I mean what you've done for your customers using EC2.
 Any chance, the guidelines/articles for/on setting indices on HDFS are
 available in some open / paid area?

 To sum this up, I didn't mean to create a buzz on the cloud solutions in
 this thread, just was wondering what is practically available / going on in
 SOLR development in this regard.

 Thanks,

 Dmitry


 On Fri, Mar 25, 2011 at 10:28 PM, Otis Gospodnetic 
 otis_gospodne...@yahoo.com wrote:

 Hi Dan,

 This feels a bit like a buzzword soup with mushrooms. :)

 MR jobs, at least the ones in Hadoopland, are very batch oriented, so that
 wouldn't be very suitable for most search applications.  There are some
 technologies like Riak that combine MR and search.  Let me use this funny
 little
 link: http://lmgtfy.com/?q=riak%20mapreduce%20search


 Sure, you can put indices on HDFS (but don't expect searches to be fast).
  Sure
 you can create indices using MapReduce, we've done that successfully for
 customers bringing long indexing jobs from many hours to minutes by using,
 yes,
 a cluster of machines (actually EC2 instances).
 But when you say more into SOLR on the cloud (e.g. HDFS + MR +  cloud of
 commodity machines), I can't actually picture what precisely you mean...


 Otis
 ---
 Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
 Lucene ecosystem search :: http://search-lucene.com/



 - Original Message 
  From: Dmitry Kan dmitry@gmail.com
  To: solr-user@lucene.apache.org
  Cc: Upayavira u...@odoko.co.uk
  Sent: Fri, March 25, 2011 8:26:33 AM
  Subject: Re: solr on the cloud
 
  Hi, Upayavira
 
  Probably I'm confusing the terms here. When I say  distributed faceting
 I'm
  more into SOLR on the cloud (e.g. HDFS + MR +  cloud of commodity
 machines)
  rather than into traditional multicore/sharded  SOLR on a single or
 multiple
  servers with non-distributed file systems (is  that what you mean when
 you
  refer to distribution of facet requests across  hosts?)
 
  On Fri, Mar 25, 2011 at 1:57 PM, Upayavira u...@odoko.co.uk  wrote:
 
  
  
   On Fri, 25 Mar 2011 13:44 +0200, Dmitry Kan  dmitry@gmail.com
    wrote:
Hi Yonik,
   
Oh, this is great. Is  distributed faceting available in the trunk?
 What
is
 the basic server setup needed for trying this out, is it cloud with
 HDFS
     and
SOLR with zookepers?
Any chance to see the  related documentation? :)
  
   Distributed faceting has been  available for a long time, and is
   available in the 1.4.1  release.
  
   The distribution of facet requests across hosts happens  in the
   background. There's no real difference (in query syntax) between  a
   standard facet query and a distributed one.
  
   i.e. you  don't need SolrCloud nor Zookeeper for it. (they may provide
   other  benefits, but you don't need them for distributed faceting).
  
    Upayavira
  
On Fri, Mar 25, 2011 at 1:35 PM, Yonik  Seeley
yo...@lucidimagination.comwrote:
    
 On Tue, Mar 22, 2011 at 7:51 AM, Dmitry Kan dmitry@gmail.com
    wrote:
  Basically, of high interest is checking out the  Map-Reduce for
 distributed
  faceting, is  it even possible with the trunk?

 Solr  already has distributed faceting, and it's much more
 performant
  than a map-reduce implementation would be.

  I've also seen a product use the term map reduce incorrectly...
  as
   in,
 we map the request to each shard, and then  reduce the results
 to a
 single list (of course

Re: solr on the cloud

2011-03-24 Thread Otis Gospodnetic
Hi,


 I have tried running the sharded solr with zoo keeper on a  single machine.

 The SOLR code is from current trunk. It runs nicely. Can you  please point me
 to a page, where I can check the status of the solr on the  cloud development
 and available features, apart from http://wiki.apache.org/solr/SolrCloud ?

I'm afraid that's the most comprehensive documentation so far.

 Basically, of high interest  is checking out the Map-Reduce for distributed
 faceting, is it even possible  with the trunk?

Hm, MR for distributed faceting?  Maybe I missed this... can you point to a 
place that mentions this?

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/


solr on the cloud

2011-03-22 Thread Dmitry Kan
hey folks,

I have tried running the sharded solr with ZooKeeper on a single machine.
The SOLR code is from current trunk. It runs nicely. Can you please point me
to a page, where I can check the status of the solr on the cloud development
and available features, apart from http://wiki.apache.org/solr/SolrCloud ?

Basically, of high interest is checking out the Map-Reduce for distributed
faceting, is it even possible with the trunk?

-- 
Regards,

Dmitry Kan
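
As Upayavira notes elsewhere in the thread, a distributed facet query
uses exactly the same facet syntax as a single-node one; only the shards
parameter fans the request out. A minimal Python sketch of composing such
a request URL follows; the host names and the "category" field are made-up
examples, not taken from the thread.

```python
# Hypothetical sketch: building a sharded Solr /select URL. The facet
# parameters are ordinary single-node syntax; adding "shards" is what
# makes the request distributed.
from urllib.parse import urlencode

def sharded_query_url(base, params, shards):
    """Compose a /select URL that fans the request out over the given shards."""
    query = dict(params)
    query["shards"] = ",".join(shards)  # comma-separated shard addresses
    return f"{base}/select?{urlencode(query)}"

url = sharded_query_url(
    "http://localhost:8983/solr",
    {"q": "*:*", "facet": "true", "facet.field": "category"},
    ["host1:8983/solr", "host2:8983/solr"],
)
print(url)
```

Dropping the shards parameter yields a plain single-node facet query with
otherwise identical syntax.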


git repo for branch_3x + SOLR-1873 (Solr Cloud)

2010-11-22 Thread Jeremy Hinegardner
Hi all,

I've done an initial backport of SOLR-1873 (Solr Cloud) to branch_3x.  I will do
merges from branch_3x periodically.  Currently this passes all tests.  

https://github.com/collectiveintellect/lucene-solr/tree/branch_3x-cloud

We need a stable Solr Cloud system and this was our best guess on how that
should be done.  Does that sound right?

enjoy,

-jeremy

-- 
Jeremy Hinegardner  jer...@hinegardner.org



Re: Status of Solr in the cloud?

2010-08-27 Thread Markus Jelsma
That would be Solr 4.0, or maybe 3.1 first.

http://wiki.apache.org/solr/Solr3.1
http://wiki.apache.org/solr/Solr4.0


On Thursday 26 August 2010 23:58:25 Charlie Jackson wrote:
 There seem to be a few parallel efforts at putting Solr in a cloud
 configuration. See http://wiki.apache.org/solr/KattaIntegration, which
 is based off of https://issues.apache.org/jira/browse/SOLR-1395. Also
 http://wiki.apache.org/solr/SolrCloud which is
 https://issues.apache.org/jira/browse/SOLR-1873. And another JIRA:
 https://issues.apache.org/jira/browse/SOLR-1301.
 
 These all seem aimed at the same goal, correct? I'm interested in
 evaluating one of these solutions for my company; which is the most
 stable or most likely to eventually be part of the Solr distribution?
 
 Thanks,
 
 Charlie
 

Markus Jelsma - Technisch Architect - Buyways BV
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350



Status of Solr in the cloud?

2010-08-26 Thread Charlie Jackson
There seem to be a few parallel efforts at putting Solr in a cloud
configuration. See http://wiki.apache.org/solr/KattaIntegration, which
is based off of https://issues.apache.org/jira/browse/SOLR-1395. Also
http://wiki.apache.org/solr/SolrCloud which is
https://issues.apache.org/jira/browse/SOLR-1873. And another JIRA:
https://issues.apache.org/jira/browse/SOLR-1301.

These all seem aimed at the same goal, correct? I'm interested in
evaluating one of these solutions for my company; which is the most
stable or most likely to eventually be part of the Solr distribution?

Thanks,

Charlie