Re: maxBooleanClauses change in solr.xml not reflecting in solr 8.4.1
Thanks Hoss. Yes, I was making the change in solr.xml in the wrong directory earlier.

Also, as you said:

: You need to update EVERY solrconfig.xml that the JVM is loading for this to
: actually work.

that has not been true for a while, see SOLR-13336 / SOLR-10921 ...

I validated this and it is working as expected. We do not need to update every solrconfig.xml. The value mentioned in solr.xml is global, and if maxBooleanClauses for any collection in solrconfig.xml exceeds the limit specified in solr.xml, then we get the exception. Thanks for replying.

On Wed, Jan 6, 2021 at 10:57 PM dinesh naik wrote:
> Thanks Shawn,
>
> This entry ${solr.max.booleanClauses:2048} in solr.xml was introduced only
> in the Solr 8.x versions and was not present in 7.6.
>
> We have this in solrconfig.xml in the 8.4.1 version:
> <maxBooleanClauses>${solr.max.booleanClauses:2048}</maxBooleanClauses>
>
> I was updating the solr.xml in the installation directory and not the
> installed data directory, hence the change was not reflecting.
> After updating the correct solr.xml and restarting the Solr nodes, the new
> value is working as expected.
>
> On Wed, Jan 6, 2021 at 10:34 PM Chris Hostetter wrote:
>
>> : You need to update EVERY solrconfig.xml that the JVM is loading for
>> : this to actually work.
>>
>> that has not been true for a while, see SOLR-13336 / SOLR-10921 ...
>>
>> : > 2. updated solr.xml :
>> : > <int name="maxBooleanClauses">${solr.max.booleanClauses:2048}</int>
>> :
>> : I don't think it's currently possible to set the value with solr.xml.
>>
>> Not only is it possible, it's necessary -- the value in solr.xml acts as
>> a hard upper limit (and affects all queries, even internally expanded
>> queries) on the "soft limit" in solrconfig.xml (that only affects
>> explicitly supplied boolean queries from users).
>>
>> As to the original question...
>>
>> > 2021-01-05 14:03:59.603 WARN (qtp1545077099-27) x:col1_shard1_replica_n3
>> > o.a.s.c.SolrConfig solrconfig.xml: <maxBooleanClauses> of 2048 is greater
>> > than global limit of 1024 and will have no effect
>>
>> I attempted to reproduce this with 8.4.1 and did not see the problem you
>> are describing.
>>
>> Are you 100% certain you are updating the correct solr.xml file? If you
>> add some non-XML gibberish to the solr.xml you are editing, does the Solr
>> node fail to start up?
>>
>> Remember that when using SolrCloud, Solr will try to load solr.xml from ZK
>> first, and only look on local disk if it can't be found in ZK ... look for
>> log messages like "solr.xml found in ZooKeeper. Loading..." vs "Loading
>> solr.xml from SolrHome (not found in ZooKeeper)".
>>
>> -Hoss
>> http://www.lucidworks.com/
>
> --
> Best Regards,
> Dinesh Naik

--
Best Regards,
Dinesh Naik
Re: maxBooleanClauses change in solr.xml not reflecting in solr 8.4.1
Thanks Shawn,

This entry ${solr.max.booleanClauses:2048} in solr.xml was introduced only in the Solr 8.x versions and was not present in 7.6.

We have this in solrconfig.xml in the 8.4.1 version:
<maxBooleanClauses>${solr.max.booleanClauses:2048}</maxBooleanClauses>

I was updating the solr.xml in the installation directory and not the installed data directory, hence the change was not reflecting. After updating the correct solr.xml and restarting the Solr nodes, the new value is working as expected.

On Wed, Jan 6, 2021 at 10:34 PM Chris Hostetter wrote:
>
> : You need to update EVERY solrconfig.xml that the JVM is loading for
> : this to actually work.
>
> that has not been true for a while, see SOLR-13336 / SOLR-10921 ...
>
> : > 2. updated solr.xml :
> : > <int name="maxBooleanClauses">${solr.max.booleanClauses:2048}</int>
> :
> : I don't think it's currently possible to set the value with solr.xml.
>
> Not only is it possible, it's necessary -- the value in solr.xml acts as
> a hard upper limit (and affects all queries, even internally expanded
> queries) on the "soft limit" in solrconfig.xml (that only affects
> explicitly supplied boolean queries from users).
>
> As to the original question...
>
> > > 2021-01-05 14:03:59.603 WARN (qtp1545077099-27) x:col1_shard1_replica_n3
> > > o.a.s.c.SolrConfig solrconfig.xml: <maxBooleanClauses> of 2048 is greater
> > > than global limit of 1024 and will have no effect
>
> I attempted to reproduce this with 8.4.1 and did not see the problem you
> are describing.
>
> Are you 100% certain you are updating the correct solr.xml file? If you
> add some non-XML gibberish to the solr.xml you are editing, does the Solr
> node fail to start up?
>
> Remember that when using SolrCloud, Solr will try to load solr.xml from ZK
> first, and only look on local disk if it can't be found in ZK ... look for
> log messages like "solr.xml found in ZooKeeper. Loading..." vs "Loading
> solr.xml from SolrHome (not found in ZooKeeper)".
>
> -Hoss
> http://www.lucidworks.com/

--
Best Regards,
Dinesh Naik
maxBooleanClauses change in solr.xml not reflecting in solr 8.4.1
Hi all,
I want to update maxBooleanClauses to 2048 (from the default value of 1024). Below are the steps tried:

1. Updated solrconfig.xml:
   <maxBooleanClauses>${solr.max.booleanClauses:2048}</maxBooleanClauses>
2. Updated solr.xml:
   <int name="maxBooleanClauses">${solr.max.booleanClauses:2048}</int>
3. Restarted the Solr nodes.
4. Tried a query with more than 2000 OR clauses and got the below warning messages in the Solr logs:

2021-01-05 14:03:59.603 WARN (qtp1545077099-27) x:col1_shard1_replica_n3 o.a.s.c.SolrConfig solrconfig.xml: <maxBooleanClauses> of 2048 is greater than global limit of 1024 and will have no effect
2021-01-05 14:03:59.603 WARN (qtp1545077099-27) x:col1_shard1_replica_n3 o.a.s.c.SolrConfig set 'maxBooleanClauses' in solr.xml to increase global limit

Note: In the 7.6.1 version we just need to change solrconfig.xml and it works. Kindly let me know if I am missing something to make it work in the 8.4.1 version.

--
Best Regards,
Dinesh Naik
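For reference, the two settings involved can be sketched as follows; this assumes the default 8.x file layout (the per-collection element living under the <query> section is an assumption to verify against your config), with the 2048 value from the steps above:

```xml
<!-- solr.xml (node level): hard upper limit for every core in this JVM -->
<solr>
  <int name="maxBooleanClauses">${solr.max.booleanClauses:2048}</int>
</solr>

<!-- solrconfig.xml (per collection): soft limit; a value above the
     node-level limit is ignored with the WARN shown in the logs above -->
<query>
  <maxBooleanClauses>${solr.max.booleanClauses:2048}</maxBooleanClauses>
</query>
```

With both in place, restarting the nodes (and making sure the edited solr.xml is the one Solr actually loads, from ZK or SolrHome) should raise the effective limit.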
Queries on adding headers to solrj Request
Hi all,
We are planning to add security to Solr using . For this we are adding some information in the headers of each SolrJ request. These requests will be intercepted by an application (proxy) in the Solr VM and then routed to Solr (considering the Solr port as 8983). Could you please answer the queries below:

1. Are there any API paths that a Solr client cannot access and only Solr uses for intra-node communication?
2. As the SolrJ client will add headers, intra-node communication from Solr also needs to add these headers (like a ping request from Solr node 1 to Solr node 2). Could Solr add custom headers for intra-node communication?
3. Apart from port 8983, are there any other ports Solr is using for intra-node communication?
4. How to add headers to CloudSolrClient?

--
Best Regards,
Dinesh Naik
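For query 4, one hook that exists in SolrJ 7.x/8.x is HttpClientUtil.addRequestInterceptor, which registers an Apache HttpClient interceptor applied to every request sent by clients SolrJ builds, including CloudSolrClient. A minimal sketch, assuming solr-solrj is on the classpath; the header name X-My-Auth, its value, and the ZooKeeper address are hypothetical placeholders:

```java
import java.util.Collections;
import java.util.Optional;

import org.apache.http.HttpRequestInterceptor;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.impl.HttpClientUtil;

public class HeaderedClient {
    public static CloudSolrClient build() {
        // Runs for every outgoing SolrJ request and adds the custom header
        // the proxy is expected to inspect.
        HttpRequestInterceptor addAuthHeader =
                (request, context) -> request.addHeader("X-My-Auth", "token-value");
        HttpClientUtil.addRequestInterceptor(addAuthHeader);

        return new CloudSolrClient.Builder(
                        Collections.singletonList("zk1:2181"), Optional.empty())
                .build();
    }
}
```

Note this only covers client-to-Solr traffic; whether Solr's own intra-node requests pick up the same interceptor depends on the version and is exactly the open question in point 2.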
Re: Solr 7.6.0: PingRequestHandler - Changing the default query (*:*)
Hi Erick,
Each VM has 128 GB of physical memory.

On Mon, Aug 5, 2019, 8:38 PM Erick Erickson wrote:
> How much total physical memory on your machine? Lucene holds a lot of the
> index in MMapDirectory space. My starting point is to allocate no more than
> 50% of my physical memory to the Java heap. You're allocating 31G; if you
> don't have at _least_ 64G on these machines you're probably swapping.
>
> See:
> http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
>
> Best,
> Erick
>
> > On Aug 5, 2019, at 10:58 AM, dinesh naik wrote:
> >
> > Hi Shawn,
> > Yes, I am running Solr in cloud mode, and even after adding the params
> > rows=0 and distrib=false, the query response is more than 15 sec due to
> > the more-than-a-billion doc set. Also, the soft commit setting cannot be
> > changed to a higher number due to a requirement from the business team.
> >
> > http://hostname:8983/solr/parts/select?indent=on&q=*:*&rows=0&wt=json&distrib=false
> > always takes more than 10 sec.
> >
> > Here are the Java heap and G1GC settings I have:
> >
> > /usr/java/default/bin/java -server -Xmx31g -Xms31g -XX:+UseG1GC
> > -XX:MaxGCPauseMillis=250 -XX:ConcGCThreads=5
> > -XX:ParallelGCThreads=10 -XX:+UseLargePages -XX:+AggressiveOpts
> > -XX:+PerfDisableSharedMem -XX:+ParallelRefProcEnabled
> > -XX:InitiatingHeapOccupancyPercent=50 -XX:G1ReservePercent=18
> > -XX:MaxNewSize=6G -XX:PrintFLSStatistics=1
> > -XX:+PrintPromotionFailure -XX:+HeapDumpOnOutOfMemoryError
> > -XX:HeapDumpPath=/solr7/logs/heapdump
> > -verbose:gc -XX:+PrintHeapAtGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps
> > -XX:+PrintGCTimeStamps
> > -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime
> >
> > The JVM heap has never crossed 20GB in my setup; also, young G1GC timing
> > is well within milliseconds (in the range of 25-200 ms).
> >
> > On Mon, Aug 5, 2019 at 6:37 PM Shawn Heisey wrote:
> >
> >> On 8/4/2019 10:15 PM, dinesh naik wrote:
> >>> My question is regarding the custom query being used. Here I am
> >>> querying for the field _root_, which is available in all of my clusters
> >>> and defined as a string field. The result for _root_:abc might not get
> >>> me any match as well (I am OK with not finding any matches; the query
> >>> should not be taking 10-15 seconds for getting the response).
> >>
> >> Typically the *:* query is the fastest option. It is special syntax
> >> that means "all documents" and it usually executes very quickly. It
> >> will be faster than querying for a value in a specific field, which is
> >> what you have defined currently.
> >>
> >> I will typically add a "rows" parameter to the ping handler with a value
> >> of 1, so Solr will not be retrieving a large amount of data. If you are
> >> running Solr in cloud mode, you should experiment with setting the
> >> distrib parameter to false, which will hopefully limit the query to the
> >> receiving node only.
> >>
> >> Erick has already mentioned GC pauses as a potential problem. With a
> >> 10-15 second response time, I think that has high potential to be the
> >> underlying cause.
> >>
> >> The response you included at the beginning of the thread indicates there
> >> are 1.3 billion documents, which is going to require a fair amount of
> >> heap memory. If seeing such long ping times with a *:* query is
> >> something that happens frequently, your heap may be too small, which
> >> will cause frequent full garbage collections.
> >>
> >> The very low autoSoftCommit time can contribute to system load. I think
> >> it's very likely, especially with such a large index, that in many cases
> >> those automatic commits are taking far longer than 5 seconds to
> >> complete. If that's the case, you're not achieving a 5 second
> >> visibility interval and you are putting a lot of load on Solr, so I
> >> would consider increasing it.
> >>
> >> Thanks,
> >> Shawn
> >
> > --
> > Best Regards,
> > Dinesh Naik
Re: Solr 7.6.0: PingRequestHandler - Changing the default query (*:*)
Hi Shawn,
Yes, I am running Solr in cloud mode, and even after adding the params rows=0 and distrib=false, the query response is more than 15 sec due to the more-than-a-billion doc set. Also, the soft commit setting cannot be changed to a higher number due to a requirement from the business team.

http://hostname:8983/solr/parts/select?indent=on&q=*:*&rows=0&wt=json&distrib=false always takes more than 10 sec.

Here are the Java heap and G1GC settings I have:

/usr/java/default/bin/java -server -Xmx31g -Xms31g -XX:+UseG1GC
-XX:MaxGCPauseMillis=250 -XX:ConcGCThreads=5
-XX:ParallelGCThreads=10 -XX:+UseLargePages -XX:+AggressiveOpts
-XX:+PerfDisableSharedMem -XX:+ParallelRefProcEnabled
-XX:InitiatingHeapOccupancyPercent=50 -XX:G1ReservePercent=18
-XX:MaxNewSize=6G -XX:PrintFLSStatistics=1
-XX:+PrintPromotionFailure -XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/solr7/logs/heapdump
-verbose:gc -XX:+PrintHeapAtGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps
-XX:+PrintGCTimeStamps
-XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime

The JVM heap has never crossed 20GB in my setup; also, young G1GC timing is well within milliseconds (in the range of 25-200 ms).

On Mon, Aug 5, 2019 at 6:37 PM Shawn Heisey wrote:
> On 8/4/2019 10:15 PM, dinesh naik wrote:
> > My question is regarding the custom query being used. Here I am querying
> > for the field _root_, which is available in all of my clusters and
> > defined as a string field. The result for _root_:abc might not get me any
> > match as well (I am OK with not finding any matches; the query should not
> > be taking 10-15 seconds for getting the response).
>
> Typically the *:* query is the fastest option. It is special syntax
> that means "all documents" and it usually executes very quickly. It
> will be faster than querying for a value in a specific field, which is
> what you have defined currently.
>
> I will typically add a "rows" parameter to the ping handler with a value
> of 1, so Solr will not be retrieving a large amount of data. If you are
> running Solr in cloud mode, you should experiment with setting the
> distrib parameter to false, which will hopefully limit the query to the
> receiving node only.
>
> Erick has already mentioned GC pauses as a potential problem. With a
> 10-15 second response time, I think that has high potential to be the
> underlying cause.
>
> The response you included at the beginning of the thread indicates there
> are 1.3 billion documents, which is going to require a fair amount of
> heap memory. If seeing such long ping times with a *:* query is
> something that happens frequently, your heap may be too small, which
> will cause frequent full garbage collections.
>
> The very low autoSoftCommit time can contribute to system load. I think
> it's very likely, especially with such a large index, that in many cases
> those automatic commits are taking far longer than 5 seconds to
> complete. If that's the case, you're not achieving a 5 second
> visibility interval and you are putting a lot of load on Solr, so I
> would consider increasing it.
>
> Thanks,
> Shawn

--
Best Regards,
Dinesh Naik
Re: Solr 7.6.0: PingRequestHandler - Changing the default query (*:*)
Hi Nikolas,
The restart of the node is not helping; the node keeps trying to recover and always fails. Here is the log:

2019-07-31 06:10:08.049 INFO (coreZkRegister-1-thread-1-processing-n:replica_host:8983_solr x:parts_shard30_replica_n2697 c:parts s:shard30 r:core_node2698) x:parts_shard30_replica_n2697 o.a.s.c.ZkController Core needs to recover:parts_shard30_replica_n2697
2019-07-31 06:10:08.050 INFO (updateExecutor-3-thread-1-processing-n:replica_host:8983_solr x:parts_shard30_replica_n2697 c:parts s:shard30 r:core_node2698) x:parts_shard30_replica_n2697 o.a.s.u.DefaultSolrCoreState Running recovery
2019-07-31 06:10:08.056 INFO (recoveryExecutor-4-thread-1-processing-n:replica_host:8983_solr x:parts_shard30_replica_n2697 c:parts s:shard30 r:core_node2698) x:parts_shard30_replica_n2697 o.a.s.c.RecoveryStrategy Starting recovery process. recoveringAfterStartup=true
2019-07-31 06:10:08.261 INFO (recoveryExecutor-4-thread-1-processing-n:replica_host:8983_solr x:parts_shard30_replica_n2697 c:parts s:shard30 r:core_node2698) x:parts_shard30_replica_n2697 o.a.s.c.RecoveryStrategy startupVersions size=49956 range=[1640550593276674048 to 1640542396328443904]
2019-07-31 06:10:08.328 INFO (qtp689401025-58) o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/key params={omitHeader=true&wt=json} status=0 QTime=0
2019-07-31 06:10:09.276 INFO (recoveryExecutor-4-thread-1-processing-n:replica_host:8983_solr x:parts_shard30_replica_n2697 c:parts s:shard30 r:core_node2698) x:parts_shard30_replica_n2697 o.a.s.c.RecoveryStrategy Failed to connect leader http://hostname:8983/solr on recovery, try again

The ping request query is being called from Solr itself and not via some script, so there is no way to stop it.

Code where the time is hardcoded to 1 sec:

try (HttpSolrClient httpSolrClient = new HttpSolrClient.Builder(leaderReplica.getCoreUrl())
    .withSocketTimeout(1000)
    .withConnectionTimeout(1000)
    .withHttpClient(cc.getUpdateShardHandler().getRecoveryOnlyHttpClient())
    .build()) {
  SolrPingResponse resp = httpSolrClient.ping();
  return leaderReplica;
} catch (IOException e) {
  log.info("Failed to connect leader {} on recovery, try again", leaderReplica.getBaseUrl());
  Thread.sleep(500);
} catch (Exception e) {
  if (e.getCause() instanceof IOException) {
    log.info("Failed to connect leader {} on recovery, try again", leaderReplica.getBaseUrl());
    Thread.sleep(500);
  } else {
    return leaderReplica;
  }
}

On Mon, Aug 5, 2019 at 1:19 PM Nicolas Franck wrote:
> If the ping request handler is taking too long,
> and the server is not recovering automatically,
> there is not much you can do automatically on that server.
> You have to intervene manually, and restart Solr on that node.
>
> First of all: the ping is just an internal check. If it takes too long
> to respond, the requester (i.e. the script calling it) should stop
> the request, and mark that node as problematic. If there are,
> for example, memory problems, every subsequent request will only enhance
> the problem, and Solr cannot recover from that.
>
> > On 5 Aug 2019, at 06:15, dinesh naik wrote:
> >
> > Thanks Jörn, Erick and Furkan.
> >
> > I have already defined the ping request handler in solrconfig.xml as
> > below:
> > <lst name="invariants">
> >   <str name="qt">/select</str>
> >   <str name="q">_root_:abc</str>
> > </lst>
> >
> > My question is regarding the custom query being used. Here I am querying
> > for the field _root_, which is available in all of my clusters and
> > defined as a string field. The result for _root_:abc might not get me any
> > match as well (I am OK with not finding any matches; the query should not
> > be taking 10-15 seconds for getting the response).
> >
> > If the response comes within 1 second, then the core recovery issue is
> > solved; hence I need your suggestion on whether using the _root_ field in
> > the custom query is fine.
> >
> > On Mon, Aug 5, 2019 at 2:49 AM Furkan KAMACI wrote:
> >
> >> Hi,
> >>
> >> You can change invariants, i.e. *qt* and *q*, of a *PingRequestHandler*:
> >>
> >> <lst name="invariants">
> >>   <str name="qt">/search</str>
> >>   <str name="q">some test query</str>
> >> </lst>
> >>
> >> Check the documentation for more info:
> >>
> >> https://lucene.apache.org/solr/7_6_0//solr-core/org/apache/solr/handler/PingRequestHandler.html
> >>
> >> Kind Regards,
> >> Furkan KAMACI
> >>
> >> On Sat, Aug 3, 2019 at 4:17 PM Erick Erickson wrote:
> >>
> >>> You can also (I think) explicitly define the ping request handler in
> >>> solrconfig.xml to do something else.
Re: Solr 7.6.0: PingRequestHandler - Changing the default query (*:*)
Thanks Jörn, Erick and Furkan.

I have already defined the ping request handler in solrconfig.xml as below:

<lst name="invariants">
  <str name="qt">/select</str>
  <str name="q">_root_:abc</str>
</lst>

My question is regarding the custom query being used. Here I am querying for the field _root_, which is available in all of my clusters and defined as a string field. The result for _root_:abc might not get me any match as well (I am OK with not finding any matches; the query should not be taking 10-15 seconds for getting the response).

If the response comes within 1 second, then the core recovery issue is solved; hence I need your suggestion on whether using the _root_ field in the custom query is fine.

On Mon, Aug 5, 2019 at 2:49 AM Furkan KAMACI wrote:
> Hi,
>
> You can change invariants, i.e. *qt* and *q*, of a *PingRequestHandler*:
>
> <lst name="invariants">
>   <str name="qt">/search</str>
>   <str name="q">some test query</str>
> </lst>
>
> Check the documentation for more info:
>
> https://lucene.apache.org/solr/7_6_0//solr-core/org/apache/solr/handler/PingRequestHandler.html
>
> Kind Regards,
> Furkan KAMACI
>
> On Sat, Aug 3, 2019 at 4:17 PM Erick Erickson wrote:
>
> > You can also (I think) explicitly define the ping request handler in
> > solrconfig.xml to do something else.
> >
> > > On Aug 2, 2019, at 9:50 AM, Jörn Franke wrote:
> > >
> > > Not sure if this is possible, but why not create a query handler in
> > > Solr with any custom query and use that as a ping replacement?
> > >
> > >> On 02.08.2019 at 15:48, dinesh naik wrote:
> > >>
> > >> Hi all,
> > >> I have a few clusters with a huge data set, and whenever a node goes
> > >> down it's not able to recover due to the below reasons:
> > >>
> > >> 1. The ping request handler is taking more than 10-15 seconds to
> > >> respond. The ping request handler, however, expects it will return in
> > >> less than 1 second and fails a recovery request if it is not responded
> > >> to in this time. Therefore recoveries never would start.
> > >>
> > >> 2. The soft commit is very low, i.e. 5 sec. This is a business
> > >> requirement, so not much can be done here.
> > >>
> > >> As the standard/default admin/ping request handler is using *:*
> > >> queries, the response time is much higher, and I am looking for an
> > >> option to change the same so that the ping handler returns the results
> > >> within a few milliseconds.
> > >>
> > >> Here is an example of the standard query time:
> > >>
> > >> snip---
> > >> curl "http://hostname:8983/solr/parts/select?indent=on&q=*:*&rows=0&wt=json&distrib=false&debug=timing"
> > >> {
> > >>   "responseHeader":{
> > >>     "zkConnected":true,
> > >>     "status":0,
> > >>     "QTime":16620,
> > >>     "params":{
> > >>       "q":"*:*", "distrib":"false", "debug":"timing",
> > >>       "indent":"on", "rows":"0", "wt":"json"}},
> > >>   "response":{"numFound":1329638799,"start":0,"docs":[]},
> > >>   "debug":{
> > >>     "timing":{
> > >>       "time":16620.0,
> > >>       "prepare":{
> > >>         "time":0.0,
> > >>         "query":{"time":0.0}, "facet":{"time":0.0},
> > >>         "facet_module":{"time":0.0}, "mlt":{"time":0.0},
> > >>         "highlight":{"time":0.0}, "stats":{"time":0.0},
> > >>         "expand":{"time":0.0}, "terms":{"time":0.0},
> > >>         "block-expensive-queries":{"time":0.0},
> > >>         "slow-query-logger":{"time":0.0},
> > >>         "debug"
Solr 7.6.0: PingRequestHandler - Changing the default query (*:*)
Hi all,
I have a few clusters with a huge data set, and whenever a node goes down it's not able to recover due to the below reasons:

1. The ping request handler is taking more than 10-15 seconds to respond. The ping request handler, however, expects it will return in less than 1 second and fails a recovery request if it is not responded to in this time. Therefore recoveries never would start.

2. The soft commit is very low, i.e. 5 sec. This is a business requirement, so not much can be done here.

As the standard/default admin/ping request handler is using *:* queries, the response time is much higher, and I am looking for an option to change the same so that the ping handler returns the results within a few milliseconds.

Here is an example of the standard query time:

snip---
curl "http://hostname:8983/solr/parts/select?indent=on&q=*:*&rows=0&wt=json&distrib=false&debug=timing"
{
  "responseHeader":{
    "zkConnected":true,
    "status":0,
    "QTime":16620,
    "params":{
      "q":"*:*", "distrib":"false", "debug":"timing",
      "indent":"on", "rows":"0", "wt":"json"}},
  "response":{"numFound":1329638799,"start":0,"docs":[]},
  "debug":{
    "timing":{
      "time":16620.0,
      "prepare":{
        "time":0.0,
        "query":{"time":0.0}, "facet":{"time":0.0},
        "facet_module":{"time":0.0}, "mlt":{"time":0.0},
        "highlight":{"time":0.0}, "stats":{"time":0.0},
        "expand":{"time":0.0}, "terms":{"time":0.0},
        "block-expensive-queries":{"time":0.0},
        "slow-query-logger":{"time":0.0},
        "debug":{"time":0.0}},
      "process":{
        "time":16619.0,
        "query":{"time":16619.0}, "facet":{"time":0.0},
        "facet_module":{"time":0.0}, "mlt":{"time":0.0},
        "highlight":{"time":0.0}, "stats":{"time":0.0},
        "expand":{"time":0.0}, "terms":{"time":0.0},
        "block-expensive-queries":{"time":0.0},
        "slow-query-logger":{"time":0.0},
        "debug":{"time":0.0}
snap

Can we use the query _root_:abc in the ping request handler? I tried this query and it's returning the results within a few milliseconds, and the nodes are also able to recover without any issue.

We want to use the _root_ field for querying as this field is available in all our clusters with the below definition:

Could you please let me know if using _root_ for querying in the pingRequestHandler will cause any problem?

<lst name="invariants">
  <str name="qt">/select</str>
  <str name="q">_root_:abc</str>
</lst>

--
Best Regards,
Dinesh Naik
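Putting the pieces above together, such a handler definition might look like the following sketch; the rows and distrib invariants follow the suggestions made elsewhere in this thread and are optional additions, not part of the original proposal:

```xml
<!-- Cheap ping: a term query on _root_ (assumed to be an indexed string
     field in every collection), fetching no documents and staying local -->
<requestHandler name="/admin/ping" class="solr.PingRequestHandler">
  <lst name="invariants">
    <str name="qt">/select</str>
    <str name="q">_root_:abc</str>
    <str name="rows">0</str>
    <str name="distrib">false</str>
  </lst>
</requestHandler>
```

A single-term lookup like this avoids the full-index match counting that makes *:* expensive on a billion-document core, which is why it returns in milliseconds.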
SSL in Solr 7.6.0
Hi all,
I am working on securing Solr and client communication by implementing SSL for a multi-node cluster (100+). The clients are connecting to Solr via CloudSolrClient through ZooKeeper, and I am looking for the best way to create the certificate for making the connection secured.

For a cluster of size 100 and plus, it becomes hard to have all the hostnames/IPs while generating the certificate, and the wildcard option is ruled out due to security concerns. So what is the best way to handle this scenario?

Also, could you shed some light on the usage of the SOLR_SSL_CHECK_PEER_NAME param and whether it will help in any way?

--
Best Regards,
Dinesh Naik
Re: Integrate nutch with solr
Thanks Shawn for the reply. Yes, I do have some questions on the Solr side too. Can you please share the steps on the Solr side to integrate Nutch, or are no steps needed in Solr?

On Thu, Oct 18, 2018 at 8:35 PM Shawn Heisey wrote:
> On 10/18/2018 12:35 PM, Dinesh Sundaram wrote:
> > Can you please share the steps to integrate nutch 2.3.1 with solrcloud
> > 7.1.0.
>
> You will need to speak to the Nutch project about how to configure their
> software to interact with Solr. If you have questions about Solr
> itself, we can answer those.
>
> http://nutch.apache.org/mailing_lists.html
>
> Thanks,
> Shawn
Integrate nutch with solr
Hi Team,
Can you please share the steps to integrate Nutch 2.3.1 with SolrCloud 7.1.0?

Thanks,
Dinesh Sundaram
solr allow read permission to anonymous/guest user
Hi,
Is there any option to allow read permissions to an anonymous/guest user? I am expecting to be prompted for credentials only for update or delete operations.

Thanks,
Dinesh Sundaram
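One approach the BasicAuthPlugin supports is setting blockUnknown to false, so unauthenticated requests fall through to the RuleBasedAuthorizationPlugin, and then attaching roles only to the mutating permissions. A sketch of such a security.json; the credentials hash is elided and the exact permission names should be checked against the ref guide for your version:

```json
{
  "authentication": {
    "class": "solr.BasicAuthPlugin",
    "blockUnknown": false,
    "credentials": { "solr": "<hash> <salt>" }
  },
  "authorization": {
    "class": "solr.RuleBasedAuthorizationPlugin",
    "permissions": [
      { "name": "update", "role": "admin" },
      { "name": "security-edit", "role": "admin" }
    ],
    "user-role": { "solr": "admin" }
  }
}
```

With this shape, reads hit no role-protected permission and pass through anonymously, while updates and security changes prompt for the admin credentials.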
Re: solr basic authentication
Thanks Chris for your help. I tried that solution but nothing is working out. It is not accepting the credentials; maybe I'm trying with the wrong base64 algorithm.

On Thu, Jun 21, 2018 at 12:25 PM, Christopher Schultz <ch...@christopherschultz.net> wrote:
> Dinesh,
>
> On 6/21/18 11:40 AM, Dinesh Sundaram wrote:
> > is there any way to disable basic authentication for a particular
> > domain? i have a proxy pass from a domain to solr which is always asking
> > for credentials, so i wanted to disable basic auth only for that domain.
> > is there any way?
>
> I wouldn't recommend this, in general, because it's not really all that
> secure, but since you have a reverse-proxy in between the client and
> Solr, why not have the proxy provide the HTTP BASIC authentication
> information to Solr?
>
> That may be a more straightforward solution.
>
> -chris
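If the mismatch is in building the Authorization header itself (e.g. for the proxy to inject), HTTP Basic is plain base64 over "user:password" with no salting or hashing. A self-contained check, using the stock solr/SolrRocks example pair from the Solr docs rather than real credentials:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class BasicAuthHeader {
    // Value of the "Authorization" header for HTTP Basic auth:
    // the literal "Basic " followed by base64("user:password").
    public static String header(String user, String password) {
        String pair = user + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(pair.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        System.out.println(header("solr", "SolrRocks"));
        // prints: Basic c29scjpTb2xyUm9ja3M=
    }
}
```

That printed value is exactly what curl sends for -u solr:SolrRocks, so comparing it against what the proxy forwards is a quick way to rule the encoding in or out.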
Re: Solr basic auth
Yes, thanks. I would like to whitelist one domain from the basic auth login.

On Thu, Jun 21, 2018 at 12:57 PM, Jan Høydahl wrote:
> Hi,
>
> As I said, there is no way to combine multiple authentication plugins at
> the moment. So your best shot is probably to create your own
> CustomAuthPlugin where you implement the logic that you need. You can fork
> the code from BasicAuth and add the logic you need to whitelist the
> requests you need.
>
> It is very hard to understand from your initial email how Solr should see
> the difference between a request to "the solr URL directly" vs requests
> done indirectly, whatever that would mean. From Solr's standpoint ALL
> requests are done to Solr directly :-) I suppose you mean that if a
> request originates from a particular frontend server's IP address then it
> should be whitelisted?
>
> You could also suggest in a new JIRA issue to extend Solr's auth feature
> to allow a chain of AuthPlugins, and if the request passes any of them it
> is let through.
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
>
> > On 21 Jun 2018 at 17:47, Dinesh Sundaram wrote:
> >
> > thanks for your valuable feedback. I really want to allow this domain
> > without any credentials. i need basic auth only if anyone accesses the
> > solr url directly. so no option in solr to do that?
> >
> > On Sun, Jun 17, 2018 at 4:18 PM, Jan Høydahl wrote:
> >
> >> Of course, but Dinesh explicitly set blockUnknown=true below, so in
> >> this case ALL requests must have credentials. There is currently no
> >> feature that lets Solr accept any request by other rules; all requests
> >> are forwarded to the chosen authentication plugin.
> >>
> >> --
> >> Jan Høydahl, search solution architect
> >> Cominvent AS - www.cominvent.com
> >>
> >>> On 15 Jun 2018 at 19:12, Terry Steichen wrote:
> >>>
> >>> "When authentication is enabled ALL requests must carry valid
> >>> credentials." I believe this behavior depends on the value you set
> >>> for the *blockUnknown* authentication parameter.
> >>>
> >>> On 06/15/2018 06:25 AM, Jan Høydahl wrote:
> >>>> When authentication is enabled ALL requests must carry valid
> >>>> credentials.
> >>>>
> >>>> Are you asking for a feature where a request is authenticated based
> >>>> on the IP address of the client, not username/password?
> >>>>
> >>>> Jan
> >>>>
> >>>> Sent from my iPhone
> >>>>
> >>>>> On 14 Jun 2018 at 22:24, Dinesh Sundaram <sdineshros...@gmail.com> wrote:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> I have configured basic auth for solrcloud. it works well when i
> >>>>> access the solr url directly. i have integrated this solr with the
> >>>>> test.com domain. now if I access the solr url like test.com/solr it
> >>>>> prompts for the credentials, but I don't want it to ask this time
> >>>>> since it is a known domain. is there any way to achieve this? much
> >>>>> appreciate your quick response.
> >>>>>
> >>>>> my security json below. i'm using the default security, and want to
> >>>>> allow my domain by default without prompting any credentials.
> >>>>>
> >>>>> {"authentication":{
> >>>>>   "blockUnknown": true,
> >>>>>   "class":"solr.BasicAuthPlugin",
> >>>>>   "credentials":{"solr":"IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0= Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c="}
> >>>>> },"authorization":{
> >>>>>   "class":"solr.RuleBasedAuthorizationPlugin",
> >>>>>   "permissions":[{"name":"security-edit","role":"admin"}],
> >>>>>   "user-role":{"solr":"admin"}
> >>>>> }}
Re: Solr basic auth
Thanks for your valuable feedback. I really want to allow this domain without any credentials. I need basic auth only if anyone accesses the Solr URL directly. So there is no option in Solr to do that?

On Sun, Jun 17, 2018 at 4:18 PM, Jan Høydahl wrote:
> Of course, but Dinesh explicitly set blockUnknown=true below, so in this
> case ALL requests must have credentials. There is currently no feature
> that lets Solr accept any request by other rules; all requests are
> forwarded to the chosen authentication plugin.
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
>
> > On 15 Jun 2018 at 19:12, Terry Steichen wrote:
> >
> > "When authentication is enabled ALL requests must carry valid
> > credentials." I believe this behavior depends on the value you set for
> > the *blockUnknown* authentication parameter.
> >
> > On 06/15/2018 06:25 AM, Jan Høydahl wrote:
> >> When authentication is enabled ALL requests must carry valid
> >> credentials.
> >>
> >> Are you asking for a feature where a request is authenticated based on
> >> the IP address of the client, not username/password?
> >>
> >> Jan
> >>
> >> Sent from my iPhone
> >>
> >>> On 14 Jun 2018 at 22:24, Dinesh Sundaram wrote:
> >>>
> >>> Hi,
> >>>
> >>> I have configured basic auth for solrcloud. it works well when i
> >>> access the solr url directly. i have integrated this solr with the
> >>> test.com domain. now if I access the solr url like test.com/solr it
> >>> prompts for the credentials, but I don't want it to ask this time
> >>> since it is a known domain. is there any way to achieve this? much
> >>> appreciate your quick response.
> >>>
> >>> my security json below. i'm using the default security, and want to
> >>> allow my domain by default without prompting any credentials.
> >>>
> >>> {"authentication":{
> >>>   "blockUnknown": true,
> >>>   "class":"solr.BasicAuthPlugin",
> >>>   "credentials":{"solr":"IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0= Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c="}
> >>> },"authorization":{
> >>>   "class":"solr.RuleBasedAuthorizationPlugin",
> >>>   "permissions":[{"name":"security-edit","role":"admin"}],
> >>>   "user-role":{"solr":"admin"}
> >>> }}
solr basic authentication
Hi, is there any way to disable basic authentication for a particular domain? I have a proxy pass from a domain to Solr which always asks for credentials, so I wanted to disable basic auth only for that domain. Is there any way? Thanks, Dinesh Sundaram.
Solr basic auth
Hi, I have configured basic auth for SolrCloud. It works well when I access the Solr URL directly. I have integrated this Solr with the test.com domain. Now if I access the Solr URL like test.com/solr it prompts for credentials, but I don't want that this time since it is a known domain. Is there any way to achieve this? Much appreciate your quick response. My security.json is below. I'm using the default security and want to allow my domain by default without prompting for any credentials. {"authentication":{ "blockUnknown": true, "class":"solr.BasicAuthPlugin", "credentials":{"solr":"IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0= Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c="} },"authorization":{ "class":"solr.RuleBasedAuthorizationPlugin", "permissions":[{"name":"security-edit", "role":"admin"}], "user-role":{"solr":"admin"} }}
solr search connectivity error with shards
Hi, I have a domain called test.com that processes Solr search queries with 2 Solr instances as below. The second URL [2] works fine, but I don't want to expose the localhost and port number to the end user, so I tried to configure it like the first URL [1], but it is not working. I have a proxy pass in the test.com Apache to redirect /select to /solr. Not sure why the first URL [1] tries to make the call from my Solr server and ends up with a connection timeout error, since test.com is not in the same DMZ network. Is there any way to achieve this externalizer implementation? NOTE: all the shard URLs work individually; when they are combined, a network exception is thrown, because the first query reaches the target server but the second query originates from the target server, so the connection is broken. [1] https://test.com/select/mediator?fl=id,locale,headline,short_description,record,title,description,url&q=(locale:en-us+OR+page_locale:en-us)+AND+((offerSearch:*test*)+OR+(content:test))&rows=8&shards=https://test.com/select/global?fl=id,https://test.com/select/test-global?fl=id&sort=start_time+asc&start=0&wt=json [2] https://test.com/select/mediator?fl=id,locale,headline,short_description,record,title,description,url&q=(locale:en-us+OR+page_locale:en-us)+AND+((offerSearch:*test*)+OR+(content:test))&rows=8&shards=https://localhost:8983/solr/global/select?q=*:*,https://localhost:8983/global/select?q=*:*&sort=start_time+asc&start=0&wt=json Caused by: org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: https://test.com:443/select/test-global at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:640) at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:253) at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:242) at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1219) at org.apache.solr.handler.component.HttpShardHandler.lambda$submit$0(HttpShardHandler.java:172) at 
java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176) at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) Caused by: java.net.SocketTimeoutException: connect timed out at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:589)
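One note on the shards parameter used above: each entry is normally a bare shard address (host:port/solr/corename), not a full /select URL carrying its own query string, which may be part of what breaks here. A small illustrative sketch of building such a request URL (hostnames, core names, and the /solr/mediator path are placeholders from this thread, not a verified setup):

```python
from urllib.parse import urlencode

# Bare shard addresses: host:port/path/core entries with no /select suffix
# and no per-shard query string.
shards = ",".join([
    "localhost:8983/solr/global",
    "localhost:8983/solr/test-global",
])

# The shards list is passed as a single (url-encoded) parameter of the
# main query; Solr fans the query out to each listed shard itself.
params = urlencode({
    "q": "content:test",
    "shards": shards,
    "rows": 8,
    "wt": "json",
})

url = "https://test.com/solr/mediator/select?" + params
print(url)
```

Note that the proxy in front (test.com) only needs to expose the aggregating core; the shard addresses are resolved by the Solr node itself, so they must be reachable from that node, which is consistent with the timeout seen in this thread.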
RE: Replicate configoverlay.json
Well, I have mixed cloud and master/slave concepts since Solr allows it. Is there any way to replicate dynamic configurations to a slave without ZooKeeper? --Dinesh Sundaram -Original Message- From: Shawn Heisey [mailto:apa...@elyograg.org] Sent: Thursday, March 8, 2018 10:23 AM To: solr-user@lucene.apache.org Subject: Re: Replicate configoverlay.json On 3/8/2018 8:48 AM, Sundaram, Dinesh wrote: > Thanks Shawn for checking this. configoverlay.json is not available under > /conf. Actually it is a dynamic file which is available in the zookeeper log.1 > binary file. So whenever we do a config update via the API it gets saved > directly in the zookeeper log.1 binary file. Is there any way to replicate to > slaves if any update happens to this file? If you're running zookeeper, then you're running SolrCloud. And if you're running SolrCloud, then you cannot (or at least SHOULD NOT) be using master/slave replication to keep things in sync. With collections in SolrCloud, any changes to your config with the config API should take effect on all replicas as quickly as Solr can get them reloaded. This is because the configuration is not on disk, it's in zookeeper, so all replicas should be using exactly the same config. Thanks, Shawn CONFIDENTIALITY NOTICE This e-mail message and any attachments are only for the use of the intended recipient and may contain information that is privileged, confidential or exempt from disclosure under applicable law. If you are not the intended recipient, any disclosure, distribution or other use of this e-mail message or attachments is prohibited. If you have received this e-mail message in error, please delete and notify the sender immediately. Thank you.
RE: Replicate configoverlay.json
Thanks Shawn for checking this. configoverlay.json is not available under /conf. Actually it is a dynamic file which lives in the zookeeper log.1 binary file, so whenever we do a config update via the API it gets saved directly in that binary file. Is there any way to replicate this file to slaves if any update happens to it? --Dinesh Sundaram -Original Message- From: Shawn Heisey [mailto:apa...@elyograg.org] Sent: Wednesday, March 7, 2018 6:09 PM To: solr-user@lucene.apache.org Subject: Re: Replicate configoverlay.json On 3/6/2018 10:50 AM, Sundaram, Dinesh wrote: > Can you please share the steps to replicate configoverlay.json from > Master to Slave… in other words, how do we replicate from Master to > Slave if any configuration updated via API. If that file is in the same place as solrconfig.xml, then you would add it to the "confFiles" parameter in the master replication config. If it gets saved somewhere else, then I don't know if it would be possible. I've never used the config overlay, but it sounds like it probably gets saved in the conf directory along with the rest of the config files. https://lucene.apache.org/solr/guide/6_6/index-replication.html#IndexReplication-ConfiguringtheReplicationRequestHandleronaMasterServer Thanks, Shawn
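For a standalone (non-cloud) master, the confFiles parameter Shawn mentions would look roughly like the sketch below in the master's solrconfig.xml. Listing configoverlay.json here is only an assumption that it sits in the core's conf/ directory alongside solrconfig.xml, which is exactly the open question in this thread:

```xml
<!-- Sketch of a master-side replication handler with confFiles; the
     inclusion of configoverlay.json assumes it lives under conf/. -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
    <str name="confFiles">solrconfig.xml,managed-schema,configoverlay.json</str>
  </lst>
</requestHandler>
```

As Shawn notes, if the overlay lives only in ZooKeeper there is no conf/ file to list, and master/slave replication cannot carry it at all.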
Replicate configoverlay.json
Team, can you please share the steps to replicate configoverlay.json from Master to Slave... in other words, how do we replicate from Master to Slave if any configuration is updated via the API? Dinesh Sundaram MBS Platform Engineering Mastercard
RE: SSL configuration with Master/Slave
FYI, this has been resolved. Dinesh Sundaram MBS Platform Engineering Mastercard From: Sundaram, Dinesh Sent: Monday, January 8, 2018 1:58 PM To: solr-user Subject: SSL configuration with Master/Slave Team, I'm facing an SSL issue while configuring Master/Slave. The master runs fine alone with SSL and the slave runs fine alone with SSL, but I'm getting an SSL exception during the sync-up. It gives the below error. I believe we need to trust the target server at the source. Can you give me the steps to allow inbound calls at the source JVM? FYI, the same sync-up works fine via http. 2018-01-08 13:57:06.735 WARN (qtp33524623-16) [c:dm-global s:shard1 r:core_node2 x:dm-global_shard1_replica_n1] o.a.s.h.ReplicationHandler Exception while invoking 'details' method for replication on master org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: https://test21.mastercard.int:8983/solr/dm-global_shard1_replica_n1 at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:640) at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:253) at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:242) at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1219) at org.apache.solr.handler.IndexFetcher.getDetails(IndexFetcher.java:1823) at org.apache.solr.handler.ReplicationHandler.getReplicationDetails(ReplicationHandler.java:954) at org.apache.solr.handler.ReplicationHandler.handleRequestBody(ReplicationHandler.java:332) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:177) at org.apache.solr.core.SolrCore.execute(SolrCore.java:2484) at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:720) at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:526) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:382) at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:326) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1751) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143) at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548) at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180) at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512) at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213) at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134) at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134) at org.eclipse.jetty.server.Server.handle(Server.java:534) at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320) at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251) at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283) at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108) at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93) at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303) at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148) at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671) at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589) at java.lang.Thread.run(Thread.java:745) Caused by: javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target at sun.security.ssl.Alerts.getSSLException(Alerts.java:
SSL configuration with Master/Slave
rverCertificate(ClientHandshaker.java:1509) at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:216) at sun.security.ssl.Handshaker.processLoop(Handshaker.java:979) at sun.security.ssl.Handshaker.process_record(Handshaker.java:914) at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1062) at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1375) at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1403) at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1387) at org.apache.http.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(SSLConnectionSocketFactory.java:396) at org.apache.http.conn.ssl.SSLConnectionSocketFactory.connectSocket(SSLConnectionSocketFactory.java:355) at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:142) at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:359) at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:381) at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:237) at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185) at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89) at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111) at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185) at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83) at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56) at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:525) ... 
39 more Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:387) at sun.security.validator.PKIXValidator.engineValidate(PKIXValidator.java:292) at sun.security.validator.Validator.validate(Validator.java:260) at sun.security.ssl.X509TrustManagerImpl.validate(X509TrustManagerImpl.java:324) at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:229) at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:124) at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1491) ... 59 more Caused by: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target at sun.security.provider.certpath.SunCertPathBuilder.build(SunCertPathBuilder.java:141) at sun.security.provider.certpath.SunCertPathBuilder.engineBuild(SunCertPathBuilder.java:126) at java.security.cert.CertPathBuilder.build(CertPathBuilder.java:280) at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:382) ... 65 more Dinesh Sundaram MBS Platform Engineering Mastercard
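The PKIX "unable to find valid certification path" error above generally means the JVM making the replication call does not trust the master's certificate. A hedged sketch of one common fix, importing the master's certificate into a truststore with the JDK's keytool (file names, alias, and password are placeholders):

```
# Obtain the master's certificate as master.crt, then on the slave:
keytool -importcert -alias solr-master -file master.crt \
        -keystore solr-ssl.truststore.jks -storepass secret
```

Solr would then need to be pointed at that truststore, for example via the SOLR_SSL_TRUST_STORE / SOLR_SSL_TRUST_STORE_PASSWORD settings in solr.in.sh.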
RE: Solrcloud with Master/Slave
Thanks Shawn and Erick. I guess now we are on the same track. So two independent solrcloud nodes are allowed to sync up via the master/slave method without referring to any external/embedded ZooKeepers. I need to use -cloud in the command while starting Solr; otherwise I'm not able to see the admin console. That console is really cool for tracking Solr activities. Dinesh Sundaram MBS Platform Engineering Mastercard -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Friday, January 5, 2018 10:58 AM To: solr-user Subject: Re: Solrcloud with Master/Slave One slight correction. Solr will run perfectly fine with a single ZooKeeper. The recommendation for 3 is that running with a single ZooKeeper creates a single point of failure, i.e. if that node goes down for any reason your Solr cluster won't be able to update anything at all. You can still query, maybe, for a while. Two ZooKeepers will also run, but as Shawn says that's essentially totally wasting one of them as it doesn't buy you anything and makes your system _less_ robust. FWIW, Erick On Fri, Jan 5, 2018 at 7:57 AM, Shawn Heisey wrote: > On 1/4/2018 9:01 AM, Sundaram, Dinesh wrote: > > Thanks Shawn for your prompt response. Assume I have solrcloud A > > server > with 1 node runs on 8983 port and solrcloud B server with 1 node runs > on 8983, here I want to synch up the collection between solrcloud A > and B using the below replication handler. Is this advisable to use at > the solrcloud B ? > > > > > > > > > name="masterUrl">http://solrcloudA:8983/solr/${solr.core.name}/replication > > 00:00:20 > > > > > > One of the things I said in my last reply, at the beginning of a > paragraph so it should have been quite prominent, was "you can't mix > master-slave replication and SolrCloud." What part of that was not clear? > > You need to be running standalone mode (not cloud) if you want to use > master-slave replication. > > When things are set up correctly, SolrCloud will automatically keep > multiple replicas in sync, and copy the index to new replicas when > they are created. There is no need to manage it with replication config. > For replicating from one SolrCloud cluster to another, there is CDCR > as Erick described. > > Another thing Erick mentioned: What you actually have when you start > Solr the way you did is two completely independent SolrCloud clusters, > each of which only has one Solr server. Each solr instance is running > a zookeeper server embedded within it. There is no redundancy or > fault tolerance of any kind. > > If you want to run a fault-tolerant SolrCloud, you will need three > separate servers. The smallest possible setup would have both Solr > and ZooKeeper running on two of those servers (as separate processes). > The Solr instances would be started with a -z option (or the ZKHOST > environment variable) to locate the three ZK servers, and without the > -cloud option. The third server, which can be a much smaller system, > would only run ZooKeeper. You may also need a load balancer, > depending on what software your clients are using. > > The requirement of three servers comes from ZooKeeper, not Solr. A > two-server ZK ensemble is actually *less* reliable than a single > server, so it's not recommended. I don't know if they even allow such > a setup to work. 
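The three-server layout Shawn describes might be started roughly like this (hostnames and ports are placeholders, and this assumes an external ZooKeeper ensemble is already running on zk1/zk2/zk3):

```
# On each of the two Solr servers: point Solr at the external ensemble
# instead of letting it spawn an embedded ZooKeeper.
bin/solr start -p 8983 -z zk1:2181,zk2:2181,zk3:2181
```

With -z pointing at a shared ensemble the node runs in SolrCloud mode against that ensemble, so the admin UI's "Cloud" view is still available, which addresses the admin-console concern raised in this thread.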
> > Thanks, > Shawn > >
RE: Solrcloud with Master/Slave
Ok, thanks for your valuable reply. I want to see the admin console so that I can monitor the collection details; that is the reason for going to cloud mode. But here I need replication without ZooKeeper, so I had to choose regular master/slave replication. Am I mixing 2 different sync-up procedures, or is this also okay? Dinesh Sundaram MBS Platform Engineering Mastercard -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Thursday, January 4, 2018 2:06 PM To: solr-user Subject: Re: Solrcloud with Master/Slave Yes you do use ZooKeeper. Starting Solr with the -cloud option but _without_ ZK_HOST defined (or the -z parameter) starts an internal ZooKeeper on port 9983 (by default). This is evidenced by the fact that the admin UI has a "cloud" link along the left. In essence you have two separate clusters, each cluster just happens to exist on the same machine. Why bother with SolrCloud? Just configure old-style master/slave. SolrCloud is buying you nothing and running internal ZooKeepers is consuming resources for no good purpose. SolrCloud would help you if you set up a proper cluster with ZooKeeper and just had both of your nodes in the same cluster, one with replicas. That buys you HA/DR, NRT on both leader and follower etc. Up to you of course, but it's really hard to see what the purpose of running the way you are is. Best, Erick On Thu, Jan 4, 2018 at 11:38 AM, Sundaram, Dinesh < dinesh.sunda...@mastercard.com> wrote: > I want to keep both collections in sync always. This is really working > fine without any issue so far. My problem is pretty straight forward. > > I'm starting two solr instances on two servers using the below > command. I believe this command is for solrcloud mode. If so then I > have that shared replication handler config in my > _default/solrconfig.xml on one instance so that the slave instance > will synch with master. I don’t use zookeeper at all. Just replication > handler setting in solrconfig.xml. 
is this good for the long term? If not please > help me understand the issues. > > bin/solr start -cloud -p 8983 -noprompt > > > > Dinesh Sundaram > MBS Platform Engineering > > Mastercard > > > > -Original Message- > From: Erick Erickson [mailto:erickerick...@gmail.com] > Sent: Thursday, January 4, 2018 10:10 AM > To: solr-user > Subject: Re: Solrcloud with Master/Slave > > Whoa. I don't think you should be doing this at all. This really > appears to be an XY problem. You're asking "how to do X" without > telling us what the problem you're trying to solve is (the Y). _Why_ > do you want to set things up this way? A one-time synchronization or > to keep both collections in sync? > > > Cross Data Center Replication (CDCR) is designed to keep two separate > collections in sync on an ongoing basis. > > If this is a one-time deal, you can manually issue a replication API > "fetchindex" command. What I'd do in that case is set up your > collection B with each shard having exactly one replica (i.e. a leader > and no followers). Do the fetch and verify that your new collection is > as you want it then ADDREPLICA to build out your redundancy. > > Best, > Erick > > On Thu, Jan 4, 2018 at 8:01 AM, Sundaram, Dinesh < > dinesh.sunda...@mastercard.com> wrote: > > Thanks Shawn for your prompt response. Assume I have solrcloud A > > server > with 1 node runs on 8983 port and solrcloud B server with 1 node runs > on 8983, here I want to synch up the collection between solrcloud A > and B using the below replication handler. Is this advisable to use at > the solrcloud B ? > > > > > > > > http://solrcloudA:8983/solr/${solr.core.name}/replication > > 00:00:20 > > > > > > > > > > > > Dinesh Sundaram > > MBS Platform Engineering > > > > Mastercard > > > > > > > > -Original Message- > > From: Shawn Heisey [mailto:apa...@elyograg.org] > > Sent: Tuesday, January 2, 2018 5:33 PM > > To: solr-user@lucene.apache.org > > Subject: Re: Solrcloud with Master/Slave > > > > On 1/2/2018 3:32 PM, Sundaram, Dinesh wrote: > >> I have spun up single solrcloud node on 2 servers. > > > > This makes no sense. If you have two servers, then you probably > > have > more than a single node. > > > >> tried to sy
RE: Solrcloud with Master/Slave
I want to keep both collections in sync always. This is really working fine without any issue so far. My problem is pretty straightforward. I'm starting two solr instances on two servers using the below command. I believe this command is for solrcloud mode. If so, then I have that shared replication handler config in my _default/solrconfig.xml on one instance so that the slave instance will sync with the master. I don't use zookeeper at all, just the replication handler setting in solrconfig.xml. Is this good for the long term? If not, please help me understand the issues. bin/solr start -cloud -p 8983 -noprompt Dinesh Sundaram MBS Platform Engineering Mastercard -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Thursday, January 4, 2018 10:10 AM To: solr-user Subject: Re: Solrcloud with Master/Slave Whoa. I don't think you should be doing this at all. This really appears to be an XY problem. You're asking "how to do X" without telling us what the problem you're trying to solve is (the Y). _Why_ do you want to set things up this way? A one-time synchronization or to keep both collections in sync? Cross Data Center Replication (CDCR) is designed to keep two separate collections in sync on an ongoing basis. If this is a one-time deal, you can manually issue a replication API "fetchindex" command. What I'd do in that case is set up your collection B with each shard having exactly one replica (i.e. a leader and no followers). Do the fetch and verify that your new collection is as you want it then ADDREPLICA to build out your redundancy. Best, Erick On Thu, Jan 4, 2018 at 8:01 AM, Sundaram, Dinesh wrote: > Thanks Shawn for your prompt response. Assume I have solrcloud A server with > 1 node runs on 8983 port and solrcloud B server with 1 node runs on 8983, > here I want to synch up the collection between solrcloud A and B using the > below replication handler. Is this advisable to use at the solrcloud B ? 
> > > > name="masterUrl">http://solrcloudA:8983/solr/${solr.core.name}/replication > 00:00:20 > > > > > > Dinesh Sundaram > MBS Platform Engineering > > Mastercard > > > > -Original Message- > From: Shawn Heisey [mailto:apa...@elyograg.org] > Sent: Tuesday, January 2, 2018 5:33 PM > To: solr-user@lucene.apache.org > Subject: Re: Solrcloud with Master/Slave > > On 1/2/2018 3:32 PM, Sundaram, Dinesh wrote: >> I have spun up single solrcloud node on 2 servers. > > This makes no sense. If you have two servers, then you probably have more > than a single node. > >> tried to synch up the data b/w those servers via zookeeper > > This is not done with zookeeper. SolrCloud should handle it automatically. > SolrCloud uses the zookeeper database to *coordinate* keeping machines in > sync, but it's Solr that does the work, not zookeeper. > > This makes even less sense when taken in context with the previous sentence. > If you only have a single node, then you can't possibly sync between them. > >> but didn’t work well due to out of memory issues, ensemble issues >> with multiple ports connectivity. So had to move to Master slave >> replication b/w those 2 solrcloud nodes. I couldn’t find any issues >> so far. Is this advisable? Because I’m wondering that looks like >> mixing up solrcloud and master/slave replication. > > If you're getting OOME problems, then whatever program threw the OOME most > likely needs more heap. Or you need to take steps to reduce the amount of > heap that's required. Note that this second option might not actually be > possible ... increasing the heap is probably the only option you have. Since > version 5.0, Solr has shipped with the default heap set to 512MB, which is > extremely small. 
Most users need to increase it. > > You can't mix master-slave replication and SolrCloud. SolrCloud takes over > the replication feature for its own purposes. Trying to mix these is going > to cause you problems. You may not run into the problems immediately, but it > is likely that you would run into a problem eventually. Data loss would be > possible. > > The latest versions of Solr have new SolrCloud replication types that closely > mimic the old master-slave replication. > > Perhaps you should start over and describe what you've actually seen -- > exactly what you've done and configure
RE: Solrcloud with Master/Slave
Thanks Shawn for your prompt response. Assume I have a solrcloud A server with 1 node running on port 8983 and a solrcloud B server with 1 node running on 8983; here I want to sync up the collection between solrcloud A and B using the below replication handler. Is this advisable to use at solrcloud B? http://solrcloudA:8983/solr/${solr.core.name}/replication 00:00:20 Dinesh Sundaram MBS Platform Engineering Mastercard -Original Message- From: Shawn Heisey [mailto:apa...@elyograg.org] Sent: Tuesday, January 2, 2018 5:33 PM To: solr-user@lucene.apache.org Subject: Re: Solrcloud with Master/Slave On 1/2/2018 3:32 PM, Sundaram, Dinesh wrote: > I have spun up single solrcloud node on 2 servers. This makes no sense. If you have two servers, then you probably have more than a single node. > tried to synch up the data b/w those servers via zookeeper This is not done with zookeeper. SolrCloud should handle it automatically. SolrCloud uses the zookeeper database to *coordinate* keeping machines in sync, but it's Solr that does the work, not zookeeper. This makes even less sense when taken in context with the previous sentence. If you only have a single node, then you can't possibly sync between them. > but didn’t work well due to out of memory issues, ensemble issues with > multiple ports connectivity. So had to move to Master slave > replication b/w those 2 solrcloud nodes. I couldn’t find any issues so > far. Is this advisable? Because I’m wondering that looks like mixing > up solrcloud and master/slave replication. If you're getting OOME problems, then whatever program threw the OOME most likely needs more heap. Or you need to take steps to reduce the amount of heap that's required. Note that this second option might not actually be possible ... increasing the heap is probably the only option you have. Since version 5.0, Solr has shipped with the default heap set to 512MB, which is extremely small. 
You can't mix master-slave replication and SolrCloud. SolrCloud takes over the replication feature for its own purposes. Trying to mix these is going to cause you problems. You may not run into the problems immediately, but it is likely that you would run into a problem eventually. Data loss would be possible. The latest versions of Solr have new SolrCloud replication types that closely mimic the old master-slave replication. Perhaps you should start over and describe what you've actually seen -- exactly what you've done and configured, and how the results differed from your expectations. Precise commands entered will be helpful. Thanks, Shawn CONFIDENTIALITY NOTICE This e-mail message and any attachments are only for the use of the intended recipient and may contain information that is privileged, confidential or exempt from disclosure under applicable law. If you are not the intended recipient, any disclosure, distribution or other use of this e-mail message or attachments is prohibited. If you have received this e-mail message in error, please delete and notify the sender immediately. Thank you.
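For reference, the newer replication types Shawn alludes to are the TLOG and PULL replica types introduced in Solr 7.0; a collection that mimics old-style master/slave pull replication can be created along these lines (host, collection name and replica counts are placeholders; check the Collections API documentation for your version):

```shell
# one TLOG replica (can become leader) plus one PULL replica, which copies
# the index from the leader much like an old-style slave -- a sketch only
curl 'http://localhost:8983/solr/admin/collections?action=CREATE&name=mycoll&numShards=1&tlogReplicas=1&pullReplicas=1'
```

Unlike hand-wired master/slave handlers, this stays inside SolrCloud, so leader election and cluster state remain consistent.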
Solrcloud with Master/Slave
Hi, I have spun up a single SolrCloud node on 2 servers. I tried to sync up the data b/w those servers via zookeeper, but it didn't work well due to out-of-memory issues and ensemble issues with multiple ports connectivity, so I had to move to master/slave replication b/w those 2 SolrCloud nodes. I couldn't find any issues so far. Is this advisable? I'm asking because it looks like mixing up SolrCloud and master/slave replication. Dinesh Sundaram MBS Platform Engineering Mastercard
RE: Solr ssl issue while creating collection
Thanks Erick for your valuable reply. Much Appreciated !!! Dinesh Sundaram MBS Platform Engineering Mastercard -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Friday, December 15, 2017 5:17 PM To: solr-user Subject: Re: Solr ssl issue while creating collection No. ZooKeeper is an integral part of SolrCloud, without it you don't _have_ SolrCloud. Best, Erick On Fri, Dec 15, 2017 at 1:03 PM, Sundaram, Dinesh wrote: > Thanks again for your valuable reply. Yes that’s correct. Is there a way to > start solr alone without any embedded/external zookeeper in solrcloud mode? > > > Dinesh Sundaram > MBS Platform Engineering > > Mastercard > > > > -Original Message- > From: Shawn Heisey [mailto:apa...@elyograg.org] > Sent: Wednesday, December 13, 2017 4:54 PM > To: solr-user@lucene.apache.org > Subject: Re: Solr ssl issue while creating collection > > On 12/13/2017 3:16 PM, Sundaram, Dinesh wrote: >> Thanks Shawn for your input, Is this errors specific only for zookeeper >> operations? If so is there any way to turn off default zookeeper which runs >> on 9983? > > If you don't want to start the embedded zookeeper, then you want to be sure > that you have a zkHost defined which lists all of the hosts in your external > ensemble. You can either define ZK_HOST in the include script, or use the -z > option when starting Solr manually. When Solr is provided with information > about ZK hosts, it does NOT start the embedded ZK. > > The exceptions you're seeing have nothing to do with zookeeper. The latest > exception you mentioned is caused by one SolrCloud instance sending HTTPS > requests to another SolrCloud instance, and failing to validate SSL because > the hostname doesn't match the info in the certificate. 
> > Thanks, > Shawn
RE: Solr ssl issue while creating collection
Thanks again for your valuable reply. Yes that’s correct. Is there a way to start solr alone without any embedded/external zookeeper in solrcloud mode? Dinesh Sundaram MBS Platform Engineering Mastercard -Original Message- From: Shawn Heisey [mailto:apa...@elyograg.org] Sent: Wednesday, December 13, 2017 4:54 PM To: solr-user@lucene.apache.org Subject: Re: Solr ssl issue while creating collection On 12/13/2017 3:16 PM, Sundaram, Dinesh wrote: > Thanks Shawn for your input, Is this errors specific only for zookeeper > operations? If so is there any way to turn off default zookeeper which runs > on 9983? If you don't want to start the embedded zookeeper, then you want to be sure that you have a zkHost defined which lists all of the hosts in your external ensemble. You can either define ZK_HOST in the include script, or use the -z option when starting Solr manually. When Solr is provided with information about ZK hosts, it does NOT start the embedded ZK. The exceptions you're seeing have nothing to do with zookeeper. The latest exception you mentioned is caused by one SolrCloud instance sending HTTPS requests to another SolrCloud instance, and failing to validate SSL because the hostname doesn't match the info in the certificate. Thanks, Shawn
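The include-script setting Shawn refers to looks roughly like this (hostnames are placeholders for your external ensemble; when ZK_HOST is set, Solr does not start the embedded ZooKeeper):

```shell
# solr.in.sh (solr.in.cmd on Windows) -- point Solr at an external ensemble
ZK_HOST="zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181"
```

The equivalent one-off form is bin/solr start -c -z zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181.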
RE: Solr ssl issue while creating collection
Thanks Shawn for your input. Are these errors specific only to zookeeper operations? If so, is there any way to turn off the default zookeeper which runs on 9983? Dinesh Sundaram MBS Platform Engineering Mastercard -Original Message- From: Shawn Heisey [mailto:apa...@elyograg.org] Sent: Wednesday, December 13, 2017 11:38 AM To: solr-user@lucene.apache.org Subject: Re: Solr ssl issue while creating collection On 12/13/2017 10:06 AM, Sundaram, Dinesh wrote: > Thanks Shawn, this helps. Now getting the below exception, is there any way > to avoid verifying this? > > 2017-12-13 17:00:39.239 DEBUG > (httpShardExecutor-4-thread-1-processing-n:xx.xx.xx.xx:8983_solr > [https:xx.xx.xx.xx:8983//solr] > > https:xx.xx.xx.xx:8983//solr) > [ ] o.a.h.c.s.DefaultHostnameVerifier Certificate for > doesn't match common name of the certificate subject: xx.xx.xx.xx.com > javax.net.ssl.SSLPeerUnverifiedException: Certificate for > doesn't match common name of the certificate subject: > xx.xx.xx.xx.com If you're running 6.x, then you can disable the hostname verification. But if you're running 7.x, there's a bug that breaks it: https://issues.apache.org/jira/browse/SOLR-9304 There's a patch on the issue, but it hasn't been tested, so I have no idea whether it works.
Even if it works, the patch is incomplete because it doesn't have a test to verify the problem doesn't happen again. An alternate idea would be to add all the possible hostnames to the certificate you're using, and make sure the trust stores are valid, so all of the cert verification will work. Thanks, Shawn
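For the 6.x case Shawn describes, hostname verification for inter-node SSL traffic is governed by the solr.ssl.checkPeerName system property, which the start scripts expose through the include script; a sketch (and per the message above, SOLR-9304 means this may not take effect on 7.x until that bug is fixed):

```shell
# solr.in.sh -- skip peer-name checking for Solr's inter-node SSL clients
SOLR_SSL_CHECK_PEER_NAME=false
```

Adding all node hostnames to the certificate's subject alternative names, as Shawn suggests, is the safer fix since it keeps verification on.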
RE: Solr ssl issue while creating collection
Thanks Shawn, this helps. Now getting the below exception, is there any way to avoid verifying this? 2017-12-13 17:00:39.239 DEBUG (httpShardExecutor-4-thread-1-processing-n:xx.xx.xx.xx:8983_solr [https:xx.xx.xx.xx:8983//solr] https:xx.xx.xx.xx:8983//solr) [ ] o.a.h.c.s.DefaultHostnameVerifier Certificate for doesn't match common name of the certificate subject: xx.xx.xx.xx.com javax.net.ssl.SSLPeerUnverifiedException: Certificate for doesn't match common name of the certificate subject: xx.xx.xx.xx.com 2017-12-13 17:00:39.242 ERROR (OverseerThreadFactory-8-thread-1-processing-n:xx.xx.xx.xx:8983_solr) [ ] o.a.s.c.OverseerCollectionMessageHandler Error from shard: https://xx.xx.xx.xx:8983/solr org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: https://xx.xx.xx.xx:8983/solr at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:640) at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:253) at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:242) at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1219) at org.apache.solr.handler.component.HttpShardHandler.lambda$submit$0(HttpShardHandler.java:172) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176) at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: javax.net.ssl.SSLPeerUnverifiedException: Certificate for doesn't match any of the subject 
alternative names: [] Dinesh Sundaram MBS Platform Engineering Mastercard -Original Message- From: Shawn Heisey [mailto:apa...@elyograg.org] Sent: Monday, December 11, 2017 2:26 PM To: solr-user@lucene.apache.org Subject: Re: Solr ssl issue while creating collection On 12/11/2017 12:24 PM, Sundaram, Dinesh wrote: > 1. Configure SSL > using > https://lucene.apache.org/solr/guide/7_1/enabling-ssl.html > > 2. Restart solr > 3. Validate solr with https url > https://localhost:8983/solr - works > fine 4. Create a collection > https://localhost:8983/solr/#/~collections > 5. here is the response : > Connection to Solr lost > Please check the Solr instance.
> 6.Server solr.log: here notice the replica call goes to http port > instead of https > > 2017-12-11 11:52:27.929 ERROR > (OverseerThreadFactory-8-thread-1-processing-n:localhost:8983_solr) [ > ] o.a.s.c.OverseerCollectionMessageHandler Error from > shard: > http://localhost:8983/solr > This acts like either you did not set the urlScheme cluster property in zookeeper to https, or that you did not restart your Solr instances after making that change. Setting the property is described on the page you referenced in the "SSL with SolrCloud" section. Note that it also appears your So
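The urlScheme cluster property Shawn mentions can be set with the zkcli script shipped under server/scripts/cloud-scripts (the ZooKeeper address below assumes the embedded ZK on port 9983; adjust for an external ensemble, and restart every Solr node afterwards):

```shell
server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:9983 \
  -cmd clusterprop -name urlScheme -val https
```

Until this property is https, nodes keep advertising http addresses in cluster state, which is why the Overseer call in the log above went to http.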
Solr ssl issue while creating collection
Hi, How do I change the protocol to https everywhere, including replicas? NOTE: I have just one node, on port 8983. I started Solr using this command: bin/solr start -cloud -p 8983 -noprompt 1. Configure SSL using https://lucene.apache.org/solr/guide/7_1/enabling-ssl.html 2. Restart solr 3. Validate solr with https url https://localhost:8983/solr - works fine 4. Create a collection https://localhost:8983/solr/#/~collections 5. Here is the response: Connection to Solr lost Please check the Solr instance. 6. Server solr.log: notice here that the replica call goes to the http port instead of https 2017-12-11 11:52:27.929 ERROR (OverseerThreadFactory-8-thread-1-processing-n:localhost:8983_solr) [ ] o.a.s.c.OverseerCollectionMessageHandler Error from shard: http://localhost:8983/solr org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: http://localhost:8983/solr at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:640) at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:253) at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:242) at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1219) at org.apache.solr.handler.component.HttpShardHandler.lambda$submit$0(HttpShardHandler.java:172) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176) at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: 
org.apache.http.client.ClientProtocolException at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:187) at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83) at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56) at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:525) ... 12 more Caused by: org.apache.http.ProtocolException: The server failed to respond with a valid HTTP response at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:149) at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56) at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259) at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163) at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:165) at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273) at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125) at org.apache.solr.util.stats.InstrumentedHttpRequestExecutor.execute(InstrumentedHttpRequestExecutor.java:118) at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272) at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185) at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89) at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111) at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185) ... 
15 more Dinesh Sundaram MBS Platform Engineering Mastercard
Re: how to achieve multiple wild card searches in solr 5.2.1
Thanks Erick, I tried making it a string field, but I need to compress the part first and then look for the wildcard search, and with a string field I cannot do that. How do I achieve this? On Wed, Jan 4, 2017 at 2:52 AM, Erick Erickson wrote: > My guess is that you're searching on a _tokenized_ field and that > you'd get the results you expect on a string field.. > > Add &debug=query to the URL and you'll see what the parsed query is > and that'll give you a very good idea of what's actually happening. > > Best, > Erick > > On Tue, Jan 3, 2017 at 7:16 AM, dinesh naik > wrote: > > Hi all, > > How can we achieve multiple wild card searches in solr? > > > > For example: I am searching for AB TEST1.EC*TEST2* > > But I also get results for AB TEST1.EC*TEST3*, AB TEST1.EC*TEST4*, etc., instead > > of AB TEST1.EC*TEST2* > > > > It seems only the first * is being considered; the second * is not considered > > for wildcard match. > > -- > > Best Regards, > > Dinesh Naik > -- Best Regards, Dinesh Naik
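Erick's tokenized-vs-string point can be imitated outside Solr. The sketch below is a rough Python analogy only: a crude lowercase/split tokenizer stands in for the field's analysis chain, and fnmatch stands in for wildcard terms. It shows why the TEST3/TEST4 documents come back on a tokenized field but not when one pattern is matched against the whole stored value:

```python
import fnmatch
import re

def tokenize(text):
    # crude stand-in for Solr's analysis chain: lowercase, then split on
    # whitespace and periods (a real tokenizer differs, but the idea holds)
    return [t for t in re.split(r"[\s.]+", text.lower()) if t]

docs = ["AB TEST1.ECTEST2 X", "AB TEST1.ECTEST3 X"]

# string-field behaviour: one wildcard pattern against the whole value
string_hits = [d for d in docs
               if fnmatch.fnmatchcase(d.lower(), "ab test1.ec*test2*")]

# tokenized-field behaviour: the query is split into clauses too, and with
# the default OR operator a document matches if ANY clause matches ANY of
# its tokens -- so the TEST3 document comes back as well
clauses = ["ab", "test1", "ec*test2*"]   # rough parse of AB TEST1.EC*TEST2*
token_hits = [d for d in docs
              if any(fnmatch.fnmatchcase(t, c)
                     for c in clauses for t in tokenize(d))]

print(string_hits)   # only the TEST2 document
print(token_hits)    # both documents
```

On the real index, &debug=query shows the same thing: against a tokenized field the parsed query is several independent clauses, not one pattern over the whole value.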
how to achieve multiple wild card searches in solr 5.2.1
Hi all, How can we achieve multiple wildcard searches in Solr? For example: I am searching for AB TEST1.EC*TEST2*, but I also get results for AB TEST1.EC*TEST3*, AB TEST1.EC*TEST4*, etc., instead of only AB TEST1.EC*TEST2*. It seems only the first * is being considered; the second * is not considered for the wildcard match. -- Best Regards, Dinesh Naik
Re: solr-5.2.1: All folders in solr box(Linux) are sitting in RAM
Hi Erick, Thanks a lot. Got the point. On Sep 21, 2016 10:18 PM, "Erick Erickson" wrote: > Why do you want to avoid this? Having the index in RAM (either the JVM or > OS) > is essential to fast querying. Perhaps you're being mislead by the > MMapDirectory's > consumption of the OS memory? See Uwe's excellent article here: > > http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html > > Best, > Erick > > On Wed, Sep 21, 2016 at 1:45 AM, dinesh naik > wrote: > > Hi all, > > > > i have a linux box with 48GB RAM . > > > > In this box i have solr and jdk installed. i have few other folders as > > well. > > > > [solruser@server1 ~]$ du -sh * > > 4.0K clusterstate.json > > 1.5M Conf > > 15G jdk1.8.0_25 > > 151M jdk_old > > 262M jvm_1.7 > > 538M scripts > > 11G solrhome > > > > My actual index size is 9GB (inside solr installation directory solrhome) > > . In solr admin UI the physical memory shows 32GB. > > > > It seems all the folders are sitting in RAM . Kindly suggest how can i > > avoid this? > > > > -- > > Best Regards, > > Dinesh Naik >
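Uwe's article boils down to this: MMapDirectory maps the index files into the process's virtual address space, so reads are served from the OS page cache rather than copied onto the Java heap, and that cache memory is reclaimable whenever another process needs it. A toy Python sketch of file-backed mapping (an analogy only, not Lucene code):

```python
import mmap
import os
import tempfile

# write a stand-in "index segment" to disk
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"fake index segment " * 1024)
    path = f.name

# map the file instead of reading it onto the heap: pages are faulted in
# by the kernel on access and live in the OS page cache, which the OS can
# reclaim under pressure -- this is why folders "sitting in RAM" is fine
with open(path, "rb") as fh:
    mm = mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ)
    head = bytes(mm[:4])
    mm.close()

os.unlink(path)
print(head)
```

The 32GB "physical memory" figure in the admin UI is machine-wide usage, and on a box like this it is likely mostly this reclaimable cache rather than memory that is truly tied up.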
solr-5.2.1: All folders in solr box(Linux) are sitting in RAM
Hi all, I have a Linux box with 48GB RAM. In this box I have Solr and a JDK installed; I have a few other folders as well. [solruser@server1 ~]$ du -sh * 4.0K clusterstate.json 1.5M Conf 15G jdk1.8.0_25 151M jdk_old 262M jvm_1.7 538M scripts 11G solrhome My actual index size is 9GB (inside the Solr installation directory, solrhome). In the Solr admin UI the physical memory shows 32GB. It seems all the folders are sitting in RAM. Kindly suggest how I can avoid this? -- Best Regards, Dinesh Naik
Which handler handles calls to retrieve java properties and Cores
Hi, My requirement is that the Solr admin should not display any sensitive information for hackers to utilise. I would like to suppress calls to java properties and cores (URLs given below), meaning I don't want Solr to return results for these URLs. http://localhost:8080/solr/admin/info/properties?wt=json http://localhost:8080/solr/admin/cores?wt=json Does anyone know which handler handles these calls? (I commented out the admin handler, see below, in solrconfig.xml, but no luck.) Or is there any other mechanism to stop returning java properties/cores information? Thanks, Dinesh
Prevent the SSL Truststore password from showing up in plain text the Solr Admin
Hi, In the Solr admin console, the Java Properties page shows the “javax.net.ssl.trustStorePassword” password in plain text. “javax.net.ssl.trustStorePassword” is the password provided for the Java trust store that holds the trusted SSL certificates. 1) Is there a way to mask the password in Solr? 2) If not, is it possible to hide the “Java Properties” option? 3) Or what is the recommended best practice for this issue? Thanks, Dinesh
Re: Different boost values for multiple parsers in Solr 5.2.1
Hi Upayavira, We have an issue here. The boosting works as expected when we run the query from the Admin console, where we pass the q and bq params as below. q=(((_query_:"{!synonym_edismax qf='itemname OR itemnumber OR itemdesc' v='HTC' bq='' mm=100 synonyms=true synonyms.constructPhrases=true synonyms.ignoreQueryOperators=true}") OR (itemname:"HTC" OR itemnamecomp:HTC* OR itemnumber:"HTC" OR itemnumbercomp:HTC* OR itemdesc:"HTC"~500)) AND (warehouse:Ind02 OR warehouse:Ind03 OR warehouse:Ind04 )) bq=warehouse:Ind02^1000 This works absolutely fine when tried from the Admin console. But when we use the SolrJ API, we are not getting the expected boost value returned in the score field. We are using the SolrQuery class for adding the bq parameter: queryEngine.set("bq", boostQuery); where boostQuery is: warehouse:Ind02^1000 How can we handle this? Is this because of bq='' being used for the synonym_edismax parser? On Tue, Sep 8, 2015 at 5:49 PM, dinesh naik wrote: > Thanks Alot Upayavira. It worked as expected. > > > On Tue, Sep 8, 2015 at 2:09 PM, Upayavira wrote: >> you can add bq= inside your {!synonym_edismax} section, if you wish and >> it will apply to that query parser only. >> >> Upayavira >> >> On Mon, Sep 7, 2015, at 03:05 PM, dinesh naik wrote: >> > Please find below the detail: >> > >> > My main query is like this: >> > >> > q=(((_query_:"{!synonym_edismax qf='itemname OR itemnumber OR itemdesc' >> > v='HTC' mm=100 synonyms=true synonyms.constructPhrases=true >> > synonyms.ignoreQueryOperators=true}") OR (itemname:"HTC" OR >> > itemnamecomp:HTC* OR itemnumber:"HTC" OR itemnumbercomp:HTC* OR >> > itemdesc:"HTC"~500)) AND (warehouse:Ind02 OR warehouse:Ind03 OR >> > warehouse:Ind04 )) >> > >> > Giving Boost of 1000 for warehouse Ind02 >> > using below parameter: >> > >> > bq=warehouse:Ind02^1000 >> > >> > >> > Here i am expecting a boost of 1004 but , somehow 1000 is added extra >> may >> > be because of my additional parser. How can i avoid this? 
>> > >> > >> > Debug information for the boost : >> > >> > >> > 2004.0 = sum of: >> > 1004.0 = sum of: >> > 1003.0 = sum of: >> > 1001.0 = sum of: >> > 1.0 = max of: >> > 1.0 = weight(itemname:HTC in 235500) [CustomSimilarity], >> result >> > of: >> > 1.0 = fieldWeight in 235500, product of: >> > 1.0 = tf(freq=1.0), with freq of: >> > 1.0 = termFreq=1.0 >> > 1.0 = idf(docFreq=26, maxDocs=1738053) >> > 1.0 = fieldNorm(doc=235500) >> > 1000.0 = weight(warehouse:e02^1000.0 in 235500) >> > [CustomSimilarity], >> > result of: >> > 1000.0 = score(doc=235500,freq=1.0), product of: >> > 1000.0 = queryWeight, product of: >> > 1000.0 = boost >> > 1.0 = idf(docFreq=416190, maxDocs=1738053) >> > 1.0 = queryNorm >> > 1.0 = fieldWeight in 235500, product of: >> > 1.0 = tf(freq=1.0), with freq of: >> > 1.0 = termFreq=1.0 >> > 1.0 = idf(docFreq=416190, maxDocs=1738053) >> > 1.0 = fieldNorm(doc=235500) >> > 2.0 = sum of: >> > 1.0 = weight(itemname:HTC in 235500) [CustomSimilarity], result >> > of: >> > 1.0 = fieldWeight in 235500, product of: >> > 1.0 = tf(freq=1.0), with freq of: >> > 1.0 = termFreq=1.0 >> > 1.0 = idf(docFreq=26, maxDocs=1738053) >> > 1.0 = fieldNorm(doc=235500) >> > 1.0 = itemnamecomp:HTC*, product of: >> > 1.0 = boost >> > 1.0 = queryNorm >> > 1.0 = sum of: >> > 1.0 = weight(warehouse:e02 in 235500) [CustomSimilarity], result >> > of: >> > 1.0 = fieldWeight in 235500, product of: >> > 1.0 = tf(freq=1.0), with freq of: >> > 1.0 = termFreq=1.0 >> > 1.0 = idf(docFreq=416190, maxDocs=1738053) >> > 1.0 = fieldNorm(doc=235500) >> > 1000.0 = weight(warehouse:e02^1000.0 in 235500) [CustomSimilarity], >> > result of: >> > 1000.0 = score(doc=235500,freq=1.0), product of: >> > 1000.0 = qu
Re: Different boost values for multiple parsers in Solr 5.2.1
Thanks a lot, Upayavira. It worked as expected.

On Tue, Sep 8, 2015 at 2:09 PM, Upayavira wrote:
> you can add bq= inside your {!synonym_edismax} section, if you wish and
> it will apply to that query parser only.
>
> Upayavira
>
> On Mon, Sep 7, 2015, at 03:05 PM, dinesh naik wrote:
> > Please find below the detail:
> > [...]

-- Best Regards, Dinesh Naik
Re: Different boost values for multiple parsers in Solr 5.2.1
Please find below the detail:

My main query is like this:

q=(((_query_:"{!synonym_edismax qf='itemname OR itemnumber OR itemdesc' v='HTC' mm=100 synonyms=true synonyms.constructPhrases=true synonyms.ignoreQueryOperators=true}") OR (itemname:"HTC" OR itemnamecomp:HTC* OR itemnumber:"HTC" OR itemnumbercomp:HTC* OR itemdesc:"HTC"~500)) AND (warehouse:Ind02 OR warehouse:Ind03 OR warehouse:Ind04))

Giving a boost of 1000 for warehouse Ind02 using the parameter below:

bq=warehouse:Ind02^1000

Here I am expecting a boost of 1004, but 1000 is added extra, maybe because of my additional parser. How can I avoid this?

Debug information for the boost:

2004.0 = sum of:
  1004.0 = sum of:
    1003.0 = sum of:
      1001.0 = sum of:
        1.0 = max of:
          1.0 = weight(itemname:HTC in 235500) [CustomSimilarity], result of:
            1.0 = fieldWeight in 235500, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              1.0 = idf(docFreq=26, maxDocs=1738053)
              1.0 = fieldNorm(doc=235500)
        1000.0 = weight(warehouse:e02^1000.0 in 235500) [CustomSimilarity], result of:
          1000.0 = score(doc=235500,freq=1.0), product of:
            1000.0 = queryWeight, product of:
              1000.0 = boost
              1.0 = idf(docFreq=416190, maxDocs=1738053)
              1.0 = queryNorm
            1.0 = fieldWeight in 235500, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              1.0 = idf(docFreq=416190, maxDocs=1738053)
              1.0 = fieldNorm(doc=235500)
      2.0 = sum of:
        1.0 = weight(itemname:HTC in 235500) [CustomSimilarity], result of:
          1.0 = fieldWeight in 235500, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            1.0 = idf(docFreq=26, maxDocs=1738053)
            1.0 = fieldNorm(doc=235500)
        1.0 = itemnamecomp:HTC*, product of:
          1.0 = boost
          1.0 = queryNorm
    1.0 = sum of:
      1.0 = weight(warehouse:e02 in 235500) [CustomSimilarity], result of:
        1.0 = fieldWeight in 235500, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          1.0 = idf(docFreq=416190, maxDocs=1738053)
          1.0 = fieldNorm(doc=235500)
  1000.0 = weight(warehouse:e02^1000.0 in 235500) [CustomSimilarity], result of:
    1000.0 = score(doc=235500,freq=1.0), product of:
      1000.0 = queryWeight, product of:
        1000.0 = boost
        1.0 = idf(docFreq=416190, maxDocs=1738053)
        1.0 = queryNorm
      1.0 = fieldWeight in 235500, product of:
        1.0 = tf(freq=1.0), with freq of:
          1.0 = termFreq=1.0
        1.0 = idf(docFreq=416190, maxDocs=1738053)
        1.0 = fieldNorm(doc=235500)

On Mon, Sep 7, 2015 at 7:21 PM, dinesh naik wrote:
> Hi all,
>
> Is there a way to apply a different boost, using the bq parameter, for
> different parsers?
>
> for example if i am using a synonym parser and edismax parser in a single
> query, my bq param value is getting applied for both the parsers, making
> the boost value double.
>
> --
> Best Regards,
> Dinesh Naik

-- Best Regards, Dinesh Naik
Different boost values for multiple parsers in Solr 5.2.1
Hi all, Is there a way to apply a different boost, using the bq parameter, for each parser? For example, if I am using a synonym parser and the edismax parser in a single query, my bq param value is getting applied for both parsers, doubling the boost value. -- Best Regards, Dinesh Naik
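Upayavira's answer in this thread is to move bq into the `{!synonym_edismax ...}` local params so the boost applies only to that sub-query rather than to every dismax-style parser in the request. A minimal sketch of such a request follows; the collection name `collection1` is an assumption, and `synonym_edismax` is a third-party extension of edismax, so verify it honours `bq` as a local param in your version.

```python
# Sketch: per-parser boost by placing bq inside the local params
# (assumption: collection name "collection1"; field names from the thread).
from urllib.parse import urlencode

inner = ("{!synonym_edismax qf='itemname itemnumber itemdesc' "
         "bq='warehouse:Ind02^1000' mm=100 synonyms=true v='HTC'}")
params = {
    "q": inner,
    "debugQuery": "true",  # the explain tree should show the 1000 boost once
}
url = "http://localhost:8983/solr/collection1/select?" + urlencode(params)
print(url)
```

A bq passed as a top-level request parameter, by contrast, is picked up by every (e)dismax-style parser invoked in the request, which is what doubled the boost here.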
Solr 5.2 index time field boost not working as expected
Hi all, We need to boost a field in a document if the field matches certain criteria. For example: if title contains "Secrete", then we want to boost the field to 100. For this we have the below code in the SolrJ API while indexing the document:

Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
SolrInputDocument doc = new SolrInputDocument();
doc.addField("title", "Secrete", 100.0f); // Field Boost
doc.addField("id", 11);
doc.addField("modelnumber", "AK10005");
doc.addField("name", "XX5");
docs.add(doc);

Also, we made omitNorms="false" for this field in schema.xml. But still we do not see this document coming at the top. Is there any other setting which has to be done for index-time boosting?

Best Regards, Dinesh Naik -- Best Regards, Dinesh Naik
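For reference, the equivalent raw XML update message for the SolrJ snippet above can be sketched as below (the handler URL and values mirror the mail). Two points worth noting: the per-field index-time boost is folded into the field norm, so omitNorms must be false (as already done here), and in classic Lucene scoring the norm is quantized to a single byte, so the boost's effect is coarse and only visible relative to other matches on the same field.

```python
# Sketch: build the XML <add> message with an index-time boost on "title".
import xml.etree.ElementTree as ET

doc = ET.Element("doc")
title = ET.SubElement(doc, "field", name="title", boost="100.0")  # field boost
title.text = "Secrete"
for name, value in [("id", "11"), ("modelnumber", "AK10005"), ("name", "XX5")]:
    field = ET.SubElement(doc, "field", name=name)
    field.text = value

add = ET.Element("add")
add.append(doc)
payload = ET.tostring(add, encoding="unicode")
print(payload)  # POST this to /solr/<collection>/update
```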
Re: Restore index API does not work in solr 5.1.0 ?
Hi all, How can we restore index in Solr 5.1.0 ? Best Regards, Dinesh Naik On Thu, Jul 9, 2015 at 6:54 PM, dinesh naik wrote: > Hi all, > > How can we restore the index in Solr 5.1.0 ? > > We did following: > > 1:- Started Solr Cloud from: > > bin/solr start -e cloud -noprompt > > > > 2:- posted some documents to solr from examples folder using : > > java -Dc=gettingstarted -jar post.jar *.xml > > > > 3:- Backed up the Index using: > > http://localhost:8983/solr/gettingstarted/replication?command=backup > > > > 4:- Deleted 1 document using: > > > http://localhost:8983/solr/gettingstarted/update?stream.body=id:"IW-02"&commit=true > > > > 5:- restored the index using: > > http://localhost:8983/solr/gettingstarted/replication?command=restore > > > > The Restore works fine with same steps for 5.2 versions but not 5.1 > > Is there any other way to restore index in Solr 5.1.0? > > -- > Best Regards, > Dinesh Naik > -- Best Regards, Dinesh Naik
Restore index API does not work in solr 5.1.0 ?
Hi all, How can we restore the index in Solr 5.1.0 ? We did following: 1:- Started Solr Cloud from: bin/solr start -e cloud -noprompt 2:- posted some documents to solr from examples folder using : java -Dc=gettingstarted -jar post.jar *.xml 3:- Backed up the Index using: http://localhost:8983/solr/gettingstarted/replication?command=backup 4:- Deleted 1 document using: http://localhost:8983/solr/gettingstarted/update?stream.body=id:"IW-02"&commit=true 5:- restored the index using: http://localhost:8983/solr/gettingstarted/replication?command=restore The Restore works fine with same steps for 5.2 versions but not 5.1 Is there any other way to restore index in Solr 5.1.0? -- Best Regards, Dinesh Naik
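The calls used in the steps above can be sketched as below. The restore command was only added to the replication handler in Solr 5.2 (SOLR-6637), which matches the observation that the same steps work on 5.2 but not 5.1; on 5.1 the usual fallback is to stop the core and copy the snapshot directory over data/index by hand. The snapshot name is an illustrative assumption.

```python
# Sketch: backup and restore via the replication handler (Solr >= 5.2).
from urllib.parse import urlencode

base = "http://localhost:8983/solr/gettingstarted/replication"
backup_url = base + "?" + urlencode({"command": "backup", "name": "snap1"})
restore_url = base + "?" + urlencode({"command": "restore", "name": "snap1"})
status_url = base + "?" + urlencode({"command": "restorestatus"})
print(backup_url)
print(restore_url)
print(status_url)
```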
Re: Synonym with Proximity search in solr 5.1.0
Hi Alessandro, I have gone through the above suggested links, but I am not able to achieve the expected result. The issue here is that my searched text is part of the field 'text' ("I like nokia mobile"); the searched text is "nokia mobile"~500.

Best Regards,
Dinesh Naik

On Wed, Jul 8, 2015 at 8:36 PM, Alessandro Benedetti < benedetti.ale...@gmail.com> wrote:
> Showing your debug query would clarify the situation, but I assume you got
> into a classic multi-word synonym problem[1].
> Hope the documents I pointed out are good for you.
>
> Cheers
>
> [1] http://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/
> [2] http://lucidworks.com/blog/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/
>
> 2015-07-08 15:47 GMT+01:00 dinesh naik :
>
> > Hi,
> >
> > We have a synonym file with below content:
> >
> > cell phone,nokia mobile
> >
> > And we have 3 documents:
> >
> > doc1:
> > 1001
> > Doc 1
> > I like nokia mobile
> >
> > doc2:
> > 1002
> > Doc 2
> > I cant leave without cell phone
> >
> > doc3:
> > 1003
> > Doc 3
> > I work with Nokia inc
> >
> > when i search for cell phone, I should get doc1 and doc2 returned but not
> > doc3.
> >
> > The search syntax is : text: "cell phone"~500
> >
> > How could i achieve this?
> > > > > > > > Best Regards, > > Dinesh Naik > > > > > > -- > -- > > Benedetti Alessandro > Visiting card : http://about.me/alessandro_benedetti > > "Tyger, tyger burning bright > In the forests of the night, > What immortal hand or eye > Could frame thy fearful symmetry?" > > William Blake - Songs of Experience -1794 England > -- Best Regards, Dinesh Naik
Synonym with Proximity search in solr 5.1.0
Hi, We have a synonym file with below content:

cell phone,nokia mobile

And we have 3 documents:

doc1:
1001
Doc 1
I like nokia mobile

doc2:
1002
Doc 2
I cant leave without cell phone

doc3:
1003
Doc 3
I work with Nokia inc

when i search for cell phone, I should get doc1 and doc2 returned but not doc3. The search syntax is : text: "cell phone"~500

How could i achieve this? Best Regards, Dinesh Naik
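As Alessandro notes in his reply, this is the classic multi-word synonym problem: at query time the tokenizer has already split "cell phone" into two tokens before the synonym filter runs, so the two-word synonym never matches. A common workaround is to expand synonyms at index time only; a sketch of such a field type follows, with assumed type and file names.

```xml
<!-- Sketch (assumed names). synonyms.txt contains: cell phone,nokia mobile -->
<fieldType name="text_syn" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With index-time expansion, doc1 also gets the tokens "cell phone" stacked at the same positions, so text:"cell phone"~500 matches doc1 and doc2 but not doc3. Re-indexing is required after the change.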
Location of config files in Zoo Keeper
Hi all, For Solr version 5.1.0, where does ZooKeeper keep all the config files? How do we access them? From the Admin console, Cloud-->Tree-->config, we are able to see them, but where does ZooKeeper store them (location)? -- Best Regards, Dinesh Naik
RE: Reading indexed data from solr 5.1.0 using admin/luke?
Hi Alessandro, Lets say I have 20M documents with 50 fields in each. I have applied text analysis like compression,ngram,synonym expansion on these fields. Checking individually field level analysis can be easily done via admin/analysis . But I need to do 50 times analysis check for these 50 fields . I wanted to know if solr provides a way to see all these analyzed fields at once (for ex. By using unique id ). Best Regards, Dinesh Naik -Original Message- From: "Alessandro Benedetti" Sent: 30-06-2015 21:43 To: "solr-user@lucene.apache.org" Subject: Re: Reading indexed data from solr 5.1.0 using admin/luke? But what do you mean with the complete document ? Is it not available anymore ? So you have lost your original document and you want to try to reconstruct from the index ? 2015-06-30 16:05 GMT+01:00 dinesh naik : > Hi Alessandro, > I am able to check the field wise analyzed results. > > I was interested in getting the complete document. > > As Erick mentioned - > Reconstructing the doc from the > postings lists isactually quite tedious. The Luke program (not request > handler) has a > function that > does this, it's not fast though, more for troubleshooting than trying to do > anything in a production environment. > > I ll try looking into the Luke program if i can get this done. > > Thanks and Best Regards, > Dinesh Naik > > On Tue, Jun 30, 2015 at 7:42 PM, Alessandro Benedetti < > benedetti.ale...@gmail.com> wrote: > > > Do you have the original document available ? Or stored in the field of > > interest ? > > Should be quite an easy test to reproduce the Analysis simply using the > > analysis tool Upaya and Erick suggested. > > Just use your real document content and you will see how it is exactly > > analysed. > > > > Cheers > > > > 2015-06-30 15:03 GMT+01:00 dinesh naik : > > > > > Hi Erick, > > > > > > I agree with you. > > > > > > But i was checking if we could get hold on the whole document (to see > > all > > > analyzed field values) . 
> > > > > > There might be chances that field value is common for multiple > documents > > . > > > In such cases it will be difficult to backtrack which document has the > > > issue . Because admin/analysis can be used to see for field level > > analysis > > > only. > > > > > > > > > > > > Best Regards, > > > Dinesh Naik > > > > > > On Tue, Jun 30, 2015 at 7:08 PM, Erick Erickson < > erickerick...@gmail.com > > > > > > wrote: > > > > > > > Dinesh: > > > > > > > > This is what the admin/analysis page is for. It shows you exactly > > > > what tokens are produced by what steps in the analysis chain. > > > > That would be far better than trying to analyze the indexed > > > > terms. > > > > > > > > Best, > > > > Erick > > > > > > > > On Tue, Jun 30, 2015 at 8:35 AM, dinesh naik < > > dineshkumarn...@gmail.com> > > > > wrote: > > > > > Hi Erick, > > > > > This is mainly for debugging purpose. If i have 20M records and few > > > > fields > > > > > in some of the documents are not indexed as expected or something > > went > > > > > wrong during indexing then how do we pin point the exact issue and > > fix > > > > the > > > > > problem? > > > > > > > > > > > > > > > Best Regards, > > > > > Dinesh Naik > > > > > > > > > > On Tue, Jun 30, 2015 at 5:56 PM, Erick Erickson < > > > erickerick...@gmail.com > > > > > > > > > > wrote: > > > > > > > > > >> In short, not unless you want to get into low-level Lucene coding. > > > > >> Inverted indexes are, well, inverted so their very structure makes > > > > >> this difficult. It looks like this: > > > > >> > > > > >> But I'm not convinced yet that this isn't an XY problem. What is > the > > > > >> high-level problem you're trying to solve here? Maybe there's > > another > > > > >> way to go about it. > > > > >> > > > > >> Best, > > > > >> Erick > > > > >> > > > > >> On Tue, Jun 30, 2015 at 3:32 AM, dinesh naik < > > > dineshkumarn...@gmail.com > > &g
Re: Reading indexed data from solr 5.1.0 using admin/luke?
Hi Alessandro, I am able to check the field wise analyzed results. I was interested in getting the complete document. As Erick mentioned - Reconstructing the doc from the postings lists isactually quite tedious. The Luke program (not request handler) has a function that does this, it's not fast though, more for troubleshooting than trying to do anything in a production environment. I ll try looking into the Luke program if i can get this done. Thanks and Best Regards, Dinesh Naik On Tue, Jun 30, 2015 at 7:42 PM, Alessandro Benedetti < benedetti.ale...@gmail.com> wrote: > Do you have the original document available ? Or stored in the field of > interest ? > Should be quite an easy test to reproduce the Analysis simply using the > analysis tool Upaya and Erick suggested. > Just use your real document content and you will see how it is exactly > analysed. > > Cheers > > 2015-06-30 15:03 GMT+01:00 dinesh naik : > > > Hi Erick, > > > > I agree with you. > > > > But i was checking if we could get hold on the whole document (to see > all > > analyzed field values) . > > > > There might be chances that field value is common for multiple documents > . > > In such cases it will be difficult to backtrack which document has the > > issue . Because admin/analysis can be used to see for field level > analysis > > only. > > > > > > > > Best Regards, > > Dinesh Naik > > > > On Tue, Jun 30, 2015 at 7:08 PM, Erick Erickson > > > wrote: > > > > > Dinesh: > > > > > > This is what the admin/analysis page is for. It shows you exactly > > > what tokens are produced by what steps in the analysis chain. > > > That would be far better than trying to analyze the indexed > > > terms. > > > > > > Best, > > > Erick > > > > > > On Tue, Jun 30, 2015 at 8:35 AM, dinesh naik < > dineshkumarn...@gmail.com> > > > wrote: > > > > Hi Erick, > > > > This is mainly for debugging purpose. 
If i have 20M records and few > > > fields > > > > in some of the documents are not indexed as expected or something > went > > > > wrong during indexing then how do we pin point the exact issue and > fix > > > the > > > > problem? > > > > > > > > > > > > Best Regards, > > > > Dinesh Naik > > > > > > > > On Tue, Jun 30, 2015 at 5:56 PM, Erick Erickson < > > erickerick...@gmail.com > > > > > > > > wrote: > > > > > > > >> In short, not unless you want to get into low-level Lucene coding. > > > >> Inverted indexes are, well, inverted so their very structure makes > > > >> this difficult. It looks like this: > > > >> > > > >> But I'm not convinced yet that this isn't an XY problem. What is the > > > >> high-level problem you're trying to solve here? Maybe there's > another > > > >> way to go about it. > > > >> > > > >> Best, > > > >> Erick > > > >> > > > >> On Tue, Jun 30, 2015 at 3:32 AM, dinesh naik < > > dineshkumarn...@gmail.com > > > > > > > >> wrote: > > > >> > Thanks Eric and Upayavira for your inputs. > > > >> > > > > >> > Is there a way i can associate this to a unique id of document, > > either > > > >> > using schema browser or TermsComponent? > > > >> > > > > >> > Best Regards, > > > >> > Dinesh Naik > > > >> > > > > >> > On Tue, Jun 30, 2015 at 2:55 AM, Upayavira > wrote: > > > >> > > > > >> >> Use the schema browser on the admin UI, and click the "load term > > > info" > > > >> >> button. It'll show you the terms in your index. > > > >> >> > > > >> >> You can also use the analysis tab which will show you how it > would > > > >> >> tokenise stuff for a specific field. > > > >> >> > > > >> >> Upayavira > > > >> >> > > > >> >> On Mon, Jun 29, 2015, at 06:53 PM, Dinesh Naik wrote: > > > >> >> > Hi Eric, > > > >> >> > By compressed value I meant value of a field after removing > > special > > > >> >> > characters . In my example its "-". Compr
Re: Reading indexed data from solr 5.1.0 using admin/luke?
Hi Erick, I agree with you. But i was checking if we could get hold on the whole document (to see all analyzed field values) . There might be chances that field value is common for multiple documents . In such cases it will be difficult to backtrack which document has the issue . Because admin/analysis can be used to see for field level analysis only. Best Regards, Dinesh Naik On Tue, Jun 30, 2015 at 7:08 PM, Erick Erickson wrote: > Dinesh: > > This is what the admin/analysis page is for. It shows you exactly > what tokens are produced by what steps in the analysis chain. > That would be far better than trying to analyze the indexed > terms. > > Best, > Erick > > On Tue, Jun 30, 2015 at 8:35 AM, dinesh naik > wrote: > > Hi Erick, > > This is mainly for debugging purpose. If i have 20M records and few > fields > > in some of the documents are not indexed as expected or something went > > wrong during indexing then how do we pin point the exact issue and fix > the > > problem? > > > > > > Best Regards, > > Dinesh Naik > > > > On Tue, Jun 30, 2015 at 5:56 PM, Erick Erickson > > > wrote: > > > >> In short, not unless you want to get into low-level Lucene coding. > >> Inverted indexes are, well, inverted so their very structure makes > >> this difficult. It looks like this: > >> > >> But I'm not convinced yet that this isn't an XY problem. What is the > >> high-level problem you're trying to solve here? Maybe there's another > >> way to go about it. > >> > >> Best, > >> Erick > >> > >> On Tue, Jun 30, 2015 at 3:32 AM, dinesh naik > > >> wrote: > >> > Thanks Eric and Upayavira for your inputs. > >> > > >> > Is there a way i can associate this to a unique id of document, either > >> > using schema browser or TermsComponent? > >> > > >> > Best Regards, > >> > Dinesh Naik > >> > > >> > On Tue, Jun 30, 2015 at 2:55 AM, Upayavira wrote: > >> > > >> >> Use the schema browser on the admin UI, and click the "load term > info" > >> >> button. 
It'll show you the terms in your index. > >> >> > >> >> You can also use the analysis tab which will show you how it would > >> >> tokenise stuff for a specific field. > >> >> > >> >> Upayavira > >> >> > >> >> On Mon, Jun 29, 2015, at 06:53 PM, Dinesh Naik wrote: > >> >> > Hi Eric, > >> >> > By compressed value I meant value of a field after removing special > >> >> > characters . In my example its "-". Compressed form of red-apple is > >> >> > redapple . > >> >> > > >> >> > I wanted to know if we can see the analyzed version of fields . > >> >> > > >> >> > For example if I use ngram on a field , how do I see the analyzed > >> values > >> >> > in index ? > >> >> > > >> >> > > >> >> > > >> >> > > >> >> > -Original Message- > >> >> > From: "Erick Erickson" > >> >> > Sent: 29-06-2015 18:12 > >> >> > To: "solr-user@lucene.apache.org" > >> >> > Subject: Re: Reading indexed data from solr 5.1.0 using admin/luke? > >> >> > > >> >> > Not quite sure what you mean by "compressed values". admin/luke > >> >> > doesn't show the results of the compression of the stored values, > >> there's > >> >> > no way I know of to do that. > >> >> > > >> >> > Best, > >> >> > Erick > >> >> > > >> >> > On Mon, Jun 29, 2015 at 8:20 AM, dinesh naik < > >> dineshkumarn...@gmail.com> > >> >> > wrote: > >> >> > > Hi all, > >> >> > > > >> >> > > Is there a way to read the indexed data for field on which the > >> >> > > analysis/processing has been done ? > >> >> > > > >> >> > > I know using admin GUI we can see field wise analysis But how > can i > >> get > >> >> > > hold on the complete document using admin/luke? or any other way? > >> >> > > > >> >> > > For example, if i have 2 fields called name and compressedname. 
> >> >> > > > >> >> > > name has values like apple, green-apple,red-apple > >> >> > > compressedname has values like apple,greenapple,redapple > >> >> > > > >> >> > > Even though i make both these field indexed=true and stored=true > >> >> > > > >> >> > > I am not able to see the compressed values using > >> >> admin/luke?id= > >> >> > > > >> >> > > in response i see something like this- > >> >> > > > >> >> > > > >> >> > > > >> >> > > string > >> >> > > ITS-- > >> >> > > ITS-- > >> >> > > GREEN-APPLE > >> >> > > GREEN-APPLE > >> >> > > 1.0 > >> >> > > 0 > >> >> > > > >> >> > > > >> >> > > string > >> >> > > ITS-- > >> >> > > ITS-- > >> >> > > GREEN-APPLE > >> >> > > GREEN-APPLE > >> >> > > 1.0 > >> >> > > 0 > >> >> > > > >> >> > > > >> >> > > > >> >> > > > >> >> > > -- > >> >> > > Best Regards, > >> >> > > Dinesh Naik > >> >> > >> > > >> > > >> > > >> > -- > >> > Best Regards, > >> > Dinesh Naik > >> > > > > > > > > -- > > Best Regards, > > Dinesh Naik > -- Best Regards, Dinesh Naik
Re: Reading indexed data from solr 5.1.0 using admin/luke?
Hi Erick, This is mainly for debugging purpose. If i have 20M records and few fields in some of the documents are not indexed as expected or something went wrong during indexing then how do we pin point the exact issue and fix the problem? Best Regards, Dinesh Naik On Tue, Jun 30, 2015 at 5:56 PM, Erick Erickson wrote: > In short, not unless you want to get into low-level Lucene coding. > Inverted indexes are, well, inverted so their very structure makes > this difficult. It looks like this: > > But I'm not convinced yet that this isn't an XY problem. What is the > high-level problem you're trying to solve here? Maybe there's another > way to go about it. > > Best, > Erick > > On Tue, Jun 30, 2015 at 3:32 AM, dinesh naik > wrote: > > Thanks Eric and Upayavira for your inputs. > > > > Is there a way i can associate this to a unique id of document, either > > using schema browser or TermsComponent? > > > > Best Regards, > > Dinesh Naik > > > > On Tue, Jun 30, 2015 at 2:55 AM, Upayavira wrote: > > > >> Use the schema browser on the admin UI, and click the "load term info" > >> button. It'll show you the terms in your index. > >> > >> You can also use the analysis tab which will show you how it would > >> tokenise stuff for a specific field. > >> > >> Upayavira > >> > >> On Mon, Jun 29, 2015, at 06:53 PM, Dinesh Naik wrote: > >> > Hi Eric, > >> > By compressed value I meant value of a field after removing special > >> > characters . In my example its "-". Compressed form of red-apple is > >> > redapple . > >> > > >> > I wanted to know if we can see the analyzed version of fields . > >> > > >> > For example if I use ngram on a field , how do I see the analyzed > values > >> > in index ? > >> > > >> > > >> > > >> > > >> > -Original Message- > >> > From: "Erick Erickson" > >> > Sent: 29-06-2015 18:12 > >> > To: "solr-user@lucene.apache.org" > >> > Subject: Re: Reading indexed data from solr 5.1.0 using admin/luke? 
> >> > > >> > Not quite sure what you mean by "compressed values". admin/luke > >> > doesn't show the results of the compression of the stored values, > there's > >> > no way I know of to do that. > >> > > >> > Best, > >> > Erick > >> > > >> > On Mon, Jun 29, 2015 at 8:20 AM, dinesh naik < > dineshkumarn...@gmail.com> > >> > wrote: > >> > > Hi all, > >> > > > >> > > Is there a way to read the indexed data for field on which the > >> > > analysis/processing has been done ? > >> > > > >> > > I know using admin GUI we can see field wise analysis But how can i > get > >> > > hold on the complete document using admin/luke? or any other way? > >> > > > >> > > For example, if i have 2 fields called name and compressedname. > >> > > > >> > > name has values like apple, green-apple,red-apple > >> > > compressedname has values like apple,greenapple,redapple > >> > > > >> > > Even though i make both these field indexed=true and stored=true > >> > > > >> > > I am not able to see the compressed values using > >> admin/luke?id= > >> > > > >> > > in response i see something like this- > >> > > > >> > > > >> > > > >> > > string > >> > > ITS-- > >> > > ITS-- > >> > > GREEN-APPLE > >> > > GREEN-APPLE > >> > > 1.0 > >> > > 0 > >> > > > >> > > > >> > > string > >> > > ITS-- > >> > > ITS-- > >> > > GREEN-APPLE > >> > > GREEN-APPLE > >> > > 1.0 > >> > > 0 > >> > > > >> > > > >> > > > >> > > > >> > > -- > >> > > Best Regards, > >> > > Dinesh Naik > >> > > > > > > > > -- > > Best Regards, > > Dinesh Naik > -- Best Regards, Dinesh Naik
Re: Reading indexed data from solr 5.1.0 using admin/luke?
Thanks Eric and Upayavira for your inputs. Is there a way i can associate this to a unique id of document, either using schema browser or TermsComponent? Best Regards, Dinesh Naik On Tue, Jun 30, 2015 at 2:55 AM, Upayavira wrote: > Use the schema browser on the admin UI, and click the "load term info" > button. It'll show you the terms in your index. > > You can also use the analysis tab which will show you how it would > tokenise stuff for a specific field. > > Upayavira > > On Mon, Jun 29, 2015, at 06:53 PM, Dinesh Naik wrote: > > Hi Eric, > > By compressed value I meant value of a field after removing special > > characters . In my example its "-". Compressed form of red-apple is > > redapple . > > > > I wanted to know if we can see the analyzed version of fields . > > > > For example if I use ngram on a field , how do I see the analyzed values > > in index ? > > > > > > > > > > -Original Message- > > From: "Erick Erickson" > > Sent: 29-06-2015 18:12 > > To: "solr-user@lucene.apache.org" > > Subject: Re: Reading indexed data from solr 5.1.0 using admin/luke? > > > > Not quite sure what you mean by "compressed values". admin/luke > > doesn't show the results of the compression of the stored values, there's > > no way I know of to do that. > > > > Best, > > Erick > > > > On Mon, Jun 29, 2015 at 8:20 AM, dinesh naik > > wrote: > > > Hi all, > > > > > > Is there a way to read the indexed data for field on which the > > > analysis/processing has been done ? > > > > > > I know using admin GUI we can see field wise analysis But how can i get > > > hold on the complete document using admin/luke? or any other way? > > > > > > For example, if i have 2 fields called name and compressedname. 
> > > > > > name has values like apple, green-apple,red-apple > > > compressedname has values like apple,greenapple,redapple > > > > > > Even though i make both these field indexed=true and stored=true > > > > > > I am not able to see the compressed values using > admin/luke?id= > > > > > > in response i see something like this- > > > > > > > > > > > > string > > > ITS-- > > > ITS-- > > > GREEN-APPLE > > > GREEN-APPLE > > > 1.0 > > > 0 > > > > > > > > > string > > > ITS-- > > > ITS-- > > > GREEN-APPLE > > > GREEN-APPLE > > > 1.0 > > > 0 > > > > > > > > > > > > > > > -- > > > Best Regards, > > > Dinesh Naik > -- Best Regards, Dinesh Naik
RE: Reading indexed data from solr 5.1.0 using admin/luke?
Hi Eric, By compressed value I meant value of a field after removing special characters . In my example its "-". Compressed form of red-apple is redapple . I wanted to know if we can see the analyzed version of fields . For example if I use ngram on a field , how do I see the analyzed values in index ? -Original Message- From: "Erick Erickson" Sent: 29-06-2015 18:12 To: "solr-user@lucene.apache.org" Subject: Re: Reading indexed data from solr 5.1.0 using admin/luke? Not quite sure what you mean by "compressed values". admin/luke doesn't show the results of the compression of the stored values, there's no way I know of to do that. Best, Erick On Mon, Jun 29, 2015 at 8:20 AM, dinesh naik wrote: > Hi all, > > Is there a way to read the indexed data for field on which the > analysis/processing has been done ? > > I know using admin GUI we can see field wise analysis But how can i get > hold on the complete document using admin/luke? or any other way? > > For example, if i have 2 fields called name and compressedname. > > name has values like apple, green-apple,red-apple > compressedname has values like apple,greenapple,redapple > > Even though i make both these field indexed=true and stored=true > > I am not able to see the compressed values using admin/luke?id= > > in response i see something like this- > > > > string > ITS-- > ITS-- > GREEN-APPLE > GREEN-APPLE > 1.0 > 0 > > > string > ITS-- > ITS-- > GREEN-APPLE > GREEN-APPLE > 1.0 > 0 > > > > > -- > Best Regards, > Dinesh Naik
Reading indexed data from solr 5.1.0 using admin/luke?
Hi all, Is there a way to read the indexed data for field on which the analysis/processing has been done ? I know using admin GUI we can see field wise analysis But how can i get hold on the complete document using admin/luke? or any other way? For example, if i have 2 fields called name and compressedname. name has values like apple, green-apple,red-apple compressedname has values like apple,greenapple,redapple Even though i make both these field indexed=true and stored=true I am not able to see the compressed values using admin/luke?id= in response i see something like this- string ITS-- ITS-- GREEN-APPLE GREEN-APPLE 1.0 0 string ITS-- ITS-- GREEN-APPLE GREEN-APPLE 1.0 0 -- Best Regards, Dinesh Naik
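As the replies in this thread explain, Luke returns stored values and per-term index statistics, not the analyzed token stream, which is why the compressed form is not visible; admin/analysis (or the standalone Luke tool) shows the analysis output instead. A sketch of the per-document Luke request by uniqueKey follows; the collection name is an assumption.

```python
# Sketch: Luke request for a single document by its uniqueKey.
# Shows stored values plus index statistics, not analyzed tokens.
from urllib.parse import urlencode

base = "http://localhost:8983/solr/collection1/admin/luke"
url = base + "?" + urlencode({"id": "ITS--", "wt": "json"})
print(url)
```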
Dynamic boosting on a document for Solr4.10.2
Hi, We are looking for an option to boost a document while indexing, based on the values of certain fields. For example: let's say we have 10 documents with fields such as name, acc no, status, age, address, etc. Now for documents with status 'Active' we want to boost by value 1000, and if status is 'Closed' we want to apply a negative boost, say -100. Also, if age is between 20-50 we want to boost by 2000, etc. Please let us know how we can achieve this. -- Best Regards, Dinesh Naik
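One way to get this effect without index-time boosts (a sketch, not from the thread): apply the conditions at query time with edismax bq clauses. Field names are taken from the mail; the weights and collection name are illustrative. Note that bq clauses are additive, and Lucene has no true negative boost, so the 'Closed' penalty is expressed by boosting every document that is not Closed.

```python
# Sketch: query-time boosts per field condition via edismax bq clauses.
from urllib.parse import urlencode

params = {
    "q": "*:*",
    "defType": "edismax",
    "bq": [
        "status:Active^1000",        # boost Active documents
        "(*:* -status:Closed)^100",  # relative penalty for Closed documents
        "age:[20 TO 50]^2000",       # boost the age range
    ],
}
url = ("http://localhost:8983/solr/collection1/select?"
       + urlencode(params, doseq=True))
print(url)
```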
RE: How to achieve lemmatization for english words in Solr 4.10.2
Hi Jack, We are looking for something like this: for example, if you search for the text 'go', we should also get other forms of this word like going, gone, goes etc. This is not being achieved via stemming. -Original Message- From: "Jack Krupansky" Sent: 18-02-2015 21:50 To: "solr-user@lucene.apache.org" Subject: Re: How to achieve lemmatization for english words in Solr 4.10.2 Please provide a few examples that illustrate your requirements. Specifically, requirements that are not met by the existing Solr stemming filters. What is your specific goal? -- Jack Krupansky On Wed, Feb 18, 2015 at 10:50 AM, dinesh naik wrote: > Hi, > Is there a way to achieve lemmatization in Solr? The stemming option is not > meeting the requirement. > > -- > Best Regards, > Dinesh Naik >
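One dictionary-based option worth trying here (a sketch; the en_GB.dic/en_GB.aff names are placeholders for Hunspell dictionary files you would need to place in the conf directory) is the Hunspell stem filter, which conflates word forms using a real dictionary rather than an algorithm, so it is closer to lemmatization than Porter-style stemmers:

```xml
<fieldType name="text_en_lemma" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- dictionary-based stemming via Hunspell dictionary/affix files -->
    <filter class="solr.HunspellStemFilterFactory"
            dictionary="en_GB.dic" affix="en_GB.aff" ignoreCase="true"/>
  </analyzer>
</fieldType>
```

Irregular forms such as go/went/gone may still not conflate; a synonym file mapping those forms explicitly is a common supplement.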
How to achieve lemmatization for english words in Solr 4.10.2
Hi, Is there a way to achieve lemmatization in Solr? The stemming option is not meeting the requirement. -- Best Regards, Dinesh Naik
Internal document format for Solr 4.10.2
Hi, Is there a way to read the internal document once Solr has done the indexing? Also, is there a possibility to store this internal document in XML format? -- Best Regards, Dinesh Naik
Better way of copying/backup of index in Solr 4.10.2
What is the best way to copy/back up the index in Solr 4.10.2? -- Best Regards, Dinesh Naik
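A common answer to this question (a sketch; host, core name, and paths are illustrative placeholders): the replication handler, if enabled in solrconfig.xml, can take an online backup of the index without stopping Solr.

```text
# Trigger a backup snapshot (requires the /replication handler to be enabled):
http://localhost:8983/solr/col1/replication?command=backup&location=/backups&name=mybackup

# Check replication/backup details:
http://localhost:8983/solr/col1/replication?command=details
```

Alternatively, if no commits are in flight, the data/index directory can simply be copied at the filesystem level.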
Possibility of Indexing without feeding again in Solr 4.10.2
Hi all, How can we do re-indexing in Solr without importing the data again? Is there a way to do re-indexing only for a few documents? -- Best Regards, Dinesh Naik
American /British Dictionary for solr-4.10.2
Hi, What dictionaries are available for Solr 4.10.2? We are looking for a dictionary to support American/British English synonyms. -- Best Regards, Dinesh Naik
American British Dictionary for Solr
Hi, We are looking for a dictionary to support American/British English synonyms. Could you please let us know what dictionaries are available? -- Best Regards, Dinesh Naik
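As far as I know Solr does not ship with an American/British dictionary; the usual approach (a sketch; the file name en_variants.txt is a placeholder) is a synonym filter fed with a spelling-variant list, which can be built from publicly available variant data such as the VarCon lists:

```xml
<!-- added to the fieldType's analyzer chain -->
<filter class="solr.SynonymFilterFactory" synonyms="en_variants.txt"
        ignoreCase="true" expand="true"/>

<!-- sample lines in en_variants.txt:
colour,color
organise,organize
analyse,analyze
-->
```

With expand="true", a query for either spelling matches documents containing the other.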
RE: How to stop Solr tokenising search terms with spaces
Hi Ahmet, We have gone for the Ngram solution. Thanks Regards, Dinesh Babu. -Original Message- From: Ahmet Arslan [mailto:iori...@yahoo.com.INVALID] Sent: 08 December 2014 15:27 To: solr-user@lucene.apache.org Subject: Re: How to stop Solr tokenising search terms with spaces Hi, May be you have omitTermFreqAndPositions=true set for your fields? Positions are necessary for phrase queries to work. Ahmet On Monday, December 8, 2014 5:20 PM, Dinesh Babu wrote: Hi Yonik, It is a text field ( all our search fields are of type text ). Very unlucky for me that it is not working. Will try the NGram solution provided by Jack. Regards, Dinesh Babu. -Original Message- From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley Sent: 08 December 2014 13:25 To: solr-user@lucene.apache.org Subject: Re: How to stop Solr tokenising search terms with spaces On Mon, Dec 8, 2014 at 2:50 AM, Dinesh Babu wrote: > I just tried your suggestion > > {!complexphrase}displayName:"RVN Viewpoint users" > > Even the above did not work. Am I missing any configuration changes for this > parser to work? What is the fieldType of displayName? The complexphrase query parser is only for "text" fields (those that that index each word as a separate term.) -Yonik http://heliosearch.org - native code faceting, facet functions, sub-facets, off-heap data > Regards, > Dinesh Babu. > > > > -Original Message- > From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik > Seeley > Sent: 07 December 2014 20:49 > To: solr-user@lucene.apache.org > Subject: Re: How to stop Solr tokenising search terms with spaces > > On Sun, Dec 7, 2014 at 3:18 PM, Dinesh Babu wrote: >> Thanks Yonik. This does not seem to work for me. 
This is wgat I did >> >> 1) q=displayName:rvn* brings me two records (a) "RVN Viewpoint Users" and >> (b) "RVN Project Admins" >> >> 2) {!complexphrase}"RVN*" --> Unknown query type >> \"org.apache.lucene.search.PrefixQuery\" found in phrase query string >> \"RVN*\"" > > Looks like you found a bug in this part... a prefix query being quoted when > it doesn't need to be. > >> 3) {!complexphrase}"RVN V*" -- Does not bring any result back. > > This type of query should work (and does for me). Is it because the default > search field does not have these terms, and you didn't specify a different > field to search? > Try this: > {!complexphrase}displayName:"RVN V*" > > -Yonik > http://heliosearch.org - native code faceting, facet functions, > sub-facets, off-heap data > > >
RE: How to stop Solr tokenising search terms with spaces
But my requirement is for A* B* to be A* B*. A* OR B* won't meet my requirement. We have chosen the NGram solution and it is working for our requirement at the moment. Thanks for your input and help, Yonik. Regards, Dinesh Babu. -Original Message- From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley Sent: 08 December 2014 17:58 To: solr-user@lucene.apache.org Subject: Re: How to stop Solr tokenising search terms with spaces On Mon, Dec 8, 2014 at 12:01 PM, Erik Hatcher wrote: > debug output tells a lot. Looks like in the last two examples that the > second part (Viewpoint*) is NOT parsed with the complex phrase parser - the > whitespace thwarts it. Actually, it looks like it is, but you're not telling the complex phrase parser to put the two clauses in a phrase. You need the quotes. Even for the complexphrase parser, A* B* is the same as A* OR B* -Yonik http://heliosearch.org - native code faceting, facet functions, sub-facets, off-heap data
RE: How to stop Solr tokenising search terms with spaces
Thanks Erik Regards, Dinesh Babu. -Original Message- From: Erik Hatcher [mailto:erik.hatc...@gmail.com] Sent: 08 December 2014 17:02 To: solr-user@lucene.apache.org Subject: Re: How to stop Solr tokenising search terms with spaces debug output tells a lot. Looks like in the last two examples that the second part (Viewpoint*) is NOT parsed with the complex phrase parser - the whitespace thwarts it. I’d recommend doing something like this to test that parser out to avoid the “meta” parsing issue. q={!complexphrase v=$qq}&qq= Erik > On Dec 8, 2014, at 11:14 AM, Dinesh Babu wrote: > > Hi Erik, > > 1. With search phrase in quotes > {!complexphrase}displayName:"RVN Viewpoint*" > > "debug": { >"rawquerystring": "{!complexphrase}displayName:\"RVN Viewpoint*\"", >"querystring": "{!complexphrase}displayName:\"RVN Viewpoint*\"", >"parsedquery": "ComplexPhraseQuery(\"RVN Viewpoint*\")", >"parsedquery_toString": "\"RVN Viewpoint*\"", >"QParser": "ComplexPhraseQParser" > } > > 2. Tried with search phrase not in quotes: This brings result back but only > those starting with "viewpoint" and does not bring "rvn viewpoint" > > {!complexphrase}displayName:RVN Viewpoint* > > "debug": { >"rawquerystring": "{!complexphrase}displayName:RVN Viewpoint*", >"querystring": "{!complexphrase}displayName:RVN Viewpoint*", >"parsedquery": "displayName:rvn displayName:viewpoint*", >"parsedquery_toString": "displayName:rvn displayName:viewpoint*", >"QParser": "ComplexPhraseQParser" > } > > 3. {!complexphrase}displayName:RVN* Viewpoint* > > "debug": { >"rawquerystring": "{!complexphrase}displayName:RVN* Viewpoint*", >"querystring": "{!complexphrase}displayName:RVN* Viewpoint*", >"parsedquery": "displayName:rvn* displayName:viewpoint*", >"parsedquery_toString": "displayName:rvn* displayName:viewpoint*", >"QParser": "ComplexPhraseQParser" > } > > Regards, > Dinesh Babu. 
> > -Original Message- > From: Erik Hatcher [mailto:erik.hatc...@gmail.com] > Sent: 08 December 2014 09:40 > To: solr-user@lucene.apache.org > Subject: Re: How to stop Solr tokenising search terms with spaces > > What's the parsed query? &debug=true > > >> On Dec 8, 2014, at 02:50, Dinesh Babu wrote: >> >> I just tried your suggestion >> >> {!complexphrase}displayName:"RVN Viewpoint users" >> >> Even the above did not work. Am I missing any configuration changes for this >> parser to work? >> >> Regards, >> Dinesh Babu. >> >> >> >> -Original Message- >> From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley >> Sent: 07 December 2014 20:49 >> To: solr-user@lucene.apache.org >> Subject: Re: How to stop Solr tokenising search terms with spaces >> >>> On Sun, Dec 7, 2014 at 3:18 PM, Dinesh Babu wrote: >>> Thanks Yonik. This does not seem to work for me. This is wgat I did >>> >>> 1) q=displayName:rvn* brings me two records (a) "RVN Viewpoint Users" and >>> (b) "RVN Project Admins" >>> >>> 2) {!complexphrase}"RVN*" --> Unknown query type >>> \"org.apache.lucene.search.PrefixQuery\" found in phrase query string >>> \"RVN*\"" >> >> Looks like you found a bug in this part... a prefix query being quoted when >> it doesn't need to be. >> >>> 3) {!complexphrase}"RVN V*" -- Does not bring any result back. >> >> This type of query should work (and does for me). Is it because the default >> search field does not have these terms, and you didn't specify a different >> field to search? >> Try this: >> {!complexphrase}displayName:"RVN V*" >> >> -Yonik >> http://heliosearch.org - native code faceting, facet functions, sub-facets, >> off-heap data >> >> >> > > >
RE: How to stop Solr tokenising search terms with spaces
Hi Erik, 1. With search phrase in quotes {!complexphrase}displayName:"RVN Viewpoint*" "debug": { "rawquerystring": "{!complexphrase}displayName:\"RVN Viewpoint*\"", "querystring": "{!complexphrase}displayName:\"RVN Viewpoint*\"", "parsedquery": "ComplexPhraseQuery(\"RVN Viewpoint*\")", "parsedquery_toString": "\"RVN Viewpoint*\"", "QParser": "ComplexPhraseQParser" } 2. Tried with search phrase not in quotes: This brings result back but only those starting with "viewpoint" and does not bring "rvn viewpoint" {!complexphrase}displayName:RVN Viewpoint* "debug": { "rawquerystring": "{!complexphrase}displayName:RVN Viewpoint*", "querystring": "{!complexphrase}displayName:RVN Viewpoint*", "parsedquery": "displayName:rvn displayName:viewpoint*", "parsedquery_toString": "displayName:rvn displayName:viewpoint*", "QParser": "ComplexPhraseQParser" } 3. {!complexphrase}displayName:RVN* Viewpoint* "debug": { "rawquerystring": "{!complexphrase}displayName:RVN* Viewpoint*", "querystring": "{!complexphrase}displayName:RVN* Viewpoint*", "parsedquery": "displayName:rvn* displayName:viewpoint*", "parsedquery_toString": "displayName:rvn* displayName:viewpoint*", "QParser": "ComplexPhraseQParser" } Regards, Dinesh Babu. -Original Message- From: Erik Hatcher [mailto:erik.hatc...@gmail.com] Sent: 08 December 2014 09:40 To: solr-user@lucene.apache.org Subject: Re: How to stop Solr tokenising search terms with spaces What's the parsed query? &debug=true > On Dec 8, 2014, at 02:50, Dinesh Babu wrote: > > I just tried your suggestion > > {!complexphrase}displayName:"RVN Viewpoint users" > > Even the above did not work. Am I missing any configuration changes for this > parser to work? > > Regards, > Dinesh Babu. 
> > > > -Original Message- > From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley > Sent: 07 December 2014 20:49 > To: solr-user@lucene.apache.org > Subject: Re: How to stop Solr tokenising search terms with spaces > >> On Sun, Dec 7, 2014 at 3:18 PM, Dinesh Babu wrote: >> Thanks Yonik. This does not seem to work for me. This is wgat I did >> >> 1) q=displayName:rvn* brings me two records (a) "RVN Viewpoint Users" and >> (b) "RVN Project Admins" >> >> 2) {!complexphrase}"RVN*" --> Unknown query type >> \"org.apache.lucene.search.PrefixQuery\" found in phrase query string >> \"RVN*\"" > > Looks like you found a bug in this part... a prefix query being quoted when > it doesn't need to be. > >> 3) {!complexphrase}"RVN V*" -- Does not bring any result back. > > This type of query should work (and does for me). Is it because the default > search field does not have these terms, and you didn't specify a different > field to search? > Try this: > {!complexphrase}displayName:"RVN V*" > > -Yonik > http://heliosearch.org - native code faceting, facet functions, sub-facets, > off-heap data > > >
RE: How to stop Solr tokenising search terms with spaces
Hi Yonik, It is a text field ( all our search fields are of type text ). Very unlucky for me that it is not working. Will try the NGram solution provided by Jack. Regards, Dinesh Babu. -Original Message- From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley Sent: 08 December 2014 13:25 To: solr-user@lucene.apache.org Subject: Re: How to stop Solr tokenising search terms with spaces On Mon, Dec 8, 2014 at 2:50 AM, Dinesh Babu wrote: > I just tried your suggestion > > {!complexphrase}displayName:"RVN Viewpoint users" > > Even the above did not work. Am I missing any configuration changes for this > parser to work? What is the fieldType of displayName? The complexphrase query parser is only for "text" fields (those that that index each word as a separate term.) -Yonik http://heliosearch.org - native code faceting, facet functions, sub-facets, off-heap data > Regards, > Dinesh Babu. > > > > -Original Message- > From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik > Seeley > Sent: 07 December 2014 20:49 > To: solr-user@lucene.apache.org > Subject: Re: How to stop Solr tokenising search terms with spaces > > On Sun, Dec 7, 2014 at 3:18 PM, Dinesh Babu wrote: >> Thanks Yonik. This does not seem to work for me. This is wgat I did >> >> 1) q=displayName:rvn* brings me two records (a) "RVN Viewpoint Users" and >> (b) "RVN Project Admins" >> >> 2) {!complexphrase}"RVN*" --> Unknown query type >> \"org.apache.lucene.search.PrefixQuery\" found in phrase query string >> \"RVN*\"" > > Looks like you found a bug in this part... a prefix query being quoted when > it doesn't need to be. > >> 3) {!complexphrase}"RVN V*" -- Does not bring any result back. > > This type of query should work (and does for me). Is it because the default > search field does not have these terms, and you didn't specify a different > field to search? 
> Try this: > {!complexphrase}displayName:"RVN V*" > > -Yonik > http://heliosearch.org - native code faceting, facet functions, > sub-facets, off-heap data > > >
RE: How to stop Solr tokenising search terms with spaces
Thanks a lot Jack. Will try this Solution. Regards, Dinesh Babu. -Original Message- From: Jack Krupansky [mailto:j...@basetechnology.com] Sent: 07 December 2014 20:38 To: solr-user@lucene.apache.org Subject: Re: How to stop Solr tokenising search terms with spaces Thanks for the clarification. You may be able to get by using an ngram filter at index time - but not at query time. Then "Tom" would be indexed at position 0 as "to", "om", and "tom", and "Hanks" would be indexed at position 1 as "ha", "an", "nk", "ks", "han", "ank", "nks", "hank", "anks", and "hanks", permitting all of your queries, as unquoted terms or quoted simple phrases, such as "to ank". Use the standard tokenizer combined with the NGramFilterFactory and lower case filter, but only use the ngram filter at index time. See: http://lucene.apache.org/core/4_10_2/analyzers-common/org/apache/lucene/analysis/ngram/NGramFilterFactory.html But be aware that use of the ngram filter dramatically increases the index size, so don't use it on large text fields, just short text fields like names. -- Jack Krupansky -Original Message- From: Dinesh Babu Sent: Sunday, December 7, 2014 2:58 PM To: solr-user@lucene.apache.org Subject: RE: How to stop Solr tokenising search terms with spaces Hi Alex, My requirement is that I should be able to search for a person , for example Tom Hanks, by either 1) the whole of first name (Tom) 2) or partial first name with prefix (To ) 3) or partial first name without prefix ( om) 4) or the whole of surname ( Hanks) 5) or partial surname with prefix (Han) 6) or partial surname without prefix (ank) 7) or the whole name (Tom Hanks) 8) or partial first name with or without prefix and partial surname with or without prefix ( To Han , om ank) 9) All of the above as case insensitive search Thanks in advance for your help Regards, Dinesh Babu. 
-Original Message- From: Alexandre Rafalovitch [mailto:arafa...@gmail.com] Sent: 07 December 2014 01:20 To: solr-user Subject: Re: How to stop Solr tokenising search terms with spaces There is no spoon. And, there is no "phrase search". Certainly nothing that is one approach that fits all. What is actually happening is that you seem to want both phrase and prefix search. In your original question you did not explain the second part. So, you were given a solution for the first one. To get the second part, you now need to to put some sort of NGram into the index-type analyzer chain. But the problem is, you need to be very clear on what you want there. Do you want: 1) Major Hanks 2) Major Ha 3) Hanks Ma (swapped) 4) Hanks random text Major (swapped and apart) 4) Ha Ma (prefix on both words) 5) ha ma (lower case searches too) Or only some of those? Each of these things have implications and trade-offs. Once you know what you want to find, we can help you get there. Regards, Alex. P.s. If you are not sure what I am talking about with the analyzer chain, may I recommend my own book: http://www.amazon.ca/Instant-Apache-Solr-Indexing-Data-ebook/dp/B00D85K9XC It seems to be on sale right now. Personal: http://www.outerthoughts.com/ and @arafalov Solr resources and newsletter: http://www.solr-start.com/ and @solrstart Solr popularizers community: https://www.linkedin.com/groups?gid=6713853 On 6 December 2014 at 19:17, Dinesh Babu wrote: > > Just curious, why solr does not provide a simple mechanism to do a > phrase search ? It is a very common use case and it is very surprising > that there is no straight forward, at least I have not found one after > so much research, way to do it in Solr. > > Regards, > Dinesh > > > -Original Message- > From: Dinesh Babu [mailto:dinesh.b...@pb.com] > Sent: 05 December 2014 17:29 > To: solr-user@lucene.apache.org > Subject: RE: How to stop Solr tokenising search terms with spaces > > Hi Erik, > > Probably I celebrated too soon. 
When I tested {!field} it seemed to > work as the query was on such a data that it made to look like it is > working. using the example that I originally mentioned to search for > Tom Hanks Major > > 1) If I search {!field f=displayName}: Hanks Major, it works > > 2) If I provide partial word {!field f=displayName}: Hanks Ma, it > does not work > > Is this how {!field is designed to work? > > Also I tried without and with escaping space as you suggested. It has > the same issue > > 1) q= field1:"Hanks Major" , it works > 2) q= field1:"Hanks Maj" , does not works > > Regards, > Dinesh Babu. > > > > -Original Message- > From: Erik Hatcher [mailto:erik.hatc...@gmail.com] > Sent: 05 December 2014 16:44 > To: solr-user@lucene.apa
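Putting Jack's ngram suggestion into schema form (a sketch; the fieldType name and maxGramSize are illustrative, and minGramSize=2 matches the "to"/"om" examples above): ngrams are applied at index time only, while queries stay as plain lowercased tokens so that "to han" matches the indexed grams.

```xml
<fieldType name="text_ngram" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- "tom" is indexed as to, om, tom; "hanks" as ha, an, ..., hanks -->
    <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="15"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

As Jack notes, this grows the index considerably, so it is best restricted to short fields like names.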
RE: How to stop Solr tokenising search terms with spaces
I just tried your suggestion {!complexphrase}displayName:"RVN Viewpoint users" Even the above did not work. Am I missing any configuration changes for this parser to work? Regards, Dinesh Babu. -Original Message- From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley Sent: 07 December 2014 20:49 To: solr-user@lucene.apache.org Subject: Re: How to stop Solr tokenising search terms with spaces On Sun, Dec 7, 2014 at 3:18 PM, Dinesh Babu wrote: > Thanks Yonik. This does not seem to work for me. This is wgat I did > > 1) q=displayName:rvn* brings me two records (a) "RVN Viewpoint Users" and (b) > "RVN Project Admins" > > 2) {!complexphrase}"RVN*" --> Unknown query type > \"org.apache.lucene.search.PrefixQuery\" found in phrase query string > \"RVN*\"" Looks like you found a bug in this part... a prefix query being quoted when it doesn't need to be. > 3) {!complexphrase}"RVN V*" -- Does not bring any result back. This type of query should work (and does for me). Is it because the default search field does not have these terms, and you didn't specify a different field to search? Try this: {!complexphrase}displayName:"RVN V*" -Yonik http://heliosearch.org - native code faceting, facet functions, sub-facets, off-heap data
RE: How to stop Solr tokenising search terms with spaces
Thanks Yonik. This does not seem to work for me. This is what I did: 1) q=displayName:rvn* brings me two records (a) "RVN Viewpoint Users" and (b) "RVN Project Admins" 2) {!complexphrase}"RVN*" --> Unknown query type \"org.apache.lucene.search.PrefixQuery\" found in phrase query string \"RVN*\"" 3) {!complexphrase}"RVN V*" -- Does not bring any result back. 4) {!complexphrase}"RVN Viewpoint*" -- Does not bring any result back. Do I need to make any configuration changes to get this working? Regards, Dinesh Babu. -Original Message- From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley Sent: 07 December 2014 03:30 To: solr-user@lucene.apache.org Subject: Re: How to stop Solr tokenising search terms with spaces On Sat, Dec 6, 2014 at 7:17 PM, Dinesh Babu wrote: > Just curious, why solr does not provide a simple mechanism to do a phrase > search ? Simple phrase queries: q= field1:"Hanks Major" Phrase queries with wildcards / partial matches are a different story... they are "complex": q={!complexphrase}"hanks ma*" See more examples here: http://heliosearch.org/solr-4-8-features/ -Yonik http://heliosearch.org - native code faceting, facet functions, sub-facets, off-heap data
RE: How to stop Solr tokenising search terms with spaces
Hi Jack, Reproducing the email that specifies my requirement. My requirement is that I should be able to search for a person , for example Tom Hanks, by either 1) the whole of first name (Tom) 2) or partial first name with prefix (To ) 3) or partial first name without prefix ( om) 4) or the whole of surname ( Hanks) 5) or partial surname with prefix (Han) 6) or partial surname without prefix (ank) 7) or the whole name (Tom Hanks) 8) or partial first name with or without prefix and partial surname with or without prefix ( To Han , om ank) 9) All of the above as case insensitive search Thanks in advance for your help Regards, Dinesh Babu Regards, Dinesh Babu. -Original Message- From: Jack Krupansky [mailto:j...@basetechnology.com] Sent: 07 December 2014 02:04 To: solr-user@lucene.apache.org Subject: Re: How to stop Solr tokenising search terms with spaces AFAIK, partial word matching is not a common use case. Could you provide a citation to shows otherwise? Solr does provide a "simple mechanism" for "phrase search" - just place your phrase in quotes. If you wish to do something more complex, then of course the solution may be more complex. The starting point would be for you to provide a more complete description of your use case, which is clearly not "simple phrase search". Your most recent messages suggested that you want to match on partial words, but... you need to be more specific - don't make us try to guess your requirements. Feeding us partial requirements, one partial requirement at a time is not particularly effective. Finally, are you really trying to match names within arbitrary text, or do you have a field that simply contains a complete name? Again, this comes back to providing us with more specific requirements. My guess, from your mention of LDAP, is that the field would contain only a name, but... that's me guessing when you need to be specific. 
Once this distinction is cleared up, we can then focus on solutions that work either for arbitrary text or single value fields. -- Jack Krupansky -Original Message- From: Dinesh Babu Sent: Saturday, December 6, 2014 7:17 PM To: solr-user@lucene.apache.org Subject: RE: How to stop Solr tokenising search terms with spaces Just curious, why solr does not provide a simple mechanism to do a phrase search ? It is a very common use case and it is very surprising that there is no straight forward, at least I have not found one after so much research, way to do it in Solr. Regards, Dinesh -Original Message- From: Dinesh Babu [mailto:dinesh.b...@pb.com] Sent: 05 December 2014 17:29 To: solr-user@lucene.apache.org Subject: RE: How to stop Solr tokenising search terms with spaces Hi Erik, Probably I celebrated too soon. When I tested {!field} it seemed to work as the query was on such a data that it made to look like it is working. using the example that I originally mentioned to search for Tom Hanks Major 1) If I search {!field f=displayName}: Hanks Major, it works 2) If I provide partial word {!field f=displayName}: Hanks Ma, it does not work Is this how {!field is designed to work? Also I tried without and with escaping space as you suggested. It has the same issue 1) q= field1:"Hanks Major" , it works 2) q= field1:"Hanks Maj" , does not works Regards, Dinesh Babu. -Original Message- From: Erik Hatcher [mailto:erik.hatc...@gmail.com] Sent: 05 December 2014 16:44 To: solr-user@lucene.apache.org Subject: Re: How to stop Solr tokenising search terms with spaces But also, to spell out the more typical way to do that: q=field1:”…” OR field2:”…” The nice thing about {!field} is that the value doesn’t have to have quotes and deal with escaping issues, but if you just want phrase queries and quote/escaping isn’t a hassle maybe that’s cleaner for you. 
Erik > On Dec 5, 2014, at 11:30 AM, Dinesh Babu wrote: > > One more quick question Erik, > > If I want to do search on multiple fields using {!field} do we have a > query similar to what {!prefix} has > : q={!prefix f=field1 v=$f1_val} OR {!prefix f=field2 v=$f2_val} where > &f1_val=&f2_val= > > Regards, > Dinesh Babu. > > > > -Original Message- > From: Dinesh Babu > Sent: 05 December 2014 16:26 > To: solr-user@lucene.apache.org > Subject: RE: How to stop Solr tokenising search terms with spaces > > Thanks a lot Erik. {!field} seems to solve our issue. Much appreciate your > help > > > Regards, > Dinesh Babu. > > > > -Original Message- > From: Erik Hatcher [mailto:erik.hatc...@gmail.com] > Sent: 05 December 2014 16:00 > To: solr-user@lucene.apache.org > Subject: Re: How to stop Solr tokenising search terms with spaces > > try using {!field} instead of {!prefix}. {!field} will create a phrase > query (or term query if it’s j
RE: How to stop Solr tokenising search terms with spaces
Hi Alex, My requirement is that I should be able to search for a person , for example Tom Hanks, by either 1) the whole of first name (Tom) 2) or partial first name with prefix (To ) 3) or partial first name without prefix ( om) 4) or the whole of surname ( Hanks) 5) or partial surname with prefix (Han) 6) or partial surname without prefix (ank) 7) or the whole name (Tom Hanks) 8) or partial first name with or without prefix and partial surname with or without prefix ( To Han , om ank) 9) All of the above as case insensitive search Thanks in advance for your help Regards, Dinesh Babu. -Original Message- From: Alexandre Rafalovitch [mailto:arafa...@gmail.com] Sent: 07 December 2014 01:20 To: solr-user Subject: Re: How to stop Solr tokenising search terms with spaces There is no spoon. And, there is no "phrase search". Certainly nothing that is one approach that fits all. What is actually happening is that you seem to want both phrase and prefix search. In your original question you did not explain the second part. So, you were given a solution for the first one. To get the second part, you now need to to put some sort of NGram into the index-type analyzer chain. But the problem is, you need to be very clear on what you want there. Do you want: 1) Major Hanks 2) Major Ha 3) Hanks Ma (swapped) 4) Hanks random text Major (swapped and apart) 4) Ha Ma (prefix on both words) 5) ha ma (lower case searches too) Or only some of those? Each of these things have implications and trade-offs. Once you know what you want to find, we can help you get there. Regards, Alex. P.s. If you are not sure what I am talking about with the analyzer chain, may I recommend my own book: http://www.amazon.ca/Instant-Apache-Solr-Indexing-Data-ebook/dp/B00D85K9XC It seems to be on sale right now. 
Personal: http://www.outerthoughts.com/ and @arafalov Solr resources and newsletter: http://www.solr-start.com/ and @solrstart Solr popularizers community: https://www.linkedin.com/groups?gid=6713853 On 6 December 2014 at 19:17, Dinesh Babu wrote: > > Just curious, why solr does not provide a simple mechanism to do a phrase > search ? It is a very common use case and it is very surprising that there is > no straight forward, at least I have not found one after so much research, > way to do it in Solr. > > Regards, > Dinesh > > > -Original Message- > From: Dinesh Babu [mailto:dinesh.b...@pb.com] > Sent: 05 December 2014 17:29 > To: solr-user@lucene.apache.org > Subject: RE: How to stop Solr tokenising search terms with spaces > > Hi Erik, > > Probably I celebrated too soon. When I tested {!field} it seemed to > work as the query was on such a data that it made to look like it is > working. using the example that I originally mentioned to search for > Tom Hanks Major > > 1) If I search {!field f=displayName}: Hanks Major, it works > > 2) If I provide partial word {!field f=displayName}: Hanks Ma, it > does not work > > Is this how {!field is designed to work? > > Also I tried without and with escaping space as you suggested. It has > the same issue > > 1) q= field1:"Hanks Major" , it works > 2) q= field1:"Hanks Maj" , does not works > > Regards, > Dinesh Babu. > > > > -Original Message- > From: Erik Hatcher [mailto:erik.hatc...@gmail.com] > Sent: 05 December 2014 16:44 > To: solr-user@lucene.apache.org > Subject: Re: How to stop Solr tokenising search terms with spaces > > But also, to spell out the more typical way to do that: > >q=field1:”…” OR field2:”…” > > The nice thing about {!field} is that the value doesn’t have to have quotes > and deal with escaping issues, but if you just want phrase queries and > quote/escaping isn’t a hassle maybe that’s cleaner for you. 
> > Erik > > >> On Dec 5, 2014, at 11:30 AM, Dinesh Babu wrote: >> >> One more quick question Erik, >> >> If I want to do search on multiple fields using {!field} do we have a >> query similar to what {!prefix} has >> : q={!prefix f=field1 v=$f1_val} OR {!prefix f=field2 v=$f2_val} >> where &f1_val=&f2_val= >> >> Regards, >> Dinesh Babu. >> >> >> >> -Original Message- >> From: Dinesh Babu >> Sent: 05 December 2014 16:26 >> To: solr-user@lucene.apache.org >> Subject: RE: How to stop Solr tokenising search terms with spaces >> >> Thanks a lot Erik. {!field} seems to solve our issue. Much appreciate >> your help >> >> >> Regards, >> Dinesh Babu. >> >> >> >> -Original Message- >> From: Erik Hatcher [mailto:erik.hatc...@gmail.com] >> Sent: 05 December 2014 16:00 >> To: solr-user@lucene.
RE: How to stop Solr tokenising search terms with spaces
Just curious: why doesn't Solr provide a simple mechanism for phrase search? It is a very common use case, and it is very surprising that there is no straightforward way to do it in Solr — at least, I have not found one after so much research. Regards, Dinesh

-Original Message- From: Dinesh Babu [mailto:dinesh.b...@pb.com] Sent: 05 December 2014 17:29 To: solr-user@lucene.apache.org Subject: RE: How to stop Solr tokenising search terms with spaces

Hi Erik, Probably I celebrated too soon. When I tested {!field} it seemed to work, but only because the data I queried on made it look like it was working. Using the example I originally mentioned, searching for Tom Hanks Major: 1) If I search {!field f=displayName}: Hanks Major, it works. 2) If I provide a partial word, {!field f=displayName}: Hanks Ma, it does not work. Is this how {!field} is designed to work? I also tried with and without escaping the space, as you suggested; it has the same issue: 1) q=field1:"Hanks Major" works. 2) q=field1:"Hanks Maj" does not work. Regards, Dinesh Babu.

-Original Message- From: Erik Hatcher [mailto:erik.hatc...@gmail.com] Sent: 05 December 2014 16:44 To: solr-user@lucene.apache.org Subject: Re: How to stop Solr tokenising search terms with spaces

But also, to spell out the more typical way to do that: q=field1:"…" OR field2:"…". The nice thing about {!field} is that the value doesn't have to have quotes and deal with escaping issues, but if you just want phrase queries and quoting/escaping isn't a hassle, maybe that's cleaner for you. Erik

> On Dec 5, 2014, at 11:30 AM, Dinesh Babu wrote:
> One more quick question, Erik: if I want to search on multiple fields using {!field}, do we have a query similar to what {!prefix} has, i.e. q={!prefix f=field1 v=$f1_val} OR {!prefix f=field2 v=$f2_val} where &f1_val=&f2_val=
> Regards, Dinesh Babu.
>
> -Original Message- > From: Dinesh Babu > Sent: 05 December 2014 16:26 > To: solr-user@lucene.apache.org > Subject: RE: How to stop Solr tokenising search terms with spaces
> Thanks a lot, Erik. {!field} seems to solve our issue. Much appreciate your help.
> Regards, Dinesh Babu.
>
> -Original Message- > From: Erik Hatcher [mailto:erik.hatc...@gmail.com] > Sent: 05 December 2014 16:00 > To: solr-user@lucene.apache.org > Subject: Re: How to stop Solr tokenising search terms with spaces
> Try using {!field} instead of {!prefix}. {!field} will create a phrase query (or a term query if it's just one term) after analysis. [It could also construct other query types if the analysis overlaps tokens, but maybe that's not relevant here.]
> Also note that you can use multiple of these expressions if needed: q={!prefix f=field1 v=$f1_val} OR {!prefix f=field2 v=$f2_val} where &f1_val=&f2_val=
> Erik
>
>> On Dec 5, 2014, at 10:45 AM, Dinesh Babu wrote:
>> Hi, We are using Solr 4.10.2 to store user names from LDAP. I want Solr not to tokenise a search term that has a space in it. E.g.: if there is a user by the name Tom Hanks Major, then:
>> 1) When I query for "Tom Hanks Major", I don't want Solr to break this search phrase up and search for the individual words (i.e. Tom, Hanks, Major), but to search for the whole phrase and return the Tom Hanks Major user.
>> 2) Also, if I query for "Hanks Major", I should get the Tom Hanks Major user back.
>> We used {!prefix}, but that does not allow scenario 2. Also, {!prefix} restricts the search to one field and can't search multiple fields. Any solutions?
>> Regards, Dinesh Babu.
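The dereferenced-parameter pattern quoted above (v=$f1_val with the values passed as separate request parameters) can be sketched as plain query-string construction. This is only an illustration — the field names displayName and mail and the values are made up, and the request would still be sent to a real Solr endpoint:

```python
from urllib.parse import urlencode

# Build the request parameters for a two-field phrase search using the
# {!field} query parser with dereferenced values, mirroring the {!prefix}
# pattern from the thread. Field names and values are illustrative.
params = {
    "q": "{!field f=displayName v=$f1_val} OR {!field f=mail v=$f2_val}",
    "f1_val": "Tom Hanks Major",
    "f2_val": "tom.hanks",
}
query_string = urlencode(params)
print(query_string)
```

One nice property of this form, as noted in the thread, is that the phrase values never need quoting or escaping inside q itself.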
RE: How to stop Solr tokenising search terms with spaces
Hi Erik, Probably I celebrated too soon. When I tested {!field} it seemed to work, but only because the data I queried on made it look like it was working. Using the example I originally mentioned, searching for Tom Hanks Major: 1) If I search {!field f=displayName}: Hanks Major, it works. 2) If I provide a partial word, {!field f=displayName}: Hanks Ma, it does not work. Is this how {!field} is designed to work? I also tried with and without escaping the space, as you suggested; it has the same issue: 1) q=field1:"Hanks Major" works. 2) q=field1:"Hanks Maj" does not work. Regards, Dinesh Babu.

-Original Message- From: Erik Hatcher [mailto:erik.hatc...@gmail.com] Sent: 05 December 2014 16:44 To: solr-user@lucene.apache.org Subject: Re: How to stop Solr tokenising search terms with spaces

But also, to spell out the more typical way to do that: q=field1:"…" OR field2:"…". The nice thing about {!field} is that the value doesn't have to have quotes and deal with escaping issues, but if you just want phrase queries and quoting/escaping isn't a hassle, maybe that's cleaner for you. Erik

> On Dec 5, 2014, at 11:30 AM, Dinesh Babu wrote:
> One more quick question, Erik: if I want to search on multiple fields using {!field}, do we have a query similar to what {!prefix} has, i.e. q={!prefix f=field1 v=$f1_val} OR {!prefix f=field2 v=$f2_val} where &f1_val=&f2_val=
> Regards, Dinesh Babu.
>
> -Original Message- > From: Dinesh Babu > Sent: 05 December 2014 16:26 > To: solr-user@lucene.apache.org > Subject: RE: How to stop Solr tokenising search terms with spaces
> Thanks a lot, Erik. {!field} seems to solve our issue. Much appreciate your help.
> Regards, Dinesh Babu.
>
> -Original Message- > From: Erik Hatcher [mailto:erik.hatc...@gmail.com] > Sent: 05 December 2014 16:00 > To: solr-user@lucene.apache.org > Subject: Re: How to stop Solr tokenising search terms with spaces
> Try using {!field} instead of {!prefix}. {!field} will create a phrase query (or a term query if it's just one term) after analysis. [It could also construct other query types if the analysis overlaps tokens, but maybe that's not relevant here.]
> Also note that you can use multiple of these expressions if needed: q={!prefix f=field1 v=$f1_val} OR {!prefix f=field2 v=$f2_val} where &f1_val=&f2_val=
> Erik
>
>> On Dec 5, 2014, at 10:45 AM, Dinesh Babu wrote:
>> Hi, We are using Solr 4.10.2 to store user names from LDAP. I want Solr not to tokenise a search term that has a space in it. E.g.: if there is a user by the name Tom Hanks Major, then:
>> 1) When I query for "Tom Hanks Major", I don't want Solr to break this search phrase up and search for the individual words (i.e. Tom, Hanks, Major), but to search for the whole phrase and return the Tom Hanks Major user.
>> 2) Also, if I query for "Hanks Major", I should get the Tom Hanks Major user back.
>> We used {!prefix}, but that does not allow scenario 2. Also, {!prefix} restricts the search to one field and can't search multiple fields. Any solutions?
>> Regards, Dinesh Babu.
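The behaviour reported above ("Hanks Major" matches, "Hanks Maj" does not) is exactly what a phrase query gives: it matches a run of complete analyzed tokens, never a token prefix. A toy illustration of that distinction, using whitespace-plus-lowercase splitting as a stand-in for the field's real analyzer:

```python
def tokenize(text):
    # Whitespace + lowercase analysis, standing in for Solr's analyzer chain.
    return text.lower().split()

def phrase_match(field_value, phrase):
    # A phrase query matches only a consecutive run of whole tokens, which is
    # why "Hanks Major" matches "Tom Hanks Major" while "Hanks Maj" does not:
    # "maj" is not a complete token in the indexed value.
    doc, q = tokenize(field_value), tokenize(phrase)
    return any(doc[i:i + len(q)] == q for i in range(len(doc) - len(q) + 1))

print(phrase_match("Tom Hanks Major", "Hanks Major"))  # True
print(phrase_match("Tom Hanks Major", "Hanks Maj"))    # False
```

Getting the partial-word case to match would need a different indexing strategy (for example edge n-grams) rather than a different query parser.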
RE: How to stop Solr tokenising search terms with spaces
Thanks a lot, Erik. {!field} seems to solve our issue. Much appreciate your help.

Regards, Dinesh Babu.

-Original Message- From: Erik Hatcher [mailto:erik.hatc...@gmail.com] Sent: 05 December 2014 16:00 To: solr-user@lucene.apache.org Subject: Re: How to stop Solr tokenising search terms with spaces

Try using {!field} instead of {!prefix}. {!field} will create a phrase query (or a term query if it's just one term) after analysis. [It could also construct other query types if the analysis overlaps tokens, but maybe that's not relevant here.]

Also note that you can use multiple of these expressions if needed: q={!prefix f=field1 v=$f1_val} OR {!prefix f=field2 v=$f2_val} where &f1_val=&f2_val=

Erik

> On Dec 5, 2014, at 10:45 AM, Dinesh Babu wrote:
> Hi, We are using Solr 4.10.2 to store user names from LDAP. I want Solr not to tokenise a search term that has a space in it. E.g.: if there is a user by the name Tom Hanks Major, then:
> 1) When I query for "Tom Hanks Major", I don't want Solr to break this search phrase up and search for the individual words (i.e. Tom, Hanks, Major), but to search for the whole phrase and return the Tom Hanks Major user.
> 2) Also, if I query for "Hanks Major", I should get the Tom Hanks Major user back.
> We used {!prefix}, but that does not allow scenario 2. Also, {!prefix} restricts the search to one field and can't search multiple fields. Any solutions?
> Regards, Dinesh Babu.
RE: How to stop Solr tokenising search terms with spaces
One more quick question, Erik: if I want to search on multiple fields using {!field}, do we have a query similar to what {!prefix} has, i.e. q={!prefix f=field1 v=$f1_val} OR {!prefix f=field2 v=$f2_val} where &f1_val=&f2_val=

Regards, Dinesh Babu.

-Original Message- From: Dinesh Babu Sent: 05 December 2014 16:26 To: solr-user@lucene.apache.org Subject: RE: How to stop Solr tokenising search terms with spaces

Thanks a lot, Erik. {!field} seems to solve our issue. Much appreciate your help.

Regards, Dinesh Babu.

-Original Message- From: Erik Hatcher [mailto:erik.hatc...@gmail.com] Sent: 05 December 2014 16:00 To: solr-user@lucene.apache.org Subject: Re: How to stop Solr tokenising search terms with spaces

Try using {!field} instead of {!prefix}. {!field} will create a phrase query (or a term query if it's just one term) after analysis. [It could also construct other query types if the analysis overlaps tokens, but maybe that's not relevant here.]

Also note that you can use multiple of these expressions if needed: q={!prefix f=field1 v=$f1_val} OR {!prefix f=field2 v=$f2_val} where &f1_val=&f2_val=

Erik

> On Dec 5, 2014, at 10:45 AM, Dinesh Babu wrote:
> Hi, We are using Solr 4.10.2 to store user names from LDAP. I want Solr not to tokenise a search term that has a space in it. E.g.: if there is a user by the name Tom Hanks Major, then:
> 1) When I query for "Tom Hanks Major", I don't want Solr to break this search phrase up and search for the individual words (i.e. Tom, Hanks, Major), but to search for the whole phrase and return the Tom Hanks Major user.
> 2) Also, if I query for "Hanks Major", I should get the Tom Hanks Major user back.
> We used {!prefix}, but that does not allow scenario 2. Also, {!prefix} restricts the search to one field and can't search multiple fields. Any solutions?
> Regards, Dinesh Babu.
How to stop Solr tokenising search terms with spaces
Hi, We are using Solr 4.10.2 to store user names from LDAP. I want Solr not to tokenise a search term that has a space in it. E.g.: if there is a user by the name Tom Hanks Major, then:

1) When I query for "Tom Hanks Major", I don't want Solr to break this search phrase up and search for the individual words (i.e. Tom, Hanks, Major), but to search for the whole phrase and return the Tom Hanks Major user.

2) Also, if I query for "Hanks Major", I should get the Tom Hanks Major user back.

We used {!prefix}, but that does not allow scenario 2. Also, {!prefix} restricts the search to one field and can't search multiple fields. Any solutions?

Regards, Dinesh Babu.
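A common way to address requirement 1 is a field type whose analyzer does not split on whitespace at all, e.g. KeywordTokenizer. This is only a sketch for a Solr 4.x schema.xml — the type name string_ci is made up, and the analysis chain is an assumption about what fits this use case:

```xml
<!-- Hypothetical fieldType: KeywordTokenizer keeps the whole value as a
     single token, so "Tom Hanks Major" is never split into words.
     LowerCaseFilter makes matching case-insensitive. -->
<fieldType name="string_ci" class="solr.TextField" sortMissingLast="true">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="displayName" type="string_ci" indexed="true" stored="true"/>
```

Note the trade-off for requirement 2: with a single-token field, an inner match like "Hanks Major" would need a wildcard query (with the space escaped), whereas a tokenized field queried as a phrase handles it naturally — which is what the {!field} replies below this message arrive at.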
Re: Getting started with writing parser
No, I actually changed the directory to mine, where I stored the log files: it is /home/exam/apa..solr/example/exampledocs. I specified it in the Solr schema and created a DataImportHandler for it in try.xml; in that file I changed the file name to sample.txt. The new try.xml is http://pastebin.com/pfVVA7Hs. I changed the log to one word per line, thinking there might be an error in my regex expression. Now I'm completely stuck. - DINESHKUMAR . M I am neither especially clever nor especially gifted. I am only very, very curious. -- View this message in context: http://lucene.472066.n3.nabble.com/Getting-started-with-writing-parser-tp2278092p2327920.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: please help >>Problem with dataImportHandler
http://lucene.472066.n3.nabble.com/Getting-started-with-writing-parser-tp2278092p2327738.html this thread explains my problem - DINESHKUMAR . M I am neither especially clever nor especially gifted. I am only very, very curious. -- View this message in context: http://lucene.472066.n3.nabble.com/please-help-Problem-with-dataImportHandler-tp2318585p2327745.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Getting started with writing parser
I want to extract the month, time, DHCPMESSAGE, from_mac, gateway_ip and net_ADDR fields. - DINESHKUMAR . M I am neither especially clever nor especially gifted. I am only very, very curious. -- View this message in context: http://lucene.472066.n3.nabble.com/Getting-started-with-writing-parser-tp2278092p2327738.html Sent from the Solr - User mailing list archive at Nabble.com.
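As a starting point for extracting those six fields, here is a hedged sketch. The thread never shows the actual log layout, so the pattern below assumes a typical dhcpd syslog line and would need adjusting to the real sample log:

```python
import re

# Assumed dhcpd syslog shape: "Mon DD HH:MM:SS host dhcpd: DHCPXXX on|for IP to|from MAC via IFACE".
# Group names match the field names requested in the thread.
LOG_RE = re.compile(
    r"(?P<month>\w{3}) +\d+ +(?P<time>\d{2}:\d{2}:\d{2}) \S+ dhcpd: "
    r"(?P<DHCPMESSAGE>DHCP\w+) (?:for|on) (?P<net_ADDR>\d+\.\d+\.\d+\.\d+) "
    r"(?:from|to) (?P<from_mac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2}) "
    r"via (?P<gateway_ip>\S+)"
)

line = "Jan 25 13:56:48 server dhcpd: DHCPACK on 10.0.0.42 to 00:1a:2b:3c:4d:5e via eth0"
m = LOG_RE.match(line)
print(m.groupdict())
```

Testing the regex standalone like this, against real lines from the log, is much quicker than round-tripping through a full DIH import to find out whether the expression is wrong.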
Re: please help >>Problem with dataImportHandler
Yes; even after correcting it, it is still throwing an exception. - DINESHKUMAR . M I am neither especially clever nor especially gifted. I am only very, very curious. -- View this message in context: http://lucene.472066.n3.nabble.com/please-help-Problem-with-dataImportHandler-tp2318585p2327662.html Sent from the Solr - User mailing list archive at Nabble.com.
Indexing Failed rolled back
I did some research on the schema and the DIH config file and created my own DIH. I'm getting this error when I run it; the status response reads:

    try.xml  full-import  idle
    0:0:0.163  0  1  0  0
    2011-01-25 13:56:48  Indexing failed. Rolled back all changes.  2011-01-25 13:56:48
    This response format is experimental. It is likely to change in the future.

- DINESHKUMAR . M I am neither especially clever nor especially gifted. I am only very, very curious. -- View this message in context: http://lucene.472066.n3.nabble.com/Indexing-Failed-rolled-back-tp2327412p2327412.html Sent from the Solr - User mailing list archive at Nabble.com.
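When a DIH run rolls back, the failure text appears in the statusMessages list of the status response. A small sketch for spotting that programmatically, using a trimmed, hypothetical response shaped like the one pasted above (the real cause still has to be read from the Solr log):

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed DIH status response in the Solr 1.4 XML shape.
status_xml = """<response>
  <str name="status">idle</str>
  <lst name="statusMessages">
    <str name="Time Elapsed">0:0:0.163</str>
    <str name="">Indexing failed. Rolled back all changes.</str>
  </lst>
</response>"""

root = ET.fromstring(status_xml)
messages = [el.text for el in root.findall(".//lst[@name='statusMessages']/str")]
failed = any("Rolled back" in (msg or "") for msg in messages)
print(failed)  # True
```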
Re: Getting started with writing parser
I don't even know whether the regex expression I'm using for my log is correct. I'm very worried; I can't proceed in my project, and a third of the time is already over. Please help — this is just the first stage. After this I have to set up all the logs to be redirected to syslog, and from there I'll send them to the Solr server. Then I have to analyse all the data I obtain from DNS, DHCP, WiFi and switches, and prepare a per-user report on their actions. Please help me, because the days I have keep reducing and my project leader is questioning me a lot. - DINESHKUMAR . M I am neither especially clever nor especially gifted. I am only very, very curious. -- View this message in context: http://lucene.472066.n3.nabble.com/Getting-started-with-writing-parser-tp2278092p2326917.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: please help >>Problem with dataImportHandler
http://pastebin.com/tjCs5dHm this is the log produced by the solr server - DINESHKUMAR . M I am neither especially clever nor especially gifted. I am only very, very curious. -- View this message in context: http://lucene.472066.n3.nabble.com/please-help-Problem-with-dataImportHandler-tp2318585p2326659.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Getting started with writing parser
http://pastebin.com/CkxrEh6h this is my sample log - DINESHKUMAR . M I am neither especially clever nor especially gifted. I am only very, very curious. -- View this message in context: http://lucene.472066.n3.nabble.com/Getting-started-with-writing-parser-tp2278092p2326646.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: please help >>Problem with dataImportHandler
It's a DHCP log; I want to index it. - DINESHKUMAR . M I am neither especially clever nor especially gifted. I am only very, very curious. -- View this message in context: http://lucene.472066.n3.nabble.com/please-help-Problem-with-dataImportHandler-tp2318585p2319627.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: please help >>Problem with dataImportHandler
Actually, it's a log file; I separately created a handler for it. It's not XML. - DINESHKUMAR . M I am neither especially clever nor especially gifted. I am only very, very curious. -- View this message in context: http://lucene.472066.n3.nabble.com/please-help-Problem-with-dataImportHandler-tp2318585p2318617.html Sent from the Solr - User mailing list archive at Nabble.com.
please help >>Problem with dataImportHandler
this is the error that i'm getting.. no idea what it is:

/apache-solr-1.4.1/example/exampledocs# java -jar post.jar sample.txt
SimplePostTool: version 1.2
SimplePostTool: WARNING: Make sure your XML documents are encoded in UTF-8, other encodings are not currently supported
SimplePostTool: POSTing files to http://localhost:8983/solr/update..
SimplePostTool: POSTing file sample.txt
SimplePostTool: FATAL: Solr returned an error:
Severe errors in solr configuration. Check your log files for more detailed information on what may be wrong. If you want solr to continue after configuration errors, change abortOnConfigurationError to false. In null:
org.apache.solr.handler.dataimport.DataImportHandlerException: Exception occurred while initializing context
  at org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:190)
  at org.apache.solr.handler.dataimport.DataImporter.<init>(DataImporter.java:101)
  at org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:113)
  at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:508)
  at org.apache.solr.core.SolrCore.<init>(SolrCore.java:588)
  at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:137)
  at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:83)
  at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:99)
  at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
  at org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:594)
  at org.mortbay.jetty.servlet.Context.startContext(Context.java:139)
  at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1218)
  at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:500)
  at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:448)
  at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
  at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147)
  at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:161)
  at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
  at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147)
  at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
  at org.mortbay.jetty.han
root@karunya-desktop:/home/karunya/apache-solr-1.4.1/example/exampledocs#

- DINESHKUMAR . M I am neither especially clever nor especially gifted. I am only very, very curious. -- View this message in context: http://lucene.472066.n3.nabble.com/please-help-Problem-with-dataImportHandler-tp2318585p2318585.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Getting started with writing parser
my solrconfig.xml http://pastebin.com/XDg0L4di my schema.xml http://pastebin.com/3Vqvr3C0 my try.xml http://pastebin.com/YWsB37ZW - DINESHKUMAR . M I am neither especially clever nor especially gifted. I am only very, very curious. -- View this message in context: http://lucene.472066.n3.nabble.com/Getting-started-with-writing-parser-tp2278092p2318218.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Getting started with writing parser
I did all the configuration correctly. Previously I had missed a configuration file; after adding it, I'm getting a new error: Unknown FieldType: 'string' used in QueryElevationComponent. I found it was defined in solrconfig.xml; I didn't change any line in that file, but I don't know why I'm getting the error. - DINESHKUMAR . M I am neither especially clever nor especially gifted. I am only very, very curious. -- View this message in context: http://lucene.472066.n3.nabble.com/Getting-started-with-writing-parser-tp2278092p2317618.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Getting started with writing parser
I tried those examples. Is it compulsory to convert my file into XML, and how does Solr index CSV? Should I post the entire schema I made myself, and the text file I tried to index? - DINESHKUMAR . M I am neither especially clever nor especially gifted. I am only very, very curious. -- View this message in context: http://lucene.472066.n3.nabble.com/Getting-started-with-writing-parser-tp2278092p2317521.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Getting started with writing parser
i tried editing the schema file and indexing my own log.. the error that i got is:

root@karunya-desktop:/home/karunya/apache-solr-1.4.1/example/exampledocs# java -jar post.jar sample.txt
SimplePostTool: version 1.2
SimplePostTool: WARNING: Make sure your XML documents are encoded in UTF-8, other encodings are not currently supported
SimplePostTool: POSTing files to http://localhost:8983/solr/update..
SimplePostTool: POSTing file sample.txt
SimplePostTool: FATAL: Solr returned an error:
Severe errors in solr configuration. Check your log files for more detailed information on what may be wrong. If you want solr to continue after configuration errors, change abortOnConfigurationError to false. In null:
org.apache.solr.common.SolrException: Unknown fieldtype 'text' specified on field month
  at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:477)
  at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:95)
  at org.apache.solr.core.SolrCore.<init>(SolrCore.java:520)
  at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:137)
  at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:83)
  at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:99)
  at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
  at org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:594)
  at org.mortbay.jetty.servlet.Context.startContext(Context.java:139)
  at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1218)
  at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:500)
  at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:448)
  at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
  at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147)
  at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:161)
  at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
  at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147)
  at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
  at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:117)
  at org.mortbay.jetty.Server.doStart(Server.java:210)
  at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
  at org.mortbay.xml.XmlConfiguration.main

please help me solve this - DINESHKUMAR . M I am neither especially clever nor especially gifted. I am only very, very curious. -- View this message in context: http://lucene.472066.n3.nabble.com/Getting-started-with-writing-parser-tp2278092p2317421.html Sent from the Solr - User mailing list archive at Nabble.com.
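Both "Unknown fieldtype 'text' specified on field month" and the earlier "Unknown FieldType: 'string' used in QueryElevationComponent" mean the same thing: something references a fieldType name that the schema never declares. A minimal illustrative schema.xml fragment (the analyzer chain is an assumption, not the poster's actual config):

```xml
<!-- Every type referenced by a <field> (or by a component such as
     QueryElevationComponent, which needs "string") must be declared
     under <types> before it is used. -->
<types>
  <fieldType name="string" class="solr.StrField" sortMissingLast="true"/>
  <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>
</types>

<fields>
  <field name="month" type="text" indexed="true" stored="true"/>
</fields>
```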
Getting started with writing parser
How do I write a parser program that will convert log files into XML? -- View this message in context: http://lucene.472066.n3.nabble.com/Getting-started-with-writing-parser-tp2278092p2278092.html Sent from the Solr - User mailing list archive at Nabble.com.
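A minimal sketch of such a parser: take parsed log records (dicts of field name to value) and emit the <add><doc><field .../> XML that post.jar expects. The field names here are illustrative, and the record parsing itself (e.g. the regex step) is assumed to happen upstream:

```python
import xml.etree.ElementTree as ET

def records_to_solr_xml(records):
    # Build <add><doc><field name="...">value</field>...</doc>...</add>,
    # the update format consumed by Solr's XML update handler / post.jar.
    add = ET.Element("add")
    for rec in records:
        doc = ET.SubElement(add, "doc")
        for name, value in rec.items():
            field = ET.SubElement(doc, "field", name=name)
            field.text = value
    return ET.tostring(add, encoding="unicode")

xml_doc = records_to_solr_xml([{"id": "1", "month": "Jan", "time": "13:56:48"}])
print(xml_doc)
```

Using an XML library rather than string concatenation also takes care of escaping any &, < or > characters that appear in log values.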
Re: Regex DataImportHandler
Yes, I did; I'm trying it. I asked anyway in case there is a better solution. -- View this message in context: http://lucene.472066.n3.nabble.com/Regex-DataImportHandler-tp2240084p2240295.html Sent from the Solr - User mailing list archive at Nabble.com.
Regex DataImportHandler
Can anyone explain how to create a regex DataImportHandler? -- View this message in context: http://lucene.472066.n3.nabble.com/Regex-DataImportHandler-tp2240084p2240084.html Sent from the Solr - User mailing list archive at Nabble.com.
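As a hedged sketch (the file path, column names and regexes are placeholders, not taken from the thread): DIH can read a log file line-by-line with LineEntityProcessor, which exposes each line in a rawLine column, and RegexTransformer can then pull fields out of it — each field's regex captures one group into the named column:

```xml
<!-- Hypothetical data-config.xml for indexing a dhcpd-style log. -->
<dataConfig>
  <dataSource type="FileDataSource" encoding="UTF-8"/>
  <document>
    <entity name="dhcplog"
            processor="LineEntityProcessor"
            url="/path/to/sample.txt"
            transformer="RegexTransformer">
      <field column="month"       regex="^(\w{3}) "          sourceColName="rawLine"/>
      <field column="time"        regex="(\d\d:\d\d:\d\d)"   sourceColName="rawLine"/>
      <field column="DHCPMESSAGE" regex="dhcpd: (\w+)"       sourceColName="rawLine"/>
    </entity>
  </document>
</dataConfig>
```

Each column produced here still needs a matching <field> declaration in schema.xml, or the import will fail the way the other threads in this archive describe.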
Re: Input raw log file
I got some idea: create a DIH and work with that. Thanks everyone for the help. I hope I'll create a regex DIH; I guess that's the right approach. -- View this message in context: http://lucene.472066.n3.nabble.com/Input-raw-log-file-tp2210043p2239947.html Sent from the Solr - User mailing list archive at Nabble.com.