How to achieve exact string match query which includes spaces and quotes

2016-01-13 Thread Alok Bhandari
Hello, I am using Solr 5.2. I have a field defined with the "string" field type. It has some values in it like DOC-1 => abc ".. I am " not ? test DOC-2 => abc ".. This is the single string; I want to query all documents which exactly match this string, i.e. it should return me only DOC-1 when I

Re: realtime get requirements

2016-01-13 Thread Alessandro Benedetti
Hi Matteo, which Solr version are you using? Prior to 5.1, the suggester was built by default on startup, causing long waiting times (https://issues.apache.org/jira/browse/SOLR-6845). If you are on Solr >= 5.1, I strongly discourage the use of buildOnStartup=true if not a

Re: How to achieve exact string match query which includes spaces and quotes

2016-01-13 Thread Binoy Dalal
Just query the string field and nothing else. String fields only return on exact match. On Wed, 13 Jan 2016, 16:52 Alok Bhandari wrote: > Hello, > > I am using Solr 5.2. > > I have a field defined with the "string" field type. It has some values in it > like > >

Re: How to achieve exact string match query which includes spaces and quotes

2016-01-13 Thread Binoy Dalal
No. On Wed, 13 Jan 2016, 16:58 Alok Bhandari wrote: > Hi Binoy thanks. > > But does it matter which query-parser I use , shall I use "lucene" parser > or > "edismax" parser. > > > > -- > View this message in context: >

Re: How to achieve exact string match query which includes spaces and quotes

2016-01-13 Thread Alok Bhandari
Hi Binoy, thanks. But does it matter which query parser I use? Shall I use the "lucene" parser or the "edismax" parser?

solr BooleanClauses issue with space

2016-01-13 Thread sara hajili
Hi all, what exactly is the difference between a space and OR in a Solr query? I mean, what is the difference between q = solr OR lucene OR search and q = solr lucene search? Solr's default boolean occurrence is OR, isn't it?

Re: Kerberos ticket not renewing when storing index on Kerberized HDFS

2016-01-13 Thread Andrew Bumstead
Thanks Ishan, I've raised a JIRA for it. On 11 January 2016 at 20:17, Ishan Chattopadhyaya wrote: > Not sure how reliably renewals are taken care of in the context of > kerberized HDFS, but here's my 10-15 minute analysis. > Seems to me that the auto renewal thread

RE: Pro and cons of using Solr Cloud vs standard Master Slave Replica

2016-01-13 Thread Gian Maria Ricci - aka Alkampfer
Thanks. -- Gian Maria Ricci Cell: +39 320 0136949 -Original Message- From: Shawn Heisey [mailto:apa...@elyograg.org] Sent: Monday, 11 January 2016 18:28 To: solr-user@lucene.apache.org Subject: Re: Pro and cons of using Solr Cloud vs standard Master Slave Replica On 1/11/2016 4:28

Re: Pro and cons of using Solr Cloud vs standard Master Slave Replica

2016-01-13 Thread Shivaji Dutta
- SolrCloud uses ZooKeeper to manage HA - ZooKeeper is a standard for all HA in Apache Hadoop - You have collections which will manage your shards across nodes - SolrJ Client is now fault tolerant with CloudSolrClient This is the future direction of the product. On 1/13/16,

RE: Change leader in SolrCloud

2016-01-13 Thread Gian Maria Ricci - aka Alkampfer
Thanks. -- Gian Maria Ricci Cell: +39 320 0136949 -Original Message- From: Alessandro Benedetti [mailto:abenede...@apache.org] Sent: Tuesday, 12 January 2016 10:52 To: solr-user@lucene.apache.org Subject: Re: Change leader in SolrCloud I would like to do a special mention of the

Re: ConcurrentUpdateSolrClient vs CloudSolrClient for bulk update to SolrCloud

2016-01-13 Thread Shivaji Dutta
Erick and Shawn, thanks for the input. In the process below we are posting the documents to Solr over an HTTP connection in batches. Trying to solve the same problem but in a different way: I have used Lucene back in the day, where I would index the documents locally on the disk and run search

Re: ConcurrentUpdateSolrClient vs CloudSolrClient for bulk update to SolrCloud

2016-01-13 Thread Erick Erickson
My first thought is "yes, you're overthinking it" ;) Here's something to get you started for indexing through a Java program: https://cwiki.apache.org/confluence/display/solr/Using+SolrJ Of course you _could_ use Lucene to build your indexes and just copy them "to the right place", but there

Re: ConcurrentUpdateSolrClient vs CloudSolrClient for bulk update to SolrCloud

2016-01-13 Thread Toke Eskildsen
Shivaji Dutta wrote: > If I have a repository of millions of documents, would it not make sense > to just index them locally and then copy the index file over to Solr and > have it read from it? It is certainly possible and for some scenarios it will work well. We do it

Re: solr BooleanClauses issue with space

2016-01-13 Thread Doug Turnbull
Paste your Solr query into here http://splainer.io and it will help you debug your scoring/matching (shameless plug I made this thing) Also I suspect you may be using edismax. In which case the inclusion/exclusion of explicit ORs might turn a query from a big OR query into possibly a dismax query

Re: Pro and cons of using Solr Cloud vs standard Master Slave Replica

2016-01-13 Thread Jack Krupansky
The "Legacy Scaling and Distribution" section of the Solr Reference Guide also gives info related to so-called master-slave mode: https://cwiki.apache.org/confluence/display/solr/Legacy+Scaling+and+Distribution Also, although the old master-slave mode is still technically supported in the sense

Re: Pro and cons of using Solr Cloud vs standard Master Slave Replica

2016-01-13 Thread Bernd Fehling
SolrCloud has some disadvantages and can't beat the ease and simplicity of Master Slave Replica. So I can only encourage keeping Master Slave Replica in future versions. Bernd On 13.01.2016 at 21:57, Jack Krupansky wrote: > The "Legacy Scaling and Distribution" section of the Solr Reference

Re: solr error

2016-01-13 Thread Binoy Dalal
OK. Post the entire stack trace. That way we can get an idea of what is actually throwing this exception. On Thu, 14 Jan 2016, 12:49 Midas A wrote: > when we are using solr only > > On Thu, Jan 14, 2016 at 12:41 PM, Binoy Dalal > wrote: > > > Can

how to add new node in solr cloud cluster

2016-01-13 Thread Zap Org
In a running Solr cloud cluster, how do I add a new node without disturbing the running cluster?

Re: Error while reloading collection

2016-01-13 Thread Binoy Dalal
1) Ensure that the class file is actually present at the path you've given. 2) Post the entire stack trace of the exception. You can get that from the solr log. On Thu, 14 Jan 2016, 12:35 davidphilip cherian wrote: > You should probably ask this question here > > >

Re: Error while reloading collection

2016-01-13 Thread davidphilip cherian
You should probably ask this question here http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM4Ent/4.7.0/Cloudera-Manager-Introduction/cmi_getting_help_and_support.html On Thu, Jan 14, 2016 at 12:11 PM, vidya wrote: > Hi > I am using solrcloud on cloudera

Re: solr error

2016-01-13 Thread Midas A
when we are using solr only On Thu, Jan 14, 2016 at 12:41 PM, Binoy Dalal wrote: > Can you post the entire stack trace? > > Do you get this error at startup or while you're using solr? > > On Thu, 14 Jan 2016, 12:38 Midas A wrote: > > > we are

Error while reloading collection

2016-01-13 Thread vidya
Hi, I am using SolrCloud on a Cloudera cluster. I have created collections using the solrctl command, which is supported by the Cloudera Search tool. I included a Java class in schema.xml for creating a field type, which depends on a jar that I have included in solrconfig.xml. But when I reload that

solr error

2016-01-13 Thread Midas A
We are continuously getting the error "null:org.eclipse.jetty.io.EofException" on the slave. What could be the reason?

Re: how to add new node in solr cloud cluster

2016-01-13 Thread Zap Org
I have 2 nodes, one of which went down, and after restarting the server it shows an error in initializing solrconfig.xml. On Thu, Jan 14, 2016 at 12:45 PM, Zap Org wrote: > in running solr cloud cluster how to add new node without disturbing the > running cluster >

error in initializing solrconfig.xml

2016-01-13 Thread Zap Org
I have 2 running Solr nodes in my cluster and one node went down. I restarted the Tomcat server and it is throwing an exception while initializing solrconfig.xml and did not recognize the collection.

Re: degrades qtime in a 20million doc collection

2016-01-13 Thread Jack Krupansky
I recall a couple of previous discussions regarding some sort of filter/field cache change in Lucene where they removed what had been an optimization for Solr. -- Jack Krupansky On Wed, Jan 13, 2016 at 8:10 PM, Erick Erickson wrote: > It's quite surprising that you're

Re: solr error

2016-01-13 Thread Binoy Dalal
Can you post the entire stack trace? Do you get this error at startup or while you're using solr? On Thu, 14 Jan 2016, 12:38 Midas A wrote: > we are continuously getting the error > "null:org.eclipse.jetty.io.EofException" > on slave . > > what could be the reason ? > --

How to configure authentication in windows start script?

2016-01-13 Thread Kristine Jetzke
Hi, I am using an authentication plugin in my Solr 5.4 standalone installation running on Windows. How do I pass authentication options to the start script? In the Linux script is an option called AUTHC_CLIENT_CONFIGURER_ARG but I don't find anything similar for Windows... Thanks, tine

Re: Solr Heap memory vs. OS memory

2016-01-13 Thread Shawn Heisey
On 1/13/2016 2:25 PM, Oakley, Craig (NIH/NLM/NCBI) [C] wrote: > Followup question: > > If one has multiple instances on the same host (a host running basically > nothing except multiple instances of Solr), then the values specified as -Xmx > in the various instances should add up to 25% of the

FieldCache

2016-01-13 Thread Lewin Joy (TMS)
Hi, I have been facing a weird issue in solr. I am working on Solr 4.10.3 on Cloudera CDH 5.4.4 and am trying to group results on a multivalued field, let's say "interests". This is giving me an error message below: "error": { "msg": "can not use FieldCache on multivalued field:

Re: degrades qtime in a 20million doc collection

2016-01-13 Thread Anria B.
Hi Shawn, thanks for the quick answer. As for the q=*, we also saw similar results in our testing when doing things like q=somefield:qval&fq=otherfield:fqval, which makes a pure Lucene query. I simplified things somewhat since our results were always that as numFound got large, the query time

Error: FieldCache on multivalued field

2016-01-13 Thread Lewin Joy (TMS)
*updated subject line Hi, I have been facing a weird issue in solr. I am working on Solr 4.10.3 on Cloudera CDH 5.4.4 and am trying to group results on a multivalued field, let's say "interests". This is giving me an error message below: "error": { "msg": "can not use FieldCache on

solrcloud is killed, restart, collection doesn't work

2016-01-13 Thread 李铁峰
I'm a Solr user. I use SolrCloud 4.9.1, running in Tomcat. When Tomcat is killed, two collections have errors. The Solr admin picture is: each collection has only one shard. How can I repair it so the collections work? I do not want to restart Tomcat, and I can accept some data loss. error

Re: solr-5.3.1 admin console not show properly

2016-01-13 Thread Jan Høydahl
Which brand and version of Java have you installed? Looks like you run Solr as root? Should work, but not recommended. Try installing and running as an ordinary user. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com > On 13 Jan 2016 at 17:01, David Cao wrote

Re: How to achieve exact string match query which includes spaces and quotes

2016-01-13 Thread Scott Stults
This might be a good case for the Raw query parser (I haven't used it myself). https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-RawQueryParser k/r, Scott On Wed, Jan 13, 2016 at 12:05 PM, Erick Erickson wrote: > what _does_ matter is
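As a rough illustration of the raw query parser suggestion, the sketch below builds request parameters that use Solr's local-params syntax `{!raw f=...}`, which turns everything after the closing brace into a single un-analyzed term. The field name `stringField` and the helper name `raw_term_params` are assumptions for illustration only; the value is the DOC-1 string from the original question.

```python
from urllib.parse import urlencode, parse_qs

def raw_term_params(field: str, value: str) -> dict:
    """Build a q parameter that uses the raw query parser for one field.

    With {!raw f=<field>} the text after the closing brace is not parsed
    for operators, quotes, or wildcards, so spaces and quotes need no
    query-syntax escaping (URL encoding is still required).
    """
    return {"q": "{!raw f=" + field + "}" + value}

# Hypothetical field name; the value is taken from the thread's example.
params = raw_term_params("stringField", 'abc ".. I am " not ? test')
query_string = urlencode(params)  # URL-encodes spaces, quotes, braces
```

The resulting `query_string` can be appended to a `/select?` URL; only the HTTP-level encoding is handled here, not the request itself.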

RE: Solr Heap memory vs. OS memory

2016-01-13 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
Followup question: If one has multiple instances on the same host (a host running basically nothing except multiple instances of Solr), then the values specified as -Xmx in the various instances should add up to 25% of the RAM of the host... Is that correct? -Original Message- From:

degrades qtime in a 20million doc collection

2016-01-13 Thread Anria B.
hi all, I have a Really fun question to ask. I'm sitting here looking at what is by far the beefiest box I've ever seen in my life. 256GB of RAM, extreme TerraBytes of disc space, the works. Linux server properly partitioned Yet, what we are seeing goes against all intuition I've built up

Re: degrades qtime in a 20million doc collection

2016-01-13 Thread Shawn Heisey
On 1/13/2016 3:01 PM, Anria B. wrote: > I have a Really fun question to ask. I'm sitting here looking at what is by > far the beefiest box I've ever seen in my life. 256GB of RAM, extreme > TerraBytes of disc space, the works. Linux server properly partitioned > > Yet, what we are seeing goes

Re: degrades qtime in a 20million doc collection

2016-01-13 Thread Erick Erickson
It's quite surprising that you're getting this kind of query degradation by adding an "fq" clause unless something's really out of whack on the setup. How much memory are you giving the JVM? Are you autowarming? Are you indexing while this is going on, and if so, what are your commit parameters? If

Re: solrcloud is killed, restart, collection doesn't work

2016-01-13 Thread Erick Erickson
Please outline all the steps you've done. Did you stop tomcat then restart it? On one or more machines? You have to have all the Solrs running that you did when you created the collections. Best, Erick On Wed, Jan 13, 2016 at 1:28 PM, 李铁峰 wrote: > > i’m a solr user, i use

Re: Setting of ramBufferSizeMB

2016-01-13 Thread Erick Erickson
ramBufferSizeMB is a _limit_ that flushes the buffer when it is reached (actually, I think, it indexes a doc _then_ checks the size and if it's > the setting, flushes the buffer. So technically you can exceed the buffer size by your biggest doc's addition to the index). But I digress. This is a
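The "add the document, then check the limit" behavior described above can be pictured with a toy simulation. This is only a sketch of that description, not Lucene's actual flush logic; the function name and the document sizes are made up for illustration.

```python
def simulate_flushes(doc_sizes_mb, ram_buffer_mb):
    """Toy model: buffer each doc first, then flush if the limit is exceeded.

    Returns (list of flushed buffer sizes, size still left in the buffer).
    """
    flushed = []
    buffered = 0.0
    for size in doc_sizes_mb:
        buffered += size              # the doc is indexed into the buffer first
        if buffered > ram_buffer_mb:  # only then is the limit checked
            flushed.append(buffered)  # so a flush can overshoot the limit
            buffered = 0.0
    return flushed, buffered

# Hypothetical sizes: the third doc pushes the buffer past 320 MB.
flushed, remaining = simulate_flushes([100, 100, 150, 50], 320)
```

In this model a flush can overshoot the limit by at most the size of the document that triggered it, and whatever is left in the buffer would only be written out by a later flush or commit.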

Searching for Chinese characters is much slower

2016-01-13 Thread Zheng Lin Edwin Yeo
Hi, I'm using Solr 5.4.0, and the HMMChineseTokenizerFactory for my content indexed from rich-text documents. I found that searching for Chinese characters takes much longer than searching for English characters. Searches for English characters usually return in less than 200ms, but Chinese

Setting of ramBufferSizeMB

2016-01-13 Thread Zheng Lin Edwin Yeo
Hi, I would like to check: if I have made the following settings for ramBufferSizeMB, and I am using TieredMergePolicy, am I supposed to get each segment size of at least 320MB? 320 10 10 10240 I have this setting in my

Re: Setting of ramBufferSizeMB

2016-01-13 Thread Zheng Lin Edwin Yeo
Hi Erick, thanks for your reply. So those small segments that I found are probably due to a commit happening during that time? I also found that those small segments were created during the last indexing. If I start another batch of indexing, those small segments will probably get merged

solr-5.3.1 admin console not show properly

2016-01-13 Thread David Cao
I installed and started Solr following instructions from the Solr wiki like this ... (on a Red Hat server) cd ~/ tar zxf /tmp/solr-5.3.1.tgz cd solr-5.3.1/bin ./solr start -f Solr starts fine. But when opening the console in a browser ("http://server-ip:8983/solr/admin.html"), it shows a partially

Re: solr BooleanClauses issue with space

2016-01-13 Thread Shawn Heisey
On 1/13/2016 5:40 AM, sara hajili wrote: > what exactly is the difference between a space and OR in a Solr query? > I mean, what is the difference between > q = solr OR lucene OR search > and > q = solr lucene search? > > Solr's default boolean occurrence is OR, isn't it? This depends on what the default
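One way to avoid depending on whatever default applies is to write the operator out explicitly. The sketch below is a hypothetical helper (the name `join_terms` is not from the thread) that turns whitespace-separated terms into an explicit boolean query; it assumes plain single-word terms that need no escaping.

```python
def join_terms(terms, operator="OR"):
    """Join plain terms with an explicit boolean operator.

    Spelling out OR or AND means the query no longer relies on the
    parser's default operator setting.
    """
    if operator not in ("OR", "AND"):
        raise ValueError("operator must be 'OR' or 'AND'")
    return (" " + operator + " ").join(terms)

# The two forms from the question, made explicit.
explicit_or = join_terms("solr lucene search".split())
explicit_and = join_terms("solr lucene search".split(), operator="AND")
```

Whether the bare form `solr lucene search` behaves like `explicit_or` or like `explicit_and` is exactly the part that depends on configuration, so comparing results for all three forms is a quick check.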

Re: solr BooleanClauses issue with space

2016-01-13 Thread Emir Arnautovic
Hi Sara, You can run your query (or smaller one) with debugQuery=true and see how it is rewritten. Thanks, Emir On 13.01.2016 16:01, sara hajili wrote: tnx. and my main question is about maxBooleanClauses in solr config. it is 1024 by default. and i have a edismax query with about 500 words
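A minimal sketch of the debugQuery suggestion: the helper below only builds the request URL. The host, port, and collection name are placeholders, and `debug_url` is a hypothetical helper name; `debugQuery` and `wt` are standard Solr request parameters.

```python
from urllib.parse import urlencode

def debug_url(base_url: str, query: str) -> str:
    """Build a /select URL that asks Solr to include debug output."""
    params = {
        "q": query,
        "debugQuery": "true",  # adds the parsed/rewritten query to the response
        "wt": "json",
    }
    return base_url + "/select?" + urlencode(params)

# Placeholder host and collection name, for illustration only.
url = debug_url("http://localhost:8983/solr/mycollection", "solr lucene search")
```

Fetching that URL and looking at the debug section of the response shows how the query string was parsed, which is where a difference between spaces and explicit ORs would become visible.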

Re: collection reflection in resource manager node

2016-01-13 Thread Shawn Heisey
On 1/13/2016 12:38 AM, vidya wrote: > I have created a collection in one datanode on which solr server is deployed > say DN1. I am having another datanode on which solr server is deployed which > has resource manager service also running on it,say DN2. When i created a > collection using solrctl

Re: solr BooleanClauses issue with space

2016-01-13 Thread sara hajili
Thanks. My main question is about maxBooleanClauses in the Solr config. It is 1024 by default, and I have an edismax query with about 500 words in this form: q1= str1 OR str2 OR str3 ...OR strn. It throws an exception that it can't parse the query: too many boolean clauses. So if I changed maxBooleanClauses to 1500 it

Re: ConcurrentUpdateSolrClient vs CloudSolrClient for bulk update to SolrCloud

2016-01-13 Thread Erick Erickson
It's usually not all that difficult to write a multi-threaded client that uses CloudSolrClient, or even fire up multiple instances of the SolrJ client (assuming they can work on discrete sections of the documents you need to index). That avoids the problem Shawn alludes to. Plus other issues. If
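The idea of several workers each handling a discrete section of the documents can be sketched independently of SolrJ. The Python outline below is not SolrJ code: `send_batch` stands in for whatever call actually posts a batch to Solr and is purely hypothetical, as are the batch size and worker count.

```python
from concurrent.futures import ThreadPoolExecutor

def partition(docs, batch_size):
    """Split docs into consecutive, non-overlapping batches."""
    return [docs[i:i + batch_size] for i in range(0, len(docs), batch_size)]

def index_in_parallel(docs, send_batch, batch_size=1000, workers=4):
    """Send each batch from a pool of worker threads.

    send_batch is a caller-supplied function (hypothetical here) that
    posts one batch to Solr and returns some result for it.
    """
    batches = partition(docs, batch_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map preserves batch order in the returned results
        return list(pool.map(send_batch, batches))

# Stand-in sender that just reports how many docs it was given.
results = index_in_parallel(list(range(5)), len, batch_size=2, workers=2)
```

Because the batches do not overlap, the workers never touch the same document, which is the property the "discrete sections" remark relies on.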

Re: collection reflection in resource manager node

2016-01-13 Thread Erick Erickson
It looks like you're using Cloudera's CDH, is that true? In that case Cloudera support might be able to provide you with more info. Best, Erick On Wed, Jan 13, 2016 at 6:37 AM, Shawn Heisey wrote: > On 1/13/2016 12:38 AM, vidya wrote: >> I have created a collection in one

Re: How to achieve exact string match query which includes spaces and quotes

2016-01-13 Thread Erick Erickson
What _does_ matter is getting all that through the parser, which means you have to enclose things in quotes and escape them. For instance, consider this query stringField:abc "i am not" this will get parsed as stringField:abc defaultTextField:"i am not". To get around this you need to make sure
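The quoting and escaping step can be sketched as follows. This assumes the standard Lucene query syntax, where a value wrapped in double quotes only needs backslashes and embedded double quotes escaped; `quote_exact` and the field name `stringField` are illustrative names, not part of any Solr API.

```python
def quote_exact(field: str, value: str) -> str:
    """Wrap value in double quotes for an exact match on a string field.

    Backslashes are escaped first, then embedded double quotes, so the
    whole value reaches the field as one quoted clause instead of being
    split at the spaces.
    """
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'

# The DOC-1 value from the original question, with a hypothetical field name.
q = quote_exact("stringField", 'abc ".. I am " not ? test')
print(q)
```

The printed query still has to be URL-encoded before it is sent as the `q` parameter.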