Re: Unable to create core [collection] Caused by: null

2017-07-25 Thread Rick Leir
Lucas, What is in the log before that stackdump? The stackdump seems to indicate that Solr is trying to run with a managed schema. Looking at the cwiki, "When a <schemaFactory/> is not explicitly declared in a solrconfig.xml file, Solr implicitly uses a ManagedIndexSchemaFactory, which is by default "muta

Re: Clustering on copy fields

2017-07-25 Thread Thomas Krebs
This is understood. My question is: I have a keep words filter on field2. field2 is used for clustering. Will the cluster algorithm use „some data“ or the result of applying the keep words filter to „some data“? Cheers, Thomas > On 26.07.2017 at 01:36, Erick Erickson wrote

Re: Clustering on copy fields

2017-07-25 Thread Erick Erickson
copyFields are completely independent. The _raw_ data is passed to both. IOW, sending "some data" to the source field is equivalent to sending "some data" to both fields yourself, with no copyField. Best, Erick On Tue, Jul 25, 2017 at 11:28 AM, Thomas Krebs wrote: > I have defined a copied field on which I would like to use clustering
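
To make the point about independent fields concrete, here is a toy model in Python. It is only a sketch of the behaviour described above, not Solr code: the field names, the two analyzer functions, and the keep-words set are all hypothetical stand-ins.

```python
# Toy model only: these analyzers and field names are hypothetical, not Solr APIs.
KEEP_WORDS = {"data"}  # assumed keep-words list for the copy destination


def analyze_field1(raw):
    """Hypothetical analysis chain for the source field: lowercase and split."""
    return raw.lower().split()


def analyze_field2(raw):
    """Hypothetical chain for the destination field: same, plus a keep-words filter."""
    return [token for token in raw.lower().split() if token in KEEP_WORDS]


def index_document(raw):
    # The same raw value is handed to both fields; each field then runs
    # its own analysis chain without seeing the other field's output.
    return {
        "field1": {"stored": raw, "indexed": analyze_field1(raw)},
        "field2": {"stored": raw, "indexed": analyze_field2(raw)},
    }


doc = index_document("Some Data")
print(doc["field2"]["stored"])   # the raw value, untouched by the filter chain
print(doc["field2"]["indexed"])  # only the tokens that survive the keep-words filter
```

In this model the stored value of field2 is the full raw text while its indexed tokens are filtered, which matches the split between stored content and analysis discussed in this thread. Which of the two the clustering component reads is a separate question that the sketch does not answer.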

Unable to create core [collection] Caused by: null

2017-07-25 Thread Lucas Pelegrino
Hey guys. Trying to make Solr work here, but I'm getting this error from this command: $ ./solr create -c products -d /Users/lucaswxp/reduza-solr/products/conf/ Error CREATEing SolrCore 'products': Unable to create core [products] Caused by: null I'm posting my solrconfig.xml, schema.xml and data

RE: Optimize stalls at the same point

2017-07-25 Thread Markus Jelsma
I agree, although we do have a NewRatio of two instead of three. One of our clusters takes between 600 and 800 queries per second per replica. Lowering it by just one got us much better performance. A note: the only cache is FilterCache and it has just a few dozen entries. -Original message-

Facet pivot response with comma

2017-07-25 Thread jcleary21
Hello, I'm using pivot faceting and in the response the pivot name has a comma. I'm deserializing this JSON response to an object in .NET and I am having trouble with the comma. I'm wondering if there is a way to remove the comma from the response? I basically have a category/subcategory hierarchy

Re: Optimize stalls at the same point

2017-07-25 Thread David Hastings
Thanks a lot for the responses. After the optimize is complete and I have some time to experiment, I'll put some of these settings in place. On Tue, Jul 25, 2017 at 4:39 PM, Walter Underwood wrote: > I’ve never been fond of elaborate GC settings. I prefer to set a few > things then let it run. I

Re: Optimize stalls at the same point

2017-07-25 Thread Walter Underwood
I’ve never been fond of elaborate GC settings. I prefer to set a few things then let it run. I know someone who’s worked on garbage collectors for thirty years. I don’t second guess him. From watching GC performance under a load benchmark (CMS/ParNew) with Solr 4.x, I increased the new space.

RE: Optimize stalls at the same point

2017-07-25 Thread Markus Jelsma
Upgrade to 6.x and get, in general, decent JVM settings. And decrease your heap, having it so extremely large is detrimental at best. Our shards can be 25 GB in size, but we run fine (apart from other problems recently discovered) with a 900 MB heap, so you probably have a lot of room to spare.

Re: Optimize stalls at the same point

2017-07-25 Thread David Hastings
It turned out that, I think, it was a large GC operation, as it has since resumed optimizing. Current Java options are as follows for the indexing server (they are different for the search servers). If you have any suggestions as to changes I am more than happy to hear them, honestly they have just b

Re: Optimize stalls at the same point

2017-07-25 Thread Walter Underwood
Are you sure you need a 100GB heap? The stall could be a major GC. We run with an 8GB heap. We also run with Xmx equal to Xms, growing memory to the max was really time-consuming after startup. What version of Java? What GC options? wunder Walter Underwood wun...@wunderwood.org http://observer.

Optimize stalls at the same point

2017-07-25 Thread David Hastings
I am trying to optimize a rather large index (417 GB) because it's sitting at 28% deletions. However, when optimizing, it stops at exactly 492.24 GB every time. When I restart Solr it will fall back down to 417 GB, and again, if I send an optimize command, the exact same 492.24 GB and it stops optim

Clustering on copy fields

2017-07-25 Thread Thomas Krebs
I have defined a copied field on which I would like to use clustering. I understood that the destination field will store the full content despite the filter chain I defined. Now, I have a keep word filter defined on the copied field. If I run clustering on the copied field will it use the resu

phrase highlight, exact phrases only?

2017-07-25 Thread Michael Joyner
Hello, We are using highlighting and are looking for the exact phrase "HIV Prevention" but are receiving back highlighted snippets like the following where non-phrase matching portions are being highlighted, is there a setting to highlight the entire phrase instead of any partial token match

Re: Sum of double fields in JSON Facet

2017-07-25 Thread Amrit Sarkar
Zheng, You may want to check https://issues.apache.org/jira/browse/SOLR-7452. I don't know whether they are absolutely related, but I am sure I have seen complaints and enquiries regarding imprecise statistics with JSON Facets. Amrit Sarkar Search Engineer Lucidworks, Inc. 415-589-9269 www.lucid

Re: index version - replicable versus searching

2017-07-25 Thread Erick Erickson
Ronald: Actually, people generally don't search on master ;). The idea is that master is configured for heavy indexing and then people search on the slaves, which are configured for heavy query loads (e.g. memory, autowarming, whatever may be different). Which is its own problem since the time the

Re: Json facet sort by subfacet

2017-07-25 Thread Susheel Kumar
Not that I am aware of, to sort by numBuckets, but how much difference does it make if you sort by count? E.g. the result below is sorted by inner count, and numBuckets in this example has the same order. curl http://localhost:8983/solr/techproducts/query -d 'q=*:*&rows=0& json.facet={

Re: Need guidance solrcloud shardings with date interval

2017-07-25 Thread Erick Erickson
Slight typo: formerly called “composite ID routing” should read formerly called “implicit routing” On Tue, Jul 25, 2017 at 9:57 AM, Walter Underwood wrote: > Solr is not Oracle. Designs that might be great for Oracle can be terrible > for Solr. > > Solr really does not do this automatically, so

Re: Need guidance solrcloud shardings with date interval

2017-07-25 Thread Walter Underwood
Solr is not Oracle. Designs that might be great for Oracle can be terrible for Solr. Solr really does not do this automatically, so you won’t find that. If your job is to find that feature, you will fail. If your job is “find or write the feature”, you will be writing it. As I said before, you

RE: index version - replicable versus searching

2017-07-25 Thread Stanonik, Ronald
Bingo! Right on both counts! openSearcher was false. When I changed it to true, then I could see that master(searching) and master(replicable) both changed. And the autoCommit maxTime is causing a commit on the master. Who uses master(replicable)? It seems for my simple master/slave configurati

Re: Copy field a source of copy field

2017-07-25 Thread tstusr
Heh, I also think that! We have some serious gaps in what you explained to me. First, you pointed out that there's no real need to use ShingleFilter. I tried with every Tokenizer and the result is the same: the species are not caught. In the simplest scenario I've got this: PUT YOUR FAVORI

Re: Lucene index corruption and recovery

2017-07-25 Thread sputul
Another sanity check. With deletion, only option would be to reindex those documents. Could someone please let me know if I am missing anything or if I am on track here. Thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/Lucene-index-corruption-and-recovery-tp4347439p43

Re: SolrCloud wildcard query result order change

2017-07-25 Thread Erick Erickson
When scores are identical, the tie is broken by the _internal_ Lucene doc ID. For the same doc on two different replicas of the same shard, the internal ID is not only different, but two docs may be ordered (by internal doc ID) one way on replica1 and reversed on replica2. To guarantee identical o
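
The tie-breaking behaviour described here, and the fix suggested elsewhere in the thread (add id as a sort clause), can be sketched in Python. This is a toy model under stated assumptions: the two replica lists and their "internal" numbers are invented for illustration, and `id` is assumed to be the collection's unique key field.

```python
from urllib.parse import urlencode


def rank(docs, tie_breaker):
    """Order docs by score descending, then by the given tie-breaker ascending."""
    ordered = sorted(docs, key=lambda d: (-d["score"], d[tie_breaker]))
    return [d["id"] for d in ordered]


# Same two documents on two replicas of one shard. All scores are 1.0, and the
# (hypothetical) internal doc IDs happen to be assigned in opposite order.
replica1 = [{"id": "a", "score": 1.0, "internal": 0},
            {"id": "b", "score": 1.0, "internal": 1}]
replica2 = [{"id": "a", "score": 1.0, "internal": 1},
            {"id": "b", "score": 1.0, "internal": 0}]

# Tie broken by internal doc ID: the order depends on which replica answers.
print(rank(replica1, "internal"), rank(replica2, "internal"))

# Tie broken by the unique key: the order is the same on both replicas.
print(rank(replica1, "id"), rank(replica2, "id"))

# Query parameters with an explicit secondary sort on the assumed unique key.
params = urlencode({"q": "ant*", "wt": "json", "sort": "score desc,id asc"})
print(params)
```

The secondary sort only needs a field whose value is identical for a given document on every replica, which is why the unique key is the usual choice.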

Json facet sort by subfacet

2017-07-25 Thread David Svånå
Hi, According to http://yonik.com/solr-facet-functions/, we can sort on "any facet function that appears in each bucket": $ curl http://localhost:8983/solr/query -d 'q=*:*& json.facet={ categories:{ type : terms, field : cat, sort : "x desc", // can also use sort:{x:desc}
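
As a rough illustration of sorting buckets by a facet function, the request can also be built as a JSON body. This is a sketch, not the poster's actual request: the nested function `avg(price)` and the `price` field are assumptions, and the body is only constructed here, not sent to a server.

```python
import json


def build_facet_request(sort="x desc"):
    """Build a JSON request body with a terms facet sorted by a nested facet function."""
    return {
        "query": "*:*",
        "limit": 0,  # no documents needed, only facet buckets
        "facet": {
            "categories": {
                "type": "terms",
                "field": "cat",
                "sort": sort,  # refers to the name "x" defined in the nested facet
                "facet": {
                    "x": "avg(price)",  # assumed function and field, for illustration
                },
            }
        },
    }


body = json.dumps(build_facet_request(), indent=2)
print(body)
```

The name used in `sort` has to match a key defined in the nested `facet` block of the same bucket, which is the constraint quoted above ("any facet function that appears in each bucket").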

Nutch 2.3 with Ms-SQL?

2017-07-25 Thread d.ku...@technisat.de
Hey, I know this is not quite the right place to ask my Nutch question. But has any of you managed to use MS-SQL as the Gora backend for Nutch 2.3? As the website we are about to crawl is not that big, I would love to use MS-SQL. So far I haven't worked with Hadoop, and in our company we

Re: SolrCloud wildcard query result order change

2017-07-25 Thread Bernd Fehling
On 25.07.2017 at 15:09, Mikhail Khludnev wrote: > Since all scores are equal it just can not break this tie. Add id as a sort > clause to make results deterministic. What about setting statsCache to ExactSharedStatsCache? But there is all the same score 1.0, so maybe not a solution? Bernd > >

Re: SolrCloud wildcard query result order change

2017-07-25 Thread Mikhail Khludnev
Since all scores are equal it just can not break this tie. Add id as a sort clause to make results deterministic. On Tue, Jul 25, 2017 at 3:39 PM, Bernd Fehling < bernd.fehl...@uni-bielefeld.de> wrote: > Any wildcard query will do it, e.g. .../select?q=ant*&wt=json&... > > A couple of "shift + re

Re: SolrCloud wildcard query result order change

2017-07-25 Thread Susheel Kumar
I thought you said different results, i.e. a different count. On Tue, Jul 25, 2017 at 8:39 AM, Bernd Fehling < bernd.fehl...@uni-bielefeld.de> wrote: > Any wildcard query will do it, e.g. .../select?q=ant*&wt=json&... > > A couple of "shift + reload" (to bypass cache) in the browser and you > will s

Re: Sum of double fields in JSON Facet

2017-07-25 Thread Zheng Lin Edwin Yeo
This is the way which I put my JSON facet. totalAmount:"sum(sum(amount1_d,amount2_d))" amount1_d: 69446961.2 amount2_d: 0 Result I get: 69446959.27 Regards, Edwin On 25 July 2017 at 20:44, Zheng Lin Edwin Yeo wrote: > Hi, > > I'm trying to do a sum of two double fields in JSON Facet. One o

Sum of double fields in JSON Facet

2017-07-25 Thread Zheng Lin Edwin Yeo
Hi, I'm trying to do a sum of two double fields in JSON Facet. One of the fields has a value of 69446961.2, while the other is 0. However, when I get the result, I'm getting a value of 69446959.27. This is 1.93 less than the original value. What could be the reason? I'm using Solr 6.5.1. Regar

Re: SolrCloud wildcard query result order change

2017-07-25 Thread Bernd Fehling
Any wildcard query will do it, e.g. .../select?q=ant*&wt=json&... A couple of "shift + reload" (to bypass cache) in the browser and you will see that the order of the result changes sometimes. Definitely no updates/ingestion because it's currently a SolrCloud test system with only 12 million docs.

Re: SolrCloud wildcard query result order change

2017-07-25 Thread Susheel Kumar
What query are you executing, if you can share it? Do you think the difference could be due to updates/ingestion happening at the same time? Thanks, Susheel On Tue, Jul 25, 2017 at 7:47 AM, Bernd Fehling < bernd.fehl...@uni-bielefeld.de> wrote: > With SolrCloud 6.4.2 (5 shards on 5 server) and a wildca

SolrCloud wildcard query result order change

2017-07-25 Thread Bernd Fehling
With SolrCloud 6.4.2 (5 shards on 5 servers) and a wildcard query I get different results for the same query. I assume this is altogether due to the distributed search and the response time of each server and the constant score of 1.0? Is there any config where I can set the shard order (s

Unable to load Analysis page

2017-07-25 Thread Mishra, Kirtyanand
Hi Team, I am a budding data scientist. I am trying to implement NER in Solr using UIMA, and came across a project on the same by Tommaso. I am using Solr 3.4 and am not able to achieve any NER, but I am getting a score for the query. The analysis page is also not loading, throwing a JSP load error. Pleas

Re: FreeTextSuggester throwing error "token must not contain separator byte"

2017-07-25 Thread Angel Todorov
Hi guys, Thank you very much for the help. I think I see what is going on. Yes, it is related to the Shingle filter added to the analyzer. It shouldn't be there if a FreeTextLookup factory is used in the suggester, because it creates a conflict. The StandardTokenizer removes punctuation, including sp

Re: FreeTextSuggester throwing error "token must not contain separator byte"

2017-07-25 Thread alessandro.benedetti
I think this bit is the problem : "I am using a Shingle filter right after the StandardTokenizer, not sure if that has anything to do with it. " When using the FreeTextLookup approach, you don't need to use shingles in your analyser, shingles are added by the suggester itself. As Erick mentioned