Re: cloud disk space utilization

2018-08-29 Thread Shalin Shekhar Mangar
There is a bad oversight on our part which causes preferences to not be used for placing replicas unless a cluster policy also exists. We hope to fix it in the next release (Solr 7.5). See https://issues.apache.org/jira/browse/SOLR-12648 You may also be interested in

Atomic updates and POST command?

2018-08-29 Thread Scott Prentice
Hi... I'm trying to get atomic updates working and am seeing some strangeness. Here's my JSON with the data to update .. [{"id":"/unique/path/id",   "field1":{"set","newvalue1"},   "field2":{"set","newvalue2"} }] If I use the REST API via curl it works fine. With the following command, the
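For reference, Solr's atomic-update syntax pairs each modifier with its value as a JSON object, for example {"set": "newvalue1"}, with a colon rather than a comma. Below is a minimal Python sketch that builds such a payload for the document ID and field names quoted above. The helper name atomic_update is hypothetical, and the sketch makes no claim about what caused the behaviour described in the message.

```python
import json


def atomic_update(doc_id, changes):
    """Build one atomic-update document: every changed field maps to {"set": value}."""
    doc = {"id": doc_id}
    for field, value in changes.items():
        # "set" replaces the stored value of the field
        doc[field] = {"set": value}
    return doc


# Document ID and field names are the ones quoted in the message above.
payload = json.dumps(
    [atomic_update("/unique/path/id", {"field1": "newvalue1", "field2": "newvalue2"})]
)
print(payload)
```

Serializing through json.dumps rather than writing the JSON by hand guarantees the payload is well-formed before it is sent.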

Re: Boost matches occurring early in the field (offset)

2018-08-29 Thread Alexandre Rafalovitch
TokenOffsetPayloadTokenFilter? It is mentioned in https://www.slideshare.net/lucidworks/payloads-in-solr-erik-hatcher-lucidworks , but no detailed example seems to be given. I do see this question from time to time, so a definitive answer would be useful for the future. Regards, Alex. On

RE: Boost matches occurring early in the field (offset)

2018-08-29 Thread Markus Jelsma
Hello Jan, Many years ago I made an extension of SpanFirstQuery called GradientSpanFirstQuery that did just that: decrease the boost for each advanced position in the text. Then Lucene 4 or 5 came and this code wouldn't compile any more. @Override protected AcceptStatus

Re: cloud disk space utilization

2018-08-29 Thread Walter Underwood
You need free disk space equal to at least half the minimum sizes of the collections. You might need more. We have a 23 GB collection in Solr cloud. When we reload all the content and wait until the end to do a commit, it gets up to 51 GB. wunder Walter Underwood wun...@wunderwood.org

Re: cloud disk space utilization

2018-08-29 Thread Kudrettin Güleryüz
Given the set of preferences above, I would expect the difference between the largest freedisk (test-43 currently) and the smallest freedisk (test-45 currently) to be smaller than what is below. Below is the output from reading the diagnostics endpoint of the autoscaling API. According to this output, the

Re: Boost matches occurring early in the field (offset)

2018-08-29 Thread Jan Høydahl
I also tend to use "sentinel tokens" for exact match or to anchor a search. But in order to obtain decaying boost the further down in the article a match is, you'd need to write several such span/slop queries with varying slops, e.g. highest boost for first 10 words, medium boost for first 50

Re: Solr7.4 core node and shard replica numbering question

2018-08-29 Thread Shawn Heisey
On 8/29/2018 6:00 AM, Synet Spambox wrote: The core numbers and the replica numbers are up to 10 (double the amount of servers)? It is confusing for me... Does it matter? Do I miss something ? Something changed somewhere regarding core naming in SolrCloud, and it doesn't behave in the same

Re: SolrCore Initialization Failure Error loading class 'solr.IntField'

2018-08-29 Thread Shawn Heisey
On 8/29/2018 1:27 AM, Salvo Bonanno wrote: [error] corename: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Could not load conf for core corename: Can't load schema /opt/solr/server/solr/corename/conf/managed-schema: Plugin init failure for [schema.xml] fieldType

Re: Atomic Update Failure With solr.UUID Field

2018-08-29 Thread Stephen Lewis Bianamara
Hi All, Just checking back in. Did anyone have a chance to take a look? Would love to get some help here. My design requires docs with many UUIDs which should not need to be updated each time and should be optimally performant for filters. So I think this bug is currently a hard blocker for me to

Re: SolrCore Initialization Failure Error loading class 'solr.IntField'

2018-08-29 Thread Erick Erickson
Which version of Solr? Point fields were introduced around Solr 6.2. It looks like your new server is running some earlier version of Solr perhaps? Best, Erick On Wed, Aug 29, 2018 at 12:27 AM Salvo Bonanno wrote: > > Hello Everyone! > > I have a strange problem, I'm trying to move an

Re: Migrate 4 Shards/0 Replica to 1 Shard/1 Replica

2018-08-29 Thread Shawn Heisey
On 8/29/2018 7:17 AM, Pure Host - Wolfgang Freudenberger wrote: I am currently restructuring a big-data cloud with 1000+ collections on a SolrCloud. The data is stored on 4 shards without a replica. This data is deprecated and read-only for some purpose, so I want to migrate it to a new

Re: Boost matches occurring early in the field (offset)

2018-08-29 Thread Doug Turnbull
You can also insert a token at the beginning of the query during analysis using a char filter. I call these sort of boundary tokens "sentinel tokens". So a phrase search for "red shoes" becomes " red shoes". You can add some slop to allow for permissible distance (with You can also use the Limit

Re: Boost matches occurring early in the field (offset)

2018-08-29 Thread Jan Høydahl
I have seen that one. But as I understand spanFirst, it only allows you to define a boost if your span matches, i.e. not a gradually lower score the further down in the document the match is? -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com > 29. aug. 2018 kl. 12:26

Re: Solr indexing Duplicate URL's ending with /

2018-08-29 Thread Jan Høydahl
Hi, You would have to direct this question to the crawler you are using, since it is the crawler that decides the document ID to send to Solr. Most crawlers will have configuration options to normalize the URL for each document. However you could also try to clean the URL after it arrives in
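As an illustration of the kind of URL normalization described above, here is a minimal Python sketch that strips a trailing slash from the path so that both sample URLs from the original question collapse to one document ID. The function name normalize_url is hypothetical; it is not an option of any particular crawler or of Solr, and a real setup would apply the equivalent rule in the crawler configuration or in an update processor.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url):
    """Return the URL with trailing slashes removed from its path.

    An empty path is mapped to "/" so the site root keeps one consistent form.
    """
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


# The two sample entries from the original question.
first = normalize_url("https://www.abc.com/2018/test.html")
second = normalize_url("https://www.abc.com/2018/test.html/")
print(first == second)
```

Whether a trailing slash really denotes the same resource depends on the site, so a rule like this is only safe when the crawled server treats both forms identically.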

Migrate 4 Shards/0 Replica to 1 Shard/1 Replica

2018-08-29 Thread Pure Host - Wolfgang Freudenberger
Hi Guys, I am currently restructuring a big-data cloud with 1000+ collections on a SolrCloud. The data is stored on 4 shards without a replica. This data is deprecated and read-only for some purpose, so I want to migrate it to a new cloud with 1 Shard and 1 Replica. Is there an "easy"

Solr indexing Duplicate URL's ending with /

2018-08-29 Thread kunhu0...@gmail.com
Team, I need a suggestion on how to remove duplicate entries while indexing to Solr. Below are the sample entries I see in the Solr collection; I need to remove the one ending with /: https://www.abc.com/2018/test.html https://www.abc.com/2018/test.html/ Thank you -- Sent from:

Re: Solr7 embeded req: Bad content type error

2018-08-29 Thread Alfonso Noriega
Thanks for your time and help Andrea, I guess we should try to use the Json API provided by Solr and figure out a way to do so with SolrJ. On Wed, 29 Aug 2018 at 14:21, Andrea Gazzarini wrote: > Well, I don't know the actual

Re: Solr7 embeded req: Bad content type error

2018-08-29 Thread Andrea Gazzarini
Well, I don't know the actual reason why the behavior is different between Cloud and Embedded client: maybe things are different because in the Embedded Solr HTTP is not involved at all, but I'm just shooting in the dark. I'm not aware of the POST capabilities you mentioned, sorry. Andrea On

Re: Solr7 embeded req: Bad content type error

2018-08-29 Thread Alfonso Noriega
Yes, I realized that changing the method to GET solves the issue, but it is intentionally set to POST because in a real-world scenario we had users creating overly long queries which were hitting the REST length limits. I think a possible solution would be to send the Solr params as a json in

Solr7.4 core node and shard replica numbering question

2018-08-29 Thread Synet Spambox
Hello List! In the past I did a few installations of Solr5 and Solr6, so I am used to installing Solr. The question is now: I have installed Solr7.4 on 5 hosts with a ZooKeeper ensemble (5 servers), but I am confused about the replica core numbering: the names of the cores go up to core_node10

Re: Solr7 embeded req: Bad content type error

2018-08-29 Thread Andrea Gazzarini
I think that's the issue: just guessing because I do not have the code in front of me. POST requests put the query in the request body, and the EmbeddedSolrServer expects to find a valid JSON. Did you try to remove the Method param? Andrea On 29/08/2018 13:12, Alfonso Noriega wrote: Hi

Rectangle with rotation in Solr

2018-08-29 Thread Zahra Aminolroaya
I have locations given as 4-tuples of (longitude, latitude) points, which are like rectangles, and I want to index them. Solr's BBoxField, with minX, maxX, maxY and minY, only handles rectangles which do not have rotations. Suppose my rectangle is rotated 45 degrees clockwise relative to the axes, how can I define
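One way to describe a rotated rectangle is as a four-corner polygon in WKT. The Python sketch below computes the corners of a rectangle rotated clockwise about its center and prints them as a closed WKT ring. It is only an illustration under stated assumptions: it treats longitude and latitude as planar coordinates, the function name rotated_rectangle_wkt and the sample center and size are hypothetical, and it assumes the target field type is one that accepts WKT polygons, which BBoxField is not.

```python
import math


def rotated_rectangle_wkt(cx, cy, width, height, degrees_clockwise):
    """Return a WKT POLYGON for a rectangle rotated clockwise about its center (cx, cy)."""
    theta = math.radians(degrees_clockwise)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    half_w, half_h = width / 2.0, height / 2.0
    # Corner offsets of the unrotated rectangle, in counter-clockwise order.
    offsets = [(-half_w, -half_h), (half_w, -half_h), (half_w, half_h), (-half_w, half_h)]
    ring = []
    for dx, dy in offsets:
        # Clockwise rotation: (x, y) -> (x*cos + y*sin, -x*sin + y*cos)
        x = cx + dx * cos_t + dy * sin_t
        y = cy - dx * sin_t + dy * cos_t
        ring.append((x, y))
    ring.append(ring[0])  # a WKT ring must end on its starting point
    coords = ", ".join(f"{x:.6f} {y:.6f}" for x, y in ring)
    return f"POLYGON(({coords}))"


# Hypothetical rectangle: 2 x 1 degrees, centered at (10, 20), rotated 45 degrees clockwise.
print(rotated_rectangle_wkt(10.0, 20.0, 2.0, 1.0, 45.0))
```

WKT lists each point as "x y", so longitude comes first and latitude second.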

Re: Solr7 embeded req: Bad content type error

2018-08-29 Thread Alfonso Noriega
Hi Andrea, Thanks for your help, something which is relevant and I forgot to mention before is that the requests are always a POST method. As mentioned before, it is not a single query which fails but all of the requests done to the search handler. final SolrQuery q = new SolrQuery("!( _id_:"+

Re: Solr7 embeded req: Bad content type error

2018-08-29 Thread Andrea Gazzarini
Hi Alfonso, could you please paste an extract of the client code? Specifically those few lines where you create the SolrQuery with params. The line you mentioned is dealing with ContentStream which as far as I remember wraps the request body, and not the request params. So as request body

Solr7 embeded req: Bad content type error

2018-08-29 Thread Alfonso Noriega
Hi, I am implementing a migration of Vind library from solr 5 to 7.4.0 and I am facing an error which I have no idea how to solve... The library provides a wrapper (and some extra stuff) to develop search tools over Solr and uses SolrJ to access it, more

RE: 7.4.0 SQL handler throws exception if WHERE clause is present

2018-08-29 Thread Markus Jelsma
Hi, Forget about it, after ten years without SQL, I managed to forget I had to wrap the WHERE value in quotes, single quotes in this case. Thanks, Markus -Original message- > From:Markus Jelsma > Sent: Wednesday 29th August 2018 11:51 > To: solr-user > Subject: 7.4.0 SQL handler
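The quoting point can be sketched in Python as follows. The collection name logs and the field name type come from the original question in this thread; the value error and the helper names are hypothetical. The sketch only builds the statement and the form-encoded stmt parameter, it does not contact a server.

```python
from urllib.parse import urlencode


def quote_sql_string(value):
    """Wrap a value in single quotes, doubling any embedded single quote."""
    return "'" + value.replace("'", "''") + "'"


def build_stmt(collection, field, value, limit=10):
    """Build a SELECT with a single-quoted string literal in the WHERE clause."""
    return f"SELECT * FROM {collection} WHERE {field} = {quote_sql_string(value)} LIMIT {limit}"


# "error" is a made-up value used purely for illustration.
stmt = build_stmt("logs", "type", "error")
body = urlencode({"stmt": stmt})
print(stmt)
print(body)
```

Building the literal through one helper keeps the quoting consistent wherever a string value appears in a WHERE clause.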

Re: Boost matches occurring early in the field (offset)

2018-08-29 Thread Mikhail Khludnev
On Wed, Aug 29, 2018 at 1:19 PM Jan Høydahl wrote: > Hi, > > Is there an ootb way to boost term matches based on their position/offset > inside a field, so that the term gets a higher score if it occurs

Boost matches occurring early in the field (offset)

2018-08-29 Thread Jan Høydahl
Hi, Is there an ootb way to boost term matches based on their position/offset inside a field, so that the term gets a higher score if it occurs in the beginning of the field and lower boost or a deboost if it occurs towards the end of a field? I know that I could index the first part of the

7.4.0 SQL handler throws exception if WHERE clause is present

2018-08-29 Thread Markus Jelsma
Hello, I was, finally, trying the SQL handler on one of our collections. Executing a SELECT * FROM logs LIMIT 10 runs fine, but restricting the set using a WHERE clause gives me the exception below. The type field is a String type, indexed and has DocValues. I must be doing something wrong,

Re: “solr.data.dir” can only config a single directory

2018-08-29 Thread zhenyuan wei
Oh, my fault! Sorry for that, I should have said somebody, like me. On Wed, Aug 29, 2018 at 3:28 PM Bram Van Dam wrote: > On 28/08/18 08:03, zhenyuan wei wrote: > > But this is not a common way to do so, I mean, nobody want to ADDREPLICA > > after collection was created. > > I wouldn't say "nobody".. >

Re: “solr.data.dir” can only config a single directory

2018-08-29 Thread Bram Van Dam
On 28/08/18 08:03, zhenyuan wei wrote: > But this is not a common way to do so, I mean, nobody want to ADDREPLICA > after collection was created. I wouldn't say "nobody"..

SolrCore Initialization Failure Error loading class 'solr.IntField'

2018-08-29 Thread Salvo Bonanno
Hello Everyone! I have a strange problem, I'm trying to move an existing and working configuration of Apache Solr from a server to another one. The configuration is really simple since it's still incomplete, I've just created the core and created a managed-schema. The core is running flawlessly on

Re: “solr.data.dir” can only config a single directory

2018-08-29 Thread zhenyuan wei
Pretty cool, I have created an issue to put this discussion into practice. Issue: https://issues.apache.org/jira/browse/SOLR-12713 Best, TinsWzy On Tue, Aug 28, 2018 at 11:51 PM Erick Erickson wrote: > Patches welcome. > > On Mon, Aug 27, 2018, 23:03 zhenyuan wei wrote: > > > But this is not a common way to

missing jmx stats for num_docs and max_doc

2018-08-29 Thread Zehua Liu
Hi, We are running a 7.4.0 solr cluster with 3 tlogs and a few pulls. There is one collection divided into 8 shards, with each tlog hosting all 8 shards, and each pull hosting either shard1 to 4 or shard5 to 8. When using jmx to collect num_docs metrics via datadog, we found that the metrics for some