How can I get a monotonically increasing field value for docs?

2015-09-21 Thread Gili Nachum
I've implemented a custom solr2solr ongoing unidirectional replication mechanism. A Replicator (acting as a SolrJ client) crawls documents from SolrCloud1 and writes them to SolrCloud2 in batches. The replicator crawl logic is to read documents with a time greater than or equal to the time of the last

Re: Does more shards in core improve performance?

2015-09-21 Thread Toke Eskildsen
On Mon, 2015-09-21 at 10:13 +0800, Zheng Lin Edwin Yeo wrote: > I didn't find any increase in indexing throughput by adding shards in the > same machine. > > However, I've managed to feed the index to Solr from more than one thread > at a time. It can take up to 3 threads without affecting the

Re: ctargett commented on http://people.apache.org/~ctargett/RefGuidePOC/current/Index-Replication.html

2015-09-21 Thread Steve Rowe
I logged into comments.a.o and then disabled emailing of comments to this list. When we set up the "solrcwiki" site on comments.apache.org, the requirement was that the PMC chair be the (sole) manager, and though I am no longer chair, I'm still the manager of the "solrcwiki" site for the ASF

Re: solr auggestion with copy-field

2015-09-21 Thread Alessandro Benedetti
There are always some steps to take care of in these situations: 1) Have you checked that your destination copied field is fine? Does it contain what you expect? Have you investigated the indexed terms? 2) Have you built your suggester? It doesn't build on startup or onCommit (reading

Re: Solr4.7: tlog replay has a major delay before start recovering transaction replay

2015-09-21 Thread Shalin Shekhar Mangar
Hi Jeff, Comments inline: On Mon, Sep 21, 2015 at 6:06 PM, Jeff Wu wrote: > Our environment ran in Solr4.7. Recently hit a core recovery failure and > then it retries to recover from tlog. > > We noticed after 20:05:22 said Recovery failed, Solr server waited a long > time

faceting is unusable slow since upgrade to 5.3.0

2015-09-21 Thread Uwe Reh
Hi, our bibliographic index (~20M entries) runs fine with Solr 4.10.3. With Solr 5.3, faceted searching is consistently and incredibly slow (~20 seconds). Output of 'debugQuery': 17705.0 2.0 17590.0 !! 111.0 The 'fieldValueCache' seems to be unused (no inserts nor lookups) in Solr 5.3. In Solr

Re: solr4.7: leader core does not elected to other active core after sorl OS shutdown, known issue?

2015-09-21 Thread Jeff Wu
Hi Shalin, thank you for the response. We waited longer than the ZK session timeout, and it still did not kick off any leader election for these "remained down-leader" cores. That's the question I'm actually asking. Our test scenario: each Solr server has 64 cores, and they are all

Re: solr update dynamic field generates multiValued error

2015-09-21 Thread Aman Tandon
Sure. thank you Upayavira With Regards Aman Tandon On Mon, Sep 21, 2015 at 6:01 PM, Upayavira wrote: > You cannot do multi valued fields with LatLongType fields. Therefore, if > that is a need, you will have to investigate RPT fields. > > I'm not sure how you do distance

solr4.7: leader core does not elected to other active core after sorl OS shutdown, known issue?

2015-09-21 Thread Jeff Wu
Our environment still runs Solr 4.7. Recently we noticed the following in a test: when we stopped one Solr server (solr02, via an OS shutdown), all the cores of solr02 were shown as "down", but a few of them still remained leaders. After that, we quickly saw that all other servers were still sending requests to

Re: Solr4.7: tlog replay has a major delay before start recovering transaction replay

2015-09-21 Thread Jeff Wu
> > Before tlog replay, the replica will replicate any missing index files > from the leader. I think that is what is causing the time between the > two log messages. You have INFO logging turned off so there are no > messages from the replication handler about it. I did not monitor major

solr auggestion with copy-field

2015-09-21 Thread sara hajili
Hi all, I want to get suggestions from multiple fields in Solr. I added this to solrconfig: mySuggester FuzzyLookupFactory DocumentDictionaryFactory suggestStr like_count string false true 10 suggest and added this to the schema: and but I didn't get any results from this suggest query:

Re: solr update dynamic field generates multiValued error

2015-09-21 Thread Upayavira
You cannot do multi valued fields with LatLongType fields. Therefore, if that is a need, you will have to investigate RPT fields. I'm not sure how you do distance boosting there, so I'd suggest you ask that as a separate question with a new title. Upayavira On Mon, Sep 21, 2015, at 01:27 PM,

Re: faceting is unusable slow since upgrade to 5.3.0

2015-09-21 Thread Uwe Reh
On 21.09.2015 at 15:16, Shalin Shekhar Mangar wrote: Can you post your complete facet request as well as the schema definition of the field on which you are faceting? Query:

Re: modular QueryParser in contrib

2015-09-21 Thread Jack Krupansky
Probably a reference to the so-called flex query parser: https://lucene.apache.org/core/4_10_0/queryparser/org/apache/lucene/queryparser/flexible/standard/StandardQueryParser.html Read:

Re: solr update dynamic field generates multiValued error

2015-09-21 Thread Aman Tandon
We are using LatLonType to use the gradual boosting / distance based boosting of search results. With Regards Aman Tandon On Mon, Sep 21, 2015 at 5:39 PM, Upayavira wrote: > Aman, > > I cannot promise to answer questions promptly - like most people on this > list, we answer

Re: Zero Query results

2015-09-21 Thread Mark Fenbers
Ok, Erick, you provided useful info to help with my understanding. However, I still get zero results when I search on literal text (e.g., "Wednesday"), even after making the changes that you suggested. However, I discovered that if I search on "Wednesday*" (trailing asterisk), then I get all the

Solr4.7: tlog replay has a major delay before start recovering transaction replay

2015-09-21 Thread Jeff Wu
Our environment runs Solr 4.7. We recently hit a core recovery failure, after which the core retried recovery from the tlog. We noticed that after the log said "Recovery failed" at 20:05:22, the Solr server waited a long time before it started tlog replay. During that time, we had about 32 cores doing such tlog replay. The service

Re: solr4.7: leader core does not elected to other active core after sorl OS shutdown, known issue?

2015-09-21 Thread Shalin Shekhar Mangar
Hi Jeff, The leader election relies on ephemeral nodes in ZooKeeper to detect when the leader or other nodes have gone down (abruptly). These ephemeral nodes are automatically deleted by ZooKeeper after the ZK session timeout, which is by default 30 seconds. So if you kill a node then it can take up

Re: faceting is unusable slow since upgrade to 5.3.0

2015-09-21 Thread Shalin Shekhar Mangar
Can you post your complete facet request as well as the schema definition of the field on which you are faceting? On Mon, Sep 21, 2015 at 5:39 PM, Uwe Reh wrote: > Hi, > > our bibliographic index (~20M entries) runs fine with Solr 4.10.3 > With Solr 5.3 faceted

modular QueryParser in contrib

2015-09-21 Thread Dmitry Kan
Hello! Asked the question on IRC, mirroring it here too: In lucene level QP there is a comment https://github.com/apache/lucene-solr/blob/lucene_solr_4_10/lucene/queryparser/src/java/org/apache/lucene/queryparser/classic/QueryParser.jj#L99 pointing to some contrib query parser, that offers

Spatial Search: distance based boosting

2015-09-21 Thread Aman Tandon
Hi, Is there a way in solr to do the distance based boosting using Spatial RPT field? With Regards Aman Tandon

Re: faceting is unusable slow since upgrade to 5.3.0

2015-09-21 Thread Joel Bernstein
Have you looked at your Solr instance with a cpu profiler like YourKit? It would be useful to see the hotspots which should be really obvious with 20 second response times. Also are you running in distributed mode or on a single Solr instance? Joel Bernstein http://joelsolr.blogspot.com/ On

Re: SolrCloud Startup question

2015-09-21 Thread Ravi Solr
Thank you Anshum & Upayavira. BTW do any of you guys know if CloudSolrClient is ThreadSafe ?? Thanks, Ravi Kiran Bhaskar On Monday, September 21, 2015, Anshum Gupta wrote: > Hi Ravi, > > I just tried it out and here's my understanding: > > 1. Starting Solr with -c

Re: SolrCloud Startup question

2015-09-21 Thread Anshum Gupta
CloudSolrClient is thread safe and it is highly recommended you reuse the client. If you are providing an HttpClient instance while constructing, make sure that the HttpClient uses a multi-threaded connection manager. On Mon, Sep 21, 2015 at 3:13 PM, Ravi Solr wrote: >

Re: modular QueryParser in contrib

2015-09-21 Thread Jack Krupansky
Oops, sorry for the very old source code links, although nothing much changed in the current release: http://lucene.apache.org/core/5_3_0/queryparser/org/apache/lucene/queryparser/flexible/standard/package-summary.html

Re: How can I get a monotonically increasing field value for docs?

2015-09-21 Thread Shawn Heisey
On 9/21/2015 3:09 AM, Upayavira wrote: > Effectively, all it does is return the value of NOW according to the > request, as the default value. > > You could construct that on a per invocation basis, using > System.currentTimeMillis() or whatever. The millisecond timestamp isn't guaranteed to always
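To illustrate the concern with raw millisecond timestamps: two calls to System.currentTimeMillis() within the same millisecond return the same value, and the wall clock can be adjusted backwards. The following is a minimal sketch, not code from this thread. The class name MonotonicMillis is hypothetical. It hands out strictly increasing values, but only within a single JVM, so it does not by itself give a consistent ordering across several Solr nodes or replicas.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper: strictly increasing millisecond-based values within one JVM. */
public final class MonotonicMillis {
    // Last value handed out; starts below any realistic clock reading.
    private final AtomicLong last = new AtomicLong(Long.MIN_VALUE);

    /** Returns the current wall-clock time in ms, bumped if needed to exceed the previous value. */
    public long next() {
        return next(System.currentTimeMillis());
    }

    /** Same as next(), but with the clock reading supplied by the caller (useful for testing). */
    public long next(long nowMillis) {
        // Atomically take the larger of "previous + 1" and the clock reading,
        // so ties and backwards clock jumps still yield a larger value.
        return last.updateAndGet(prev -> Math.max(prev + 1, nowMillis));
    }
}
```

When many values are requested within one millisecond, the returned numbers run slightly ahead of the real clock; that is the price of strict ordering in this sketch.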

Pivot facets

2015-09-21 Thread EXTERNAL Taminidi Ravi (ETI, AA-AS/PAS-PTS)
Hi, can someone suggest a workaround for my issue with pivot facets? Use case: I have a collection with 5 levels of fields. In some documents the data is not present for all five levels (fields), and I am not indexing those columns (no column for that doc). In some search results the Pivot

Re: How can I get a monotonically increasing field value for docs?

2015-09-21 Thread Shawn Heisey
On 9/21/2015 9:01 AM, Gili Nachum wrote: > TimestampUpdateProcessorFactory takes place only on the leader shard, or on > each shard replica? > if on each replica then I would get different values on each replica. > > My alternative would be to perform secondary sort on a UUID to ensure order. If

Re: Ideas

2015-09-21 Thread Paul Libbrecht
Writing a query component would be pretty easy, wouldn't it? It would throw an exception if crazy numbers are requested... I can provide a simple example of a maven project for a query component. Paul William Bell wrote: > We have some Denial of service attacks on our web site. SOLR threads are > going

Re: Ideas

2015-09-21 Thread DVT
Hi Bill, the classical way would be to have a reverse proxy in front of the application that catches such cases. A decent reverse proxy or even application firewall router will allow you to define limits on bandwidth and sessions per time unit. Some even recognize specific denial-of-service

Re: Ideas

2015-09-21 Thread Doug Turnbull
The nginx reverse proxy we use blocks ridiculous start and rows values https://github.com/o19s/solr_nginx Another silly thing I've noticed is you can pass sleep() as a function query. It's not documented, but I think it's a big hole. I wonder if I could DoS your Solr by sleeping and hogging all the

Ideas

2015-09-21 Thread William Bell
We have some Denial of service attacks on our web site. SOLR threads are going crazy. Basically someone is hitting start=15 + and rows=20. The start is crazy large. And then they jump around. start=15 then start=213030 etc. Any ideas for how to stop this besides blocking these IPs?

Re: solr4.7: leader core does not elected to other active core after sorl OS shutdown, known issue?

2015-09-21 Thread Jeff Wu
Hi Shai, still the same question: the other peer cores, which are active, did not claim leadership even after a long time. However, some of the peer cores did claim leadership earlier, while the server was stopping. Those results are inconsistent. 2015-09-21 10:52 GMT-04:00 Shai Erera :

Re: Ideas

2015-09-21 Thread Walter Underwood
I have put a limit in the front end at a couple of sites. Nobody gets more than 50 pages of results. Show page 50 if they request beyond that. First got hit by this at Netflix, years ago. Solr 4 is much better about deep paging, but here at Chegg we got deep paging plus a stupid, long query.
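The front-end limit described here (nobody gets past page 50; deeper requests are shown page 50) can be sketched as a small clamp applied before the query is sent to Solr. This is an illustrative assumption about how such a limit might look, not the code used at either site. The class and method names are hypothetical, and the 20-row page size in the usage note is taken from the original question in this thread.

```java
/** Hypothetical front-end guard that caps how deep a client may page. */
public final class PageLimiter {
    private PageLimiter() {}

    /**
     * Returns a safe "start" offset: negative values become 0, and anything
     * beyond the last allowed page is pulled back to the start of that page.
     */
    public static int clampStart(int requestedStart, int rows, int maxPages) {
        if (rows <= 0 || maxPages <= 0) {
            throw new IllegalArgumentException("rows and maxPages must be positive");
        }
        // Offset of the first row on the last allowed page, computed in long to avoid overflow.
        long maxStart = (long) (maxPages - 1) * rows;
        long clamped = Math.max(0L, Math.min((long) requestedStart, maxStart));
        return (int) clamped;
    }
}
```

With rows=20 and a 50-page limit, a request for start=213030 is served as start=980, which is the first row of page 50, while a request for start=15 passes through unchanged.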

Re: Zero Query results

2015-09-21 Thread Erick Erickson
bq: However, I discovered that if I search on "Wednesday*" (trailing asterisk), then I get all the results containing Wednesday that I'm looking for! This almost always means you're not searching on the field you think you're searching on and/or the field isn't being analyzed as you think (i.e.

Re: solr update dynamic field generates multiValued error

2015-09-21 Thread Upayavira
Aman, I cannot promise to answer questions promptly - like most people on this list, we answer if/when we have a gap in our workload. The reason you are getting the non multiValued field error is because your latlon field does not have multiValued="true" enabled. However, the field type

Re: Zero Query results

2015-09-21 Thread Mark Fenbers
You were right about finding only the Wednesday occurrences at the beginning of the line. But attached (if it works) is a screen capture of my admin UI. But unlike your suspicion, the index text is being parsed properly, it appears. So I'm uncertain where this leads me. Also attached is

Re: How can I get a monotonically increasing field value for docs?

2015-09-21 Thread Gili Nachum
Does TimestampUpdateProcessorFactory run only on the shard leader, or on each shard replica? If on each replica, then I would get different values on each replica. My alternative would be to perform a secondary sort on a UUID to ensure order. Thanks. On Mon, Sep 21, 2015 at 12:09 PM, Upayavira

Re: write.lock

2015-09-21 Thread Mark Fenbers
A snippet of my solrconfig.xml is attached. The snippet only contains the Spell checking sections (for brevity) which should be sufficient for you to see all the pertinent info you seek. Thanks! Mark On 9/19/2015 3:29 AM, Mikhail Khludnev wrote: Mark, What's your solrconfig.xml? On Sat,

Re: modular QueryParser in contrib

2015-09-21 Thread Dmitry Kan
Thanks for the valuable links Jack. Dmitry On Mon, Sep 21, 2015 at 5:09 PM, Jack Krupansky wrote: > Oops, sorry for the very old source code links, although nothing much > changed in the current release: > >

Re: solr4.7: leader core does not elected to other active core after sorl OS shutdown, known issue?

2015-09-21 Thread Shai Erera
I don't think the process Shalin describes applies to clusterstate.json. That JSON object reflects the status Solr "knows" about, or "last known status". When Solr is properly shut down, I believe those attributes are cleared from clusterstate.json, and the leaders give up their lease as well.

Re: solr update dynamic field generates multiValued error

2015-09-21 Thread Aman Tandon
Upayavira, please help With Regards Aman Tandon On Mon, Sep 21, 2015 at 2:38 PM, Aman Tandon wrote: > Error is > > > > status 400, QTime 28, ERROR: > [doc=9474144846] multiple values encountered for non multiValued field > latlon_0_coordinate: [11.0183, 11.0183]

Re: SolrCloud Startup question

2015-09-21 Thread Anshum Gupta
Hi Ravi, I just tried it out and here's my understanding: 1. Starting Solr with -c starts Solr in cloud mode. This is used to start Solr with an embedded zookeeper. 2. Starting Solr with -z starts Solr in cloud mode, with the zk connection string you specify. You don't need to explicitly specify

SolrCloud Startup question

2015-09-21 Thread Ravi Solr
Can somebody kindly help me understand the difference between the following startup calls ? ./solr start -p -s /solr/home -z zk1:2181,zk2:2181,zk3:2181 Vs ./solr start -c -p -s /solr/home -z zk1:2181,zk2:2181,zk3:2181 What happens if i don't pass the "-c" option ?? I read the

Re: SolrCloud Startup question

2015-09-21 Thread Upayavira
As it says below, -c enables a Zookeeper node within the same JVM as Solr. You don't want that, as you already have an ensemble up and running. Upayavira On Mon, Sep 21, 2015, at 09:35 PM, Ravi Solr wrote: > Can somebody kindly help me understand the difference between the > following > startup

Re: FieldCache error for multivalued fields in json facets.

2015-09-21 Thread Vishnu Mishra
Hi I am using solr 5.3 and I have the same problem while doing json facet on multivalued field. Below is the error stack trace : 2015-09-21 21:26:09,292 ERROR org.apache.solr.core.SolrCore ? org.apache.solr.common.SolrException: can not use FieldCache on multivalued field: FLAG at

Re: How can I get a monotonically increasing field value for docs?

2015-09-21 Thread Gili Nachum
Thanks for the in-depth explanation! The secondary sort by uuid would allow me to read a series of docs with identical time over multiple batches by specifying the filter time>timeOnLastReadDoc or (time=timeOnLastReadDoc and uuid>uuidOnLastReadDoc), which essentially creates a unique sorted value to
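The filter described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the replicator's actual code: the class ReplicationCursor and its methods are hypothetical, the field names time and uuid come from the post, and the filter string assumes Solr's range syntax with an exclusive lower bound ({value TO *]). Values are inserted without escaping, so the sketch only suits values that need none. The isAfter method expresses the same condition in plain Java.

```java
/** Hypothetical cursor helper for resuming a crawl sorted by (time asc, uuid asc). */
public final class ReplicationCursor {
    private ReplicationCursor() {}

    /**
     * Builds a filter matching docs strictly after the last doc read:
     * time > lastTime, or time == lastTime with uuid > lastUuid.
     */
    public static String nextBatchFilter(String lastTime, String lastUuid) {
        return "time:{" + lastTime + " TO *]"
                + " OR (time:[" + lastTime + " TO " + lastTime + "]"
                + " AND uuid:{" + lastUuid + " TO *])";
    }

    /** The same condition evaluated in memory, with uuids compared as plain strings. */
    public static boolean isAfter(long time, String uuid, long lastTime, String lastUuid) {
        return time > lastTime
                || (time == lastTime && uuid.compareTo(lastUuid) > 0);
    }
}
```

The filter only yields a stable crawl order if every batch is requested with the matching sort on both fields, and if the uuid field compares as a plain string in the index.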

Re: solr4.7: leader core does not elected to other active core after sorl OS shutdown, known issue?

2015-09-21 Thread Gili Nachum
Happens to us too. Solr 4.7.2 On Sep 21, 2015 20:42, "Jeff Wu" wrote: > Hi Shai, still the same question: other peer cores which they are active > did not claim to be leader after a long time. However, some of the peer > cores claimed to be leaders at earlier time when

Re: write.lock

2015-09-21 Thread Mikhail Khludnev
Both of these guys below try to write spell index into the same dir. Don't they? To make it clear, it's not possible so far. solr.IndexBasedSpellChecker /localapps/dev/EventLog/index solr.FileBasedSpellChecker /localapps/dev/EventLog/index Also, can you

Re: ctargett commented on http://people.apache.org/~ctargett/RefGuidePOC/current/Index-Replication.html

2015-09-21 Thread Cassandra Targett
Hey folks, I'm doing some experiments with other formats for the Ref Guide and playing around with options for comments. I didn't realize this old experiment from https://issues.apache.org/jira/browse/SOLR-4889 would send email - I'm talking to Steve Rowe to see if we can get that disabled.

Re: Zero Query results

2015-09-21 Thread Erick Erickson
Screen captures generally get filtered out by the Apache e-mail, it didn't come through. But this makes no sense. The text_en field type you pasted should not be having the problems you're talking about. So if you add debug=true, you should be seeing your "Wednesday" query going against your

ctargett commented on http://people.apache.org/~ctargett/RefGuidePOC/current/Index-Replication.html

2015-09-21 Thread no-reply
Hello, ctargett has commented on http://people.apache.org/~ctargett/RefGuidePOC/current/Index-Replication.html. You can find the comment here: http://people.apache.org/~ctargett/RefGuidePOC/current/Index-Replication.html#comment_4535 Please note that if the comment contains a

Re: solr update dynamic field generates multiValued error

2015-09-21 Thread Aman Tandon
Hi Erick, I am getting the same error because my dynamic field *_coordinate is stored="true". How can I get rid of this error? And I have to use the atomic update. Please help!! With Regards Aman Tandon On Tue, Aug 5, 2014 at 10:27 PM, Franco Giacosa wrote: > Hey Erick, i

Re: solr update dynamic field generates multiValued error

2015-09-21 Thread Aman Tandon
Error is: status 400, QTime 28, ERROR: [doc=9474144846] multiple values encountered for non multiValued field latlon_0_coordinate: [11.0183, 11.0183] (code 400). And my configuration is how you know it is because of stored="true"? As Erick replied in the last mail thread, I'm not getting any

Re: Does more shards in core improve performance?

2015-09-21 Thread Zheng Lin Edwin Yeo
I'm not sure if that is because currently my machine is a normal PC and not a server, but my CPU specification for each of the core is Intel(R) Core(TM) i7-4910MQ CPU @ 2.90GHz. It should probably be better when the real server which has a much better specification comes, and I should be able to

Re: solr update dynamic field generates multiValued error

2015-09-21 Thread Upayavira
Can you show the error you are getting, and how you know it is because of stored="true"? Upayavira On Mon, Sep 21, 2015, at 09:30 AM, Aman Tandon wrote: > Hi Erick, > > I am getting the same error because my dynamic field *_coordinate is > stored="true". > How can I get rid of this error? > >

Re: Questions regarding indexing JSON data

2015-09-21 Thread Upayavira
On Mon, Sep 21, 2015, at 02:53 AM, Kevin Vasko wrote: > I am new to Apache Solr and have been struggling with indexing some JSON > files. > > I have several TB of twitter data in JSON format that I am having trouble > posting/indexing. I am trying to use a schemaless schema so I don't have > to

Re: How can I get a monotonically increasing field value for docs?

2015-09-21 Thread Upayavira
There's nothing to stop you creating your own TimestampUpdateProcessorFactory, here's the entire source for it: public class TimestampUpdateProcessorFactory extends AbstractDefaultValueUpdateProcessorFactory { @Override public UpdateRequestProcessor getInstance(SolrQueryRequest req,