Re: edismax - Regex query.

2016-06-28 Thread Modassar Ather
Hi, Any input will be really helpful. Regards, Modassar On Tue, Jun 28, 2016 at 9:30 AM, Modassar Ather wrote: > Kindly provide your inputs. > > Thanks, > Modassar > > On Mon, Jun 27, 2016 at 4:11 PM, Modassar Ather > wrote: > >> Hi, >> >> I have a qf defined as follows: >> >> *fl1 fl2 fl3 fl

Re: How to speed up field collapsing on large number of groups

2016-06-28 Thread Jichi Guo
Thanks for the quick response, Joel! I am hoping to delay sharding if possible, which might involve more things to consider :) 1) What is the size of the result set before the collapse? When searching with q=*:*, for example, numFound before collapse is around 5 million, and that after col

Re: How to speed up field collapsing on large number of groups

2016-06-28 Thread Joel Bernstein
Sharding will help, but you'll need to co-locate documents by group ID. A few questions / suggestions: 1) What is the size of the result set before the collapse? 2) Have you tested without the long formula, just using a field for the min/max? It would be good to understand the impact of the formul

Re: Help with recovering shard range after zookeeper disaster

2016-06-28 Thread pramodEbay
Thanks for the reply. Since we don't have a working snapshot, we are creating brand-new ZooKeeper nodes, re-uploading the Solr configurations and manually creating a clusterstate.json. Fortunately, by running a combination of grep and awk on the corrupt snapshot, we figured out what the shard ranges were each of

Re: Help with recovering shard range after zookeeper disaster

2016-06-28 Thread Jeff Wartes
This might come a little late to be helpful, but I had a similar situation with Solr 5.4 once. We ended up finding a ZK snapshot we could restore, but we did also get the cluster back up for most of the interim by taking the now-empty ZK cluster, re-uploading the configs that the collections us

Re: json facet - date range & interval

2016-06-28 Thread Jay Potharaju
that worked ...thanks David! On Tue, Jun 28, 2016 at 11:22 AM, David Santamauro < david.santama...@gmail.com> wrote: > > Have you tried %-escaping? > > json.facet = { > daterange : { type : range, > field : datefield, > start : "NOW/DAY%2D10DAYS", >

How to speed up field collapsing on large number of groups

2016-06-28 Thread jichi
Hi everyone, I am using Solr 4.10 to index 20 million documents without sharding. Each document has a groupId field, and there are about 2 million groups. I found that search with collapsing on groupId is significantly slower than without collapsing, especially when combined with facet queries

Re: json facet - date range & interval

2016-06-28 Thread David Santamauro
Have you tried %-escaping? json.facet = { daterange : { type : range, field : datefield, start : "NOW/DAY%2D10DAYS", end : "NOW/DAY", gap : "%2B1DAY" } } On 06/28/2016 01:19 PM, Jay Potharaju wrote: json.facet={daterange
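The %-escaping suggested above can be sketched in Python. This is only an illustration, not part of the original thread: the facet body is copied from the messages, the variable names are made up, and it assumes the server form-decodes the query string, in which case a raw "+" arrives as a space. Encoding "-" as %2D, as in the suggestion above, is harmless but usually not required.

```python
from urllib.parse import quote, unquote_plus

# Facet body as it should reach Solr (copied from the thread above).
json_facet = (
    '{daterange:{type:range,field:datefield,'
    'start:"NOW/DAY-10DAYS",end:"NOW/DAY",gap:"+1DAY"}}'
)

# Sent unescaped, form decoding turns the "+" into a space,
# so the gap would arrive as " 1DAY".
broken = unquote_plus(json_facet)

# Percent-encode the whole value; "+" becomes %2B.
encoded = quote(json_facet, safe="")

# Decoding the escaped value gives back the original facet body.
round_trip = unquote_plus(encoded)

print("json.facet=" + encoded)
```

A backslash before the plus sign does not help in this situation, because the backslash is not part of URL escaping and the "+" is still decoded as a space.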

Re: json facet - date range & interval

2016-06-28 Thread Jay Potharaju
json.facet={daterange : {type : range, field : datefield, start : "NOW/DAY-10DAYS", end : "NOW/DAY",gap:"\+1DAY"} } Escaping the plus sign also gives the same error. Any other suggestions on how I can make this work? Thanks Jay On Mon, Jun 27, 2016 at 10:23 PM, Erick Erickson wrote: > First thing

IO Exception : Truncated chunk for WORKER collection for parallel stream Join Query

2016-06-28 Thread Roshan Kamble
Hello, we are using Solr 6.0.0 in cloud mode with 3 physical nodes and 3 shards per collection. We are using ParallelStream for our join searches. The error below is observed while searching with a join query. java.util.concurrent.ExecutionException: java.io.IOException: --> http://XX:XX:XX:

RE: SimplePostTool: FATAL: IOException while posting data: java.io.IOException: too many bytes written

2016-06-28 Thread Rajendran, Prabaharan
Thanks Toke, now I am splitting the file before indexing. Shalin, thanks for the details. Even though this is fixed in 5.5 and 6.0, is there any threshold value? Please suggest the best way to index (multithreaded) if the input format is text/csv (file). Thanks, Prabaharan -Original Message
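Splitting a CSV file before indexing, as mentioned above, could look roughly like the following Python sketch. It is not taken from the thread: the function name and chunk size are hypothetical, and it assumes the first line is a header and that no quoted field contains an embedded newline.

```python
def split_csv(lines, rows_per_chunk):
    """Yield lists of lines; every chunk starts with a copy of the header."""
    it = iter(lines)
    header = next(it, None)
    if header is None:
        return  # empty input: nothing to split
    chunk = [header]
    for line in it:
        chunk.append(line)
        # len(chunk) - 1 is the number of data rows collected so far.
        if len(chunk) - 1 == rows_per_chunk:
            yield chunk
            chunk = [header]
    if len(chunk) > 1:
        yield chunk  # remaining rows that did not fill a whole chunk


sample = ["id,name\n", "1,a\n", "2,b\n", "3,c\n"]
chunks = list(split_csv(sample, rows_per_chunk=2))
```

Because every chunk carries its own header, each one can be written to a separate file and posted independently, for example by several workers in parallel.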

RE: SimplePostTool: FATAL: IOException while posting data: java.io.IOException: too many bytes written

2016-06-28 Thread Rajendran, Prabaharan
Thanks Erick, for your response. Now I am splitting the file before indexing. -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: 28 June 2016 11:01 To: solr-user Subject: Re: SimplePostTool: FATAL: IOException while posting data: java.io.IOException: too many

Re: No live SolrServers triggered by maxclausecount

2016-06-28 Thread Pablo Anzorena
Thanks! I will analyze it and let you know. 2016-06-28 11:25 GMT-03:00 Erick Erickson : > OK, but then consider what you can do to keep from having to > create such things. > 1> for infix notation (leading and trailing wildcards) you can > use ngrams to turn them into simple queries. These ar

Re: Solr Post Filter: Mixing frange and Boolean Queries

2016-06-28 Thread Erick Erickson
I don't quite know whether it's a typo, but the query you pasted doesn't have an ampersand between the clauses. I'd peel this back and build up my fq clause one bit at a time, I suspect you have a typo, unbalanced parens etc., assuming it isn't just the ampersand. Best, Erick On Tue, Jun 28, 201

Re: Problem starting solr

2016-06-28 Thread Erick Erickson
This is still a pain. Either make the script more robust (it's pretty long, but it shouldn't be too hard to find the 30-second wait that you can increase) or try to figure out why it takes 30 seconds for your Solr to start. The "-f" option can help you watch things go by and perhaps give yo

Re: limit stored field size

2016-06-28 Thread Erick Erickson
see StandardHighlighter here: https://cwiki.apache.org/confluence/display/solr/Standard+Highlighter Best, Erick On Tue, Jun 28, 2016 at 1:22 AM, Avi Steiner wrote: > As far as I remember it didn't work. I used DefaultSolrHighlighter (because > the improved ones as PostingsSolrHighlighter requir

Re: Positions files analysis

2016-06-28 Thread Erick Erickson
yeah, Luke is the way to go. If you're patient the admin UI>>schema browser, pick a field and hit the "load term info" button. You'll see some terms and, in light gray the total number of terms in your index for that replica. Since this is a text field, the TermsComponent can also help. Basically

Re: Problem starting solr

2016-06-28 Thread Mary Anne Smart
It didn't even occur to me that Solr might be running anyway; looks like it is. Thanks, Mary Anne On Tue, Jun 28, 2016 at 10:50 AM, Erick Erickson wrote: > Hmm, that log file isn't very useful, this is puzzling. Is Solr running > anyway? Can you connect to localhost:2181/solr? > > Are you sure

Re: Solr PhraseQuery With Wildcard

2016-06-28 Thread Erick Erickson
There certainly is a lot to learn! Right, the only problem I have with your analysis chain is that the WhitespaceTokenizer doesn't strip punctuation so you'll have terms like "texto." (note the period). Something like PatternReplaceFilterFactory would help here. Best, Erick On Tue, Jun 28, 2016
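The punctuation issue described above can be illustrated with a small Python analogy. This is not Solr code and does not reproduce Solr's analysis chain; the sample sentence and the regular expression are assumptions chosen only to show why a whitespace split keeps "texto." and how a pattern-based cleanup removes the period.

```python
import re

text = "Este es un texto. Otro texto, más."

# Splitting on whitespace only, punctuation stays attached to the terms.
whitespace_tokens = text.split()

# Rough analogue of a pattern-replace step: strip leading and trailing
# non-word characters from every token.
cleaned_tokens = [re.sub(r"^\W+|\W+$", "", token) for token in whitespace_tokens]

print(whitespace_tokens)
print(cleaned_tokens)
```

With the raw tokens, a query for "texto" would not match the indexed term "texto." exactly, which is the mismatch the cleanup step avoids.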

Re: Problem starting solr

2016-06-28 Thread Erick Erickson
Hmm, that log file isn't very useful, this is puzzling. Is Solr running anyway? Can you connect to localhost:2181/solr? Are you sure you're pasting the correct Solr log BTW? Best, Erick On Tue, Jun 28, 2016 at 6:32 AM, Mary Anne Smart wrote: > I've downloaded solr and updated my java version to

Re: Beginer's questions

2016-06-28 Thread Erick Erickson
1> When running in SolrCloud mode the configs live on Zookeeper. When running locally the configs live on the local disk. The reason for this is that in Cloud mode, you might have replicas scattered all over the cluster on many machines (I've seen 1,000+ replicas in toto). You wouldn't want to have

CDCR (Solr6.x) does not start

2016-06-28 Thread Uwe Reh
Hi, I'm trying to get CDCR to run, but I can't even trigger any communication between SOURCE and TARGET. It seems to be a small but grave misunderstanding. I've tested a lot of variants but now I'm blind on this point. If anyone could give me a hint, I would appreciate. Uwe Testsetting: Two

Re: No live SolrServers triggered by maxclausecount

2016-06-28 Thread Erick Erickson
OK, but then consider what you can do to keep from having to create such things. 1> For infix notation (leading and trailing wildcards) you can use ngrams to turn them into simple queries. These are performance-killers. 2> Use ReversedWildcardFilterFactory to deal with leading wildcards 3> restrict
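The first two suggestions above can be sketched in plain Python. This is a toy model under stated assumptions, not how Solr implements them: the term list, the function names and the n-gram sizes (2 to 4) are made up for illustration. The idea is that an infix wildcard such as *ldc* becomes an exact lookup against precomputed n-grams, and a leading wildcard such as *card becomes a cheap prefix match on reversed terms.

```python
def ngrams(term, min_n=2, max_n=4):
    """Return every substring of term with length min_n..max_n."""
    return {
        term[i:i + n]
        for n in range(min_n, max_n + 1)
        for i in range(len(term) - n + 1)
    }


terms = ["wildcard", "solr", "postcard"]

# Index-time work: store the n-grams and the reversed form of each term.
ngram_index = {term: ngrams(term) for term in terms}
reversed_index = {term[::-1]: term for term in terms}


def infix_match(fragment):
    """*fragment* as a set-membership test instead of a term scan."""
    return sorted(t for t, grams in ngram_index.items() if fragment in grams)


def leading_wildcard_match(suffix):
    """*suffix as a prefix match on the reversed terms."""
    prefix = suffix[::-1]
    return sorted(t for rev, t in reversed_index.items() if rev.startswith(prefix))
```

In this sketch a fragment longer than max_n never matches, so the n-gram sizes have to cover the fragments users actually search for; the price is a larger index.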

Re: No live SolrServers triggered by maxclausecount

2016-06-28 Thread Pablo Anzorena
Hi Erick, thanks for answering. I attached the image to the body so you can see it. Why do I need so many clauses? It is because I have two text fields that contain on average 25 words with a lot of typos (which I'm not cleaning up) and on top of that the index consists of 25 million records.

RE: Beginer's questions

2016-06-28 Thread Kostas
Hello Alexandre. Thanks for the help. 5) That Bug report seems very interesting. Thanks. :) I will try that and let's hope the server requires the client to have a certificate, unlike the current behavior I am getting when specifying the certificate settings on Jetty. 6) For the client certifica

Problem starting solr

2016-06-28 Thread Mary Anne Smart
I've downloaded solr and updated my java version to Java 8. However, when I try to start solr with bin/solr start, I get the following error: Marys-MacBook-Air:solr-5.3.1 masmart$ bin/solr start Waiting up to 30 seconds to see Solr running on port 8983 [-] Still not seeing Solr listening on 898

Re: Solr PhraseQuery With Wildcard

2016-06-28 Thread Felipe Vinturini
Hi Erick, Thanks for your comments! In fact, I started with Solr one month ago, so I am still learning! =) I understand the differences between the Solr tokenizers, but there are so many options that it takes some time to find the one that fits our needs. I found a solution to my problem with the con

RE: Beginer's questions

2016-06-28 Thread Alexandre Drouin
Hi, I'm also a Solr beginner but I think I can answer a few of your questions: 5 - There's a bug in solr.cmd related to SSL where the settings defined in solr.in.cmd are ignored. You can see SOLR-8491 (https://issues.apache.org/jira/browse/SOLR-8491) for more information and fix. 6- From m

Solr 6.1.0 CDCR SOURCE-TARGET-TARGET configuration - docs don't reach last PASSIVE cloud

2016-06-28 Thread dmitry.medvedev
Hi, I've a setup of 3 clouds: ACTIVE -PASSIVE-PASSIVE, here is the slice of my solrconfig.xml file: 10.36.75.4:9983,10.88.52.219:9983 demo demo

RE: Beginer's questions

2016-06-28 Thread Kostas
Regarding the SSL question, it fails when I try this too : solr start -c -V ^ -Dsolr.ssl.checkPeerName=false ^ -Djavax.net.ssl.keyStorePassword=password ^ -Djavax.net.ssl.trustStorePassword=password ^ -Djavax.net.ssl.keyStore="F:/Users/me/Downloads/solr-6.1.0/server/etc/solr-ssl.keystore.jks

RE: Solr6 CDCR issue with a 3 cloud design

2016-06-28 Thread dmitry.medvedev
No ERRORS and queue size is equal to 0. Should I raise the logging level to max, maybe? Currently it's the default. How can I know if a commit operation has been sent to the 2 target clusters after the replication? What command should I run to check this? I submit new doc/s to my ACTIVE/PRIMARY clou

Re: Streaming Expressions (/stream) StreamHandler java.lang.NullPointerException

2016-06-28 Thread Dennis Gove
I've not been able to replicate the null pointer exception being seen. I created a new collection called EventsAndDCF with 4 shards and 3 replicas using a simple conf $> /tmp/solr-go/bin/solr/bin/solr create -p 30001 -c EventsAndDCF -d ../../../test/main/conf/sample -n EventsAndDCF -shards 4 -repl

ANNOUNCE: Solr Reference Guide for 6.1 Released

2016-06-28 Thread Cassandra Targett
The Lucene PMC is pleased to announce that the Solr Reference Guide for 6.1 has been released. The 700 page PDF is the definitive guide to Solr. It can be downloaded from: https://www.apache.org/dyn/closer.cgi/lucene/solr/ref-guide/apache-solr-ref-guide-6.1.pdf

Bootstrapping a Solr Cloud

2016-06-28 Thread Dennis Gove
Something I find myself doing a lot is bringing up a brand new Solr Cloud to test out some ideas I have. To simplify the process of getting Solr and Zookeeper builds, installing, starting up multiple instances of each (including data and log file directory management), I've created a (hopefully) si

Beginer's questions

2016-06-28 Thread Kostas
Hello. I have a ton of questions that I could use some answers to. If someone can answer some of them it would be great. 1) I had problems making my Solr 6.1 setup use a fixed collection schema. When I placed the schema.xml file in the filesystem as shown here

Re: SimplePostTool: FATAL: IOException while posting data: java.io.IOException: too many bytes written

2016-06-28 Thread Shalin Shekhar Mangar
This was fixed in 5.5 and 6.0. You can upload files larger than 2GB with the simple post tool however I don't recommend it because it uses a single indexing thread. On Tue, Jun 28, 2016 at 3:55 PM, Toke Eskildsen wrote: > On Mon, 2016-06-27 at 13:24 +, Rajendran, Prabaharan wrote: > > I am t

Re: SimplePostTool: FATAL: IOException while posting data: java.io.IOException: too many bytes written

2016-06-28 Thread Toke Eskildsen
On Mon, 2016-06-27 at 13:24 +, Rajendran, Prabaharan wrote: > I am trying to index a text file about 4.2 GB in size. [...] > > SimplePostTool: FATAL: IOException while posting data: java.io.IOException: > too many bytes written SimplePostTool uses HttpUrlConnection.setFixedLengthStreamingMo

Highlighting in UAX29URLEmailTokenizerFactory

2016-06-28 Thread Zheng Lin Edwin Yeo
Hi, I would like to check: is it possible for highlighting to be done on fields that are indexed with UAX29URLEmailTokenizerFactory? I'm using the UAX29URLEmailTokenizerFactory for fields that store email addresses. It is able to do the search correctly, but it is not able to highlig

wiki.apache.org - Service Unavailable

2016-06-28 Thread Vincenzo D'Amore
Hi, In case someone who is able to recover the wiki.apache.org site reads this forum: I want to report that https://wiki.apache.org/ has been down since yesterday. Service Unavailable > The server is temporarily unable to service your request due to > maintenance downtime or capacity problems. Please try again later. > Apa

RE: Positions files analysis

2016-06-28 Thread Avi Steiner
Thanks Erick. I don't want to disable the phrase search option. I just wonder if there is any way I can find terms within the index, and thought the pos file analysis might be a direction. I suspect that our index is full of long float numbers (e.g. 1234.4546786585899544), which may be unnecessary. Be

Re: looking for documentation on solr.JapaneseTokenizerFactory

2016-06-28 Thread Micheal Cooper
The very cool people at Atilika, the company that donates the JapaneseTokenizer to Lucene and Solr, just sent me a great slidedeck that you should see if you are interested in Japanese search: https://speakerdeck.com/atilika/japanese-linguistics-in-lucene-and-solr Micheal On 2016/06/28, 17:03,

RE: limit stored field size

2016-06-28 Thread Avi Steiner
As far as I remember it didn't work. I used DefaultSolrHighlighter (because the improved ones, such as PostingsSolrHighlighter, require indexing for sure) and it used only indexed fields, but I'll retry. -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Tuesday, Ju

Re: looking for documentation on solr.JapaneseTokenizerFactory

2016-06-28 Thread Micheal Cooper
Very nice. Thank you. My non-Japanese devs had set Solr to use CJK for indexing and Whitespace Tokenizer for search, which does not work at all because Japanese does not use whitespace. I was able to find settings that seem to be working well. For reference for other knowledge-seekers: I cont

Re: looking for documentation on solr.JapaneseTokenizerFactory

2016-06-28 Thread Alexandre Rafalovitch
Have you seen http://discovery-grindstone.blogspot.com.au/ ? It is a series of articles on setting up CJK for library content. Regards, Alex. Newsletter and resources for Solr beginners and intermediates: http://www.solr-start.com/ On 28 June 2016 at 10:59, Micheal Cooper wrote: > I hav

Solr Post Filter: Mixing frange and Boolean Queries

2016-06-28 Thread Vasu Y
Hi, I am trying to apply a post filter by using frange and in the same filter, I also need to use boolean queries. When I try this, I get Syntax error. Is there a way to achieve this without writing my own Post Filter Class? Here is what the query & fq looks like: q=Name:test* fq={!frange l=0 u=0