Re: problem with solr auto add core after restart

2015-10-30 Thread sara hajili
No error occurred, at least none that I can see in the Solr log. Solr writes nothing to SOLR_HOME/LOGS. What can I do now? On Thu, Oct 29, 2015 at 1:49 PM, Erick Erickson wrote: > What errors, if any, do you see in the Solr logs? The information here > isn't > enough to say much. > > Best, > Erick > > On Thu, O

Re: Fastest way to import a giant word list into Solr/Lucene?

2015-10-30 Thread Robert Oschler
Thanks Walter. I believe I have what I need now. Have a great weekend. On Fri, Oct 30, 2015 at 11:13 PM, Walter Underwood wrote: > Read the links I have sent. > > wunder > Walter Underwood > wun...@wunderwood.org > http://observer.wunderwood.org/ (my blog) > > > > On Oct 30, 2015, at 7:10 PM

Re: Fastest way to import a giant word list into Solr/Lucene?

2015-10-30 Thread Walter Underwood
Read the links I have sent. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On Oct 30, 2015, at 7:10 PM, Robert Oschler wrote: > > Thanks Walter. Are there any open source spell checkers that implement the > Peter Norvig or Damerau-Levenshtein algori

Re: restore quorum after majority of zk nodes down

2015-10-30 Thread Pushkar Raste
We need to bounce it, but the outage will be very short and you don't have to take down the rest of the ZooKeeper instances. On 30 October 2015 at 11:00, Daniel Collins wrote: > Aren't you asking for dynamic ZK configuration which isn't supported yet > (ZOOKEEPER-107, only in 3.5.0-alpha)? How do you s

Re: Fastest way to import a giant word list into Solr/Lucene?

2015-10-30 Thread Robert Oschler
Thanks Walter. Are there any open source spell checkers that implement the Peter Norvig or Damerau-Levenshtein algorithms? I'm short on time so I have to keep the custom coding down to a minimum. On Fri, Oct 30, 2015 at 8:02 PM, Walter Underwood wrote: > Dedicated spell-checkers have better a

Re: Problem with the Content Field during Solr Indexing

2015-10-30 Thread Shruti Mundra
Hi Edwin, The file extension of the image file is ".png" and we are following this url for indexing: " http://blog.thedigitalgroup.com/vijaym/wp-content/uploads/sites/11/2015/07/SolrImageExtract.png " Thanks and Regards, Shruti Mundra On Thu, Oct 29, 2015 at 8:33 PM, Zheng Lin Edwin Yeo wrote:

Re: Sort not working as expected

2015-10-30 Thread Erick Erickson
bq: Is there no way that the existing field can be used? In a word, "no". The indexed terms are being used for sorting. You have a document that has the title "aardvark zebra". The actual _tokens_ are aardvark zebra solr/Lucene has no way of knowing whether these should be sorted by "a" or "z".

Re: Fastest way to import a giant word list into Solr/Lucene?

2015-10-30 Thread Walter Underwood
Dedicated spell-checkers have better algorithms than Solr. They usually handle transposed characters as well as inserted, deleted, or substituted characters. This is an enhanced version of Levenshtein distance. It is called Damerau-Levenshtein and is too expensive to use in Solr search. Spell c
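
For readers unfamiliar with the distance mentioned above, here is a minimal Python sketch of the restricted Damerau-Levenshtein distance (also called optimal string alignment). It is an illustration only, not what aspell or Solr actually run; the function name `osa_distance` is made up for this example. It counts insertions, deletions, substitutions, and transpositions of two adjacent characters.

```python
def osa_distance(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein (optimal string alignment) distance."""
    rows, cols = len(a) + 1, len(b) + 1
    # d[i][j] = distance between the first i chars of a and the first j chars of b
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters
    for j in range(cols):
        d[0][j] = j  # insert all j characters
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
            # Transposition of two adjacent characters, e.g. "rl" <-> "lr"
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]
```

Under this definition a swapped pair such as "solr" / "sorl" costs one edit, where plain Levenshtein would count two.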

Re: Fastest way to import a giant word list into Solr/Lucene?

2015-10-30 Thread Robert Oschler
Hello Walter and Mikhail, Thank you for your answers. Do those spell checkers have the same fuzzy matching capability as SOLR/Lucene, or better (Levenshtein, max distance 2)? That's a critical requirement for my application. I take it by your suggestion of these spell checker apps they ca

Re: Solr Keyword query on a specific field.

2015-10-30 Thread davidphilip cherian
>> "Is there any way to have a single field search use the same keyword search logic as the default query?" Do a phrase search, with double quotes surrounding the multiple keywords, it should work. Try q=title:("Test Keywords") You could possibly try adding this q.op as local param to query as sh

Re: Sort not working as expected

2015-10-30 Thread davidphilip cherian
You can create a copy field with string type and make it copy from this existing field, and sort on this new one. That way, you can still continue doing text search on existing one and sort on this new field. On Fri, Oct 30, 2015 at 3:04 PM, Brian Narsi wrote: > Is there no way that the exis
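
A small Python sketch of why sorting needs a single untokenized value per document, which is what the string-typed copy provides. This is an analogy only, not Solr code; the titles and the helper names `tokenize` and `sort_key` are invented for the example, and lowercasing the key is an extra assumption for case-insensitive ordering (a plain Solr string field keeps the original case).

```python
def tokenize(title: str) -> list[str]:
    """Rough stand-in for a tokenized text field: lowercase terms."""
    return title.lower().split()


def sort_key(title: str) -> str:
    """Stand-in for the string copy: one value for the whole title."""
    return title.lower()


titles = ["zebra crossing", "Aardvark zebra", "mango"]

# A tokenized field yields several terms per document, so there is no
# single obvious value to order by ("aardvark" or "zebra"?).
tokens_per_title = {t: tokenize(t) for t in titles}

# The untokenized copy gives exactly one key per document.
sorted_titles = sorted(titles, key=sort_key)
```

Searching would still use the tokenized field; only the ordering uses the whole-string key.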

Re: Using Nutch Segments

2015-10-30 Thread Imtiaz Shakil Siddique
You can check your Solr admin panel. It should be at http://localhost:8983/solr/ From there go --> your Solr core --> Query. Inside the query box type *:* Then Solr will display 10 documents from its index. You can check the fields and their contents. Solr searches in the text field out of the box.

Re: Solr 5.3.1 CREATE defaults to schema-less mode Java version 1.7.0_45

2015-10-30 Thread Upayavira
On Fri, Oct 30, 2015, at 07:03 PM, natasha wrote: > Hi Erick, > > If I just run the following, I have no issue: > > bin/solr start > curl ' > http://localhost:8983/solr/admin/cores?action=CREATE&name=test-core&instanceDir=/home/natasha/twc-session-dash1/collection1 >

Solr Keyword query on a specific field.

2015-10-30 Thread Aaron Gibbons
Is there any way to have a single field search use the same keyword search logic as the default query? I define q.op as AND in my query which gets applied to any main keywords but any keywords I'm trying to use within a field do not get the same logic applied. Example: q=(title:(Test Keywords)) the

Re: Fastest way to import a giant word list into Solr/Lucene?

2015-10-30 Thread Mikhail Khludnev
Perhaps FileBasedSpellChecker https://cwiki.apache.org/confluence/display/solr/Spell+Checking On Fri, Oct 30, 2015 at 9:37 PM, Robert Oschler wrote: > Hello everyone, > > I have a gigantic list of industry terms that I want to import into a > Solr/Lucene instance running on an AWS box. What is

Re: Sort not working as expected

2015-10-30 Thread Brian Narsi
Is there no way that the existing field can be used? On Fri, Oct 30, 2015 at 1:42 PM, Ray Niu wrote: > you should use string type instead of text if you want to sort > alphabetically > > 2015-10-30 11:12 GMT-07:00 Brian Narsi : > > > I have a fieldtype setup as > > > > positionIncrementGap= >

Re: Solr 5.3.1 CREATE defaults to schema-less mode Java version 1.7.0_45

2015-10-30 Thread natasha
Hi Erick, If I just run the following, I have no issue: bin/solr start curl ' http://localhost:8983/solr/admin/cores?action=CREATE&name=test-core&instanceDir=/home/natasha/twc-session-dash1/collection1

Re: Sort not working as expected

2015-10-30 Thread Ray Niu
you should use string type instead of text if you want to sort alphabetically 2015-10-30 11:12 GMT-07:00 Brian Narsi : > I have a fieldtype setup as > > "100"> "solr.StandardTokenizerFactory"/> "solr.LowerCaseFilterFactory"/> minGramSize="3" maxGramSize="25"/> < > tokenizer class="solr.Sta

Re: Fastest way to import a giant word list into Solr/Lucene?

2015-10-30 Thread Walter Underwood
Is there some reason that you don’t want to use aspell with a custom dictionary? Lucene and Solr are pretty weak compared to purpose-built spelling checkers. http://aspell.net/ Also, consider the Peter Norvig spell corrector approach. With a fixed list, it is blazing fast.
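
As a rough illustration of the Norvig-style approach with a fixed word list, the Python sketch below generates every string within one edit of the input and intersects it with the vocabulary, falling back to two edits. It is a simplified sketch: the names `edits1` and `suggest` and the sample vocabulary are assumptions for this example, and Norvig's original also ranks candidates by word frequency, which is omitted here.

```python
import string


def edits1(word: str, alphabet: str = string.ascii_lowercase) -> set[str]:
    """All strings one delete, transpose, replace, or insert away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:] for left, right in splits if right
                for c in alphabet]
    inserts = [left + c + right for left, right in splits for c in alphabet]
    return set(deletes + transposes + replaces + inserts)


def suggest(word: str, vocabulary: set[str]) -> set[str]:
    """Known words at the smallest edit distance (0, 1, or 2) from word."""
    if word in vocabulary:
        return {word}
    one_edit = edits1(word) & vocabulary
    if one_edit:
        return one_edit
    # Fall back to two edits; still just set lookups against the fixed list.
    return {e2 for e1 in edits1(word) for e2 in edits1(e1)} & vocabulary


# Hypothetical industry-term list, loaded once into a set.
vocabulary = {"tokenizer", "shard", "replica"}
```

Because the vocabulary is a set, each candidate check is a constant-time lookup, which is where the speed with a fixed list comes from.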

Fastest way to import a giant word list into Solr/Lucene?

2015-10-30 Thread Robert Oschler
Hello everyone, I have a gigantic list of industry terms that I want to import into a Solr/Lucene instance running on an AWS box. What is the fastest way to import the list into my Solr/Lucene instance? I have admin/sudo privileges on the box. Also, is there a document that shows me how to set

Sort not working as expected

2015-10-30 Thread Brian Narsi
I have a fieldtype setup as < tokenizer class="solr.StandardTokenizerFactory"/> When I sort on this field type in ascending order I am not getting results sorted alphabetically as expected. Why is that? What should I do to get the sort on? Thanks

Re: Solr 5.3.1 CREATE defaults to schema-less mode Java version 1.7.0_45

2015-10-30 Thread natasha
Hi Erick, Thanks for your help. I am fairly new to Solr. I'm not set on using SolrCloud. No need for ZooKeeper or multiple leader nodes. What I have is an existing instanceDir (with a conf and data directory, with all requisite components) and I would like to create a new core based on this pree

Re: Question on index time de-duplication

2015-10-30 Thread shamik
Thanks for your reply. Have you customized SignatureUpdateProcessorFactory or are you using the configuration out of the box ? I know it works for simple dedup, but my requirement is tad different as I need to tag an identifier to the latest document. My goal is to understand if that's possible usi

RE: Question on index time de-duplication

2015-10-30 Thread shamik
Thanks Markus. I've been using field collapsing till now but the performance constraint is forcing me to think about index time de-duplication. I've been using a composite router to make sure that duplicate documents are routed to the same shard. Won't that work for SignatureUpdateProcessorFactory

Re: Question on index time de-duplication

2015-10-30 Thread shamik
Thanks Scott. I could use field collapsing directly on the adskdedup field without the signature field. The problem with field collapsing is the performance overhead: it slows the query down tenfold. CollapsingQParserPlugin is a better option; unfortunately, it doesn't support an ngroups equivalent, which

Re: growth of tlog

2015-10-30 Thread Shawn Heisey
On 10/30/2015 9:46 AM, Rallavagu wrote: > Also, this affects available physical memory as tlog continues to grow > and it is memory mapped. I think this is a common misconception. MMAP does *not* use up physical memory, at least not in the detrimental way your sentence suggests. Any memory (OS d

Re: growth of tlog

2015-10-30 Thread Rallavagu
On 10/30/15 8:39 AM, Erick Erickson wrote: I infer that this statement: "takes a while to recover before cloud becomes green" indicates that the node is in recovery or something while indexing. If you're still indexing, the new documents will be written to the followers tlog while the follower

Re: growth of tlog

2015-10-30 Thread Erick Erickson
I infer that this statement: "takes a while to recover before cloud becomes green" indicates that the node is in recovery or something while indexing. If you're still indexing, the new documents will be written to the followers tlog while the follower is recovering, leading to it growing. I expect

growth of tlog

2015-10-30 Thread Rallavagu
4.10.4 solr cloud, 3 zk quorum, jdk 8 autocommit: 15 sec, softcommit: 2 min Under heavy indexing load with above settings, i have seen tlog growing (into GB). After the updates stopped coming in, it settles down and takes a while to recover before cloud becomes "green". With 15 second autoco

Re: restore quorum after majority of zk nodes down

2015-10-30 Thread Daniel Collins
Aren't you asking for dynamic ZK configuration which isn't supported yet (ZOOKEEPER-107, only in 3.5.0-alpha)? How do you swap a zookeeper instance from being an observer to a voting member? On 30 October 2015 at 09:34, Matteo Grolla wrote: > Pushkar... I love this solution > thanks >

Re: Securing field level access permission by filtering the query itself

2015-10-30 Thread Douglas McGilvray
Scott thanks for the reply. I like the idea of mapping all the fieldnames internally, adding security through obscurity. My question therefore would be what is the definitive list of query parameters that one must filter to ensure a particular field is not exposed in the query response? Am I mi

Re: How to get values of external file field(s) in Solr query?

2015-10-30 Thread chitrapatel
I have implemented ExternalFileField in Solr, but I am not able to write values into the external file "external_BestSellerTest", which is located in the ~\data directory inside the Solr core. My application and the Solr server are physically separated, in two places. The application will calculate a score and generat

Re: SolrJ stalls/hangs on client.add(); and doesn't return

2015-10-30 Thread Erick Erickson
Glad you can solve it one way or the other. I do wonder, though what's really going on, the fact that your original case just hung is kind of disturbing. 50K is still a lot, and Yonik's comment is well taken. I did some benchmarking (not ConcurrentUpdateSolrServer, HttpSolrClient as I remember) an

Re: SolrJ stalls/hangs on client.add(); and doesn't return

2015-10-30 Thread Susheel Kumar
Just a suggestion, Markus: sending 50k documents worked in your case, but you may want to benchmark batches of 5k, 10k, or 20k and compare them with batches of 50k. It may turn out that a smaller batch size is faster than a very big one... On Fri, Oct 30, 2015 at 7:59 AM,
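
The batching idea discussed in this thread can be sketched in a few lines of Python. This is a language-neutral illustration, not SolrJ code: the generator `batches` is a made-up helper, and the actual call that sends each batch to Solr is left out because it depends on the client in use.

```python
from itertools import islice
from typing import Iterable, Iterator


def batches(docs: Iterable, size: int) -> Iterator[list]:
    """Yield consecutive lists of at most `size` documents."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(docs)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


# Example: 2,500 placeholder documents split into batches of 1,000.
docs = ({"id": str(n)} for n in range(2500))
batch_sizes = [len(batch) for batch in batches(docs, 1000)]
```

To compare batch sizes as suggested, one could time the same document set with `size` set to 1,000, 5,000, 10,000, and so on, sending each yielded list in a single add request.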

Re: SolrJ stalls/hangs on client.add(); and doesn't return

2015-10-30 Thread Yonik Seeley
On Thu, Oct 29, 2015 at 5:28 PM, Erick Erickson wrote: > Try making batches of 1,000 docs and sending them through instead. The other thing about ConcurrentUpdateSolrClient is that it will create batches itself while streaming. For example, if you call add a number of times very quickly, those w

RE: SolrJ stalls/hangs on client.add(); and doesn't return

2015-10-30 Thread Markus Jelsma
Hi - Solr doesn't seem to receive anything, and it certainly doesn't log anything, nothing is running out of memory. Indeed, i was clearly misunderstanding ConcurrentUpdateSolrClient. I hoped, without reading its code, it would partition input, which it clearly doesn't. I changed the code to pa

RE: Question on index time de-duplication

2015-10-30 Thread Markus Jelsma
Hello - keep in mind that both SignatureUpdateProcessorFactory and field collapsing do not work in distributed search unless you map identical signatures to identical shards. Markus -Original message- > From:Scott Stults > Sent: Friday 30th October 2015 11:58 > To: solr-user@lucene.apa

Re: org.apache.solr.common.SolrException: Document is missing mandatory uniqueKey field: id

2015-10-30 Thread Mikhail Khludnev
Try to remove http://wiki.apache.org/solr/SchemaXml#The_Unique_Key_Field On Fri, Oct 30, 2015 at 12:51 PM, fabigol wrote: > Hi, > great thank for your replies. > I undated that you said me the same thing. Is it right? > In some record, it is missing the id_tiers? > > I have a question, how is it

Re: Performance degradation with two collection on same Solr instance

2015-10-30 Thread SolrUser1543
We have 100 GB of RAM on each machine, with 20 GB for the heap. The index size of the big collection is 130 GB. The new second collection has only a few documents, only a few MB. When we disabled the new cores, performance improved. Both collections use the same solr.config, so they have the same filter configurati

Re: Question on index time de-duplication

2015-10-30 Thread Scott Stults
At the top of the De-Duplication wiki page is a note about collapsing results. Once you have the signature (identical for each of the duplicates) you'll want to collapse your results, keeping the one with max date. https://cwiki.apache.org/confluence/display/solr/Collapse+and+Expand+Results k/r,
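
The intended outcome (one document per signature, the one with the latest date) can be illustrated outside Solr with a short Python sketch. The field names `signature` and `date` and the sample documents are assumptions for this example; in Solr itself the collapsing would be done by the query, not by client code.

```python
def latest_per_signature(docs: list[dict]) -> list[dict]:
    """Keep, for each signature, only the document with the greatest date."""
    best: dict[str, dict] = {}
    for doc in docs:
        sig = doc["signature"]
        if sig not in best or doc["date"] > best[sig]["date"]:
            best[sig] = doc
    return list(best.values())


# Hypothetical documents; ISO-8601 date strings compare correctly as text.
docs = [
    {"id": "a1", "signature": "s1", "date": "2015-10-01"},
    {"id": "a2", "signature": "s1", "date": "2015-10-30"},
    {"id": "b1", "signature": "s2", "date": "2015-09-15"},
]
kept_ids = sorted(d["id"] for d in latest_per_signature(docs))
```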

Re: Performance degradation with two collection on same Solr instance

2015-10-30 Thread Toke Eskildsen
On Tue, 2015-10-27 at 12:12 -0700, SolrUser1543 wrote: > The question is , how Solr manages its resources when it has more than one > core ? Does it need twice memory ? Or this degradation might be a > coincidence ? There is an overhead for each core, but not much. You should not notice any pe

Re: Securing field level access permission by filtering the query itself

2015-10-30 Thread Scott Stults
Douglas, Managing a per-user-group whitelist of fields outside of Solr seems the best approach. When the query comes in you can then filter out any fields not contained in the whitelist before you send the request to Solr. The easy part will be to do that on URL parameters like fl. Depending on ho
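
The "easy part" described above, filtering the fl parameter against a whitelist before the request reaches Solr, might look like the following Python sketch. The function name `restrict_fl` and the sample fields are hypothetical. The sketch handles only plain field names in fl; it does not cover aliases, globs, function queries, or any other parameter that can expose field values.

```python
import re
from urllib.parse import parse_qsl, urlencode


def restrict_fl(query_string: str, allowed: set[str]) -> str:
    """Rewrite a query string so that fl lists only whitelisted fields."""
    requested: list[str] = []
    others: list[tuple[str, str]] = []
    for key, value in parse_qsl(query_string, keep_blank_values=True):
        if key == "fl":
            # fl may be comma- or space-separated and may be repeated
            requested.extend(f for f in re.split(r"[,\s]+", value) if f)
        else:
            others.append((key, value))
    kept = [f for f in requested if f in allowed]
    if not kept:
        # No fl given, or nothing permitted (e.g. fl=*): fall back to the
        # whole whitelist rather than to Solr's default of all stored fields.
        kept = sorted(allowed)
    others.append(("fl", ",".join(kept)))
    return urlencode(others)


# Hypothetical whitelist for one user group.
allowed_fields = {"id", "title"}
rewritten = restrict_fl("q=*:*&fl=id,title,salary", allowed_fields)
```

Always emitting an explicit fl, even when the client sent none, is the design choice that keeps unlisted fields out of the response by default.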

Re: Performance degradation with two collection on same Solr instance

2015-10-30 Thread Jan Høydahl
You say you configure 20Gb heap. What is your total physical RAM on the host? What are your cache sizes for the two collections? If you have too high cache settings you may eat too much memory.. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com > 27. okt. 2015 kl. 20.12 s

Re: org.apache.solr.common.SolrException: Document is missing mandatory uniqueKey field: id

2015-10-30 Thread fabigol
Hi, many thanks for your replies. I understand that you both told me the same thing. Is that right? Is id_tiers missing in some records? I have a question: how is it possible that the mapping does not work? I'm going to check my data (response request). Must the field id_fields be non-empty? B

Re: restore quorum after majority of zk nodes down

2015-10-30 Thread Matteo Grolla
Pushkar... I love this solution thanks I'd just go with 3 zk nodes on each side 2015-10-29 23:46 GMT+01:00 Pushkar Raste : > How about having let's say 4 nodes on each side and make one node in one of > data centers a observer. When data center with majority of the nodes go > down, bounce t

Re: [Help]Solr_Not_Responding

2015-10-30 Thread Modassar Ather
The information given is not sufficient to conclude a cause. You can check the solr logs for details for any exception. Regards, Modassar On Fri, Oct 30, 2015 at 10:12 AM, Franky Parulian Silalahi < fra...@telunjuk.com> wrote: > I have problem with my solr and i run in centos 7. > sometime my s

[Help]Solr_Not_Responding

2015-10-30 Thread Franky Parulian Silalahi
I have a problem with my Solr, which I run on CentOS 7. Sometimes my Solr is detected as down, but when I check Solr's service, the service is running. How does this happen, and why?