Re: Number of fields in schema.xml and impact on Solr

2015-04-22 Thread Steven White
Thanks Shawn. This is good to know. Steve On Wed, Apr 22, 2015 at 9:00 AM, Shawn Heisey elyog...@elyograg.org wrote: On 4/22/2015 6:33 AM, Steven White wrote: Is there anything I should be taking into consideration if I have a large number of fields in my Solr's schema.xml file? I

Number of fields in schema.xml and impact on Solr

2015-04-22 Thread Steven White
Hi Everyone Is there anything I should be taking into consideration if I have a large number of fields in my Solr's schema.xml file? I will be indexing records into Solr and as I create documents, each document will have between 20-200 fields. However, due to the nature of my data source, the

Re: Checking of Solr Memory and Disk usage

2015-04-22 Thread Zheng Lin Edwin Yeo
I see. I'm running on SolrCloud with 2 replicas, so I guess mine will probably use much more when my system reaches millions of documents. Regards, Edwin On 22 April 2015 at 20:47, Shawn Heisey apa...@elyograg.org wrote: On 4/22/2015 12:11 AM, Zheng Lin Edwin Yeo wrote: Roughly how many

Odp.: Suggester

2015-04-22 Thread LAFK
For the sake of others who would look for the solution and stumble upon this thread, consider sharing. I'd expect Solr to return the whole field; if it's a text block then that's it. @LAFK_PL Original message From: Martin Keller Sent: Wednesday, 22 April 2015 16:36 To:

Re: MLT causing Problems

2015-04-22 Thread Erick Erickson
Anything more informative in the Solr logs? Best, Erick On Wed, Apr 22, 2015 at 2:45 AM, Srinivas Rishindra sririshin...@gmail.com wrote: Hello, I am working on a project in which I have to find similar documents. While implementing it, the following error is occurring. Please let me know

After language detection is enabled, SOLR (5.1) isn't indexing anything

2015-04-22 Thread Angel Todorov
Hi guys, I've enabled language detection in solrconfig.xml: <updateRequestProcessorChain name="langid"> <processor class="org.apache.solr.update.processor.TikaLanguageIdentifierUpdateProcessorFactory"> <lst name="defaults"> <str name="langid.fl">content,title</str> <str

Re: Boolean filter query not working as expected

2015-04-22 Thread Jack Krupansky
A purely negative sub-query is not supported by Lucene - you need to have at least one positive term, such as *:*, at each level of sub-query. Try: ((*:* -(field:V1) AND -(field:V2)) AND -(field:V3)) -- Jack Krupansky On Wed, Apr 22, 2015 at 10:56 AM, Dhutia, Devansh ddhu...@gannett.com wrote:
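The advice above can be sketched in Python as a small query builder that anchors the negative clauses with a positive `*:*` term. The helper name `exclude_values` is hypothetical, and the sketch flattens the nesting into a single level rather than reproducing the generated query exactly.

```python
from urllib.parse import urlencode


def exclude_values(field, values):
    """Build a filter query that excludes every value in `values`.

    Hypothetical helper: the leading *:* gives the sub-query a positive
    clause, so the negative clauses have a document set to subtract from.
    """
    negatives = " ".join(f"-{field}:{value}" for value in values)
    return f"(*:* {negatives})"


# Field and value names are the placeholders used in the thread.
fq = exclude_values("field", ["V1", "V2", "V3"])

# URL-encode the parameters before appending them to a /select request.
params = urlencode({"q": "*:*", "fq": fq})
print(fq)
print(params)
```

Keeping the `*:*` inside the helper means callers cannot accidentally emit a purely negative group.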

Re: Odp.: solr issue with pdf forms

2015-04-22 Thread Erick Erickson
Are they not _indexed_ correctly or not being displayed correctly? Take a look at admin UI > schema browser > your field and press the load terms button. That'll show you what is _in_ the index as opposed to what the raw data looked like. When you return the field in a Solr search, you get a verbatim,

Re: Document Created Date

2015-04-22 Thread Eric Meisler
Sorry if my question was too vague. In my mind it wasn't but you led me in the right direction which gave me a new issue. I added the following to my schema.xml to bring back the Created Date: <field name="created" type="date" indexed="false" stored="true"/> but now I am getting back the created

Re: Number of fields in schema.xml and impact on Solr

2015-04-22 Thread Shawn Heisey
On 4/22/2015 6:33 AM, Steven White wrote: Is there anything I should be taking into consideration if I have a large number of fields in my Solr's schema.xml file? I will be indexing records into Solr and as I create documents, each document will have between 20-200 fields. However, due to

Solr Error Message ShutDown

2015-04-22 Thread EXTERNAL Taminidi Ravi (ETI, AA-AS/PAS-PTS)
Hi, We are having an issue with our PROD environment, and it shows the message below when we access Solr using a browser: HTTP Status 503 - Server is shutting down or failed to initialize type Status report message Server is shutting down or failed to initialize

Re: Checking of Solr Memory and Disk usage

2015-04-22 Thread Shawn Heisey
On 4/22/2015 12:11 AM, Zheng Lin Edwin Yeo wrote: Roughly how many collections and how many records do you have in your Solr? I have 8 collections with a total of roughly 227000 records, most of which are CSV records. One of my collections has 142000 records. The core that shows 82MB for

Getting error while searching meaningless words

2015-04-22 Thread Eray Ince
Hello There, We are using hybris with SOLR (4.6.1) I checked the https://issues.apache.org/jira/browse/SOLR-6563 and saw that problem has been solved. However we are still getting same problem on standalone server. There is no problem on embedded server. Is there any idea? You can find log file

Re: Odp.: phraseFreq vs sloppyFreq

2015-04-22 Thread Dmitry Kan
LAFK, Yes, or even more, than 1k. Based on sloppyFreq component (hopefully, same as phraseFreq) we get documents where keywords occur near each other ranked higher. As if we used slop=10 or something. On Wed, Apr 22, 2015 at 2:59 PM, LAFK tomasz.bo...@gmail.com wrote: Out of curiosity, why

Highlighting in Solr

2015-04-22 Thread Zheng Lin Edwin Yeo
Hi, I'm currently implementing highlighting on my Solr-5.0.0. When I issue the following command: http://localhost:8983/solr/collection1/select?q=conducted http://localhost:8983/solr/edmtechnical/select?q=conducted&hl=true&hl.fl=Content,Summary&wt=json&indent=true&rows=10, the highlighting result is
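For reference, a minimal Python sketch of assembling the highlighting request from its individual parameters, so that the separators between them cannot get lost. The base URL uses the `collection1` name from the message; whether that or `edmtechnical` is the real collection is an assumption.

```python
from urllib.parse import urlencode

# Assumed base URL; the message mentions both collection1 and edmtechnical.
base = "http://localhost:8983/solr/collection1/select"

params = {
    "q": "conducted",
    "hl": "true",                # switch highlighting on
    "hl.fl": "Content,Summary",  # fields to highlight
    "wt": "json",
    "indent": "true",
    "rows": 10,
}

# safe="," keeps the comma in hl.fl unescaped so the URL stays readable.
url = base + "?" + urlencode(params, safe=",")
print(url)
```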

Boolean filter query not working as expected

2015-04-22 Thread Dhutia, Devansh
I have an automated filter query builder that uses the SolrNet nuget package to build out boolean filters. I have a scenario where it is generating a fq in the following format: ((-(field:V1) AND -(field:V2)) AND -(field:V3)) The filter looks legal to me (albeit with extra parentheses), but the

Re: Boolean filter query not working as expected

2015-04-22 Thread Dhutia, Devansh
If I upgrade to using the edismax parser in my fq, I get the desired results. The default lucene parser on fq must not be able to parse the more complex nested clauses q=*:*&fq={!type=edismax}((-(field:V1) AND -(field:V2)) AND -(field:V3)) - Works On 4/22/15, 3:27 PM, Dhutia, Devansh

no subject

2015-04-22 Thread Bill Tsay
On 4/22/15, 7:36 AM, Martin Keller martin.kel...@unitedplanet.com wrote: OK, I found the problem and as so often it was sitting in front of the display. Now the next problem: The suggestions returned consist always of a complete text block where the match was found. I would have expected a

Re: Bad contentType for search handler :text/xml; charset=UTF-8

2015-04-22 Thread Walter Underwood
text/xml is not a safe content-type, because of the way that HTTP handles charsets. Always use application/xml. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) On Apr 22, 2015, at 3:01 AM, bengates benga...@aliceadsl.fr wrote: Looks like Solarium

Re: Boolean filter query not working as expected

2015-04-22 Thread Dhutia, Devansh
I don’t know if that’s completely true, or maybe I’m misunderstanding something. If it doesn’t support purely negative subqueries, this shouldn't work, but does: q=*:*&fq=(-(field:V1)) However, for me, the following is a summary of what works and what doesn’t. q=*:*&fq=(-(field:V1))

Re: solr issue with pdf forms

2015-04-22 Thread Dan Davis
Steve, Are you using ExtractingRequestHandler / DataImportHandler or extracting the text content from the PDF outside of Solr? On Wed, Apr 22, 2015 at 6:40 AM, steve.sch...@t-systems.com wrote: Hi guys, hopefully you can help me with my issue. We are using a solr setup and have the

Re: Odp.: solr issue with pdf forms

2015-04-22 Thread Dan Davis
+1 - I like Erick's answer. Let me know if that turns out to be the problem - I'm interested in this problem and would be happy to help. On Wed, Apr 22, 2015 at 11:11 AM, Erick Erickson erickerick...@gmail.com wrote: Are they not _indexed_ correctly or not being displayed correctly? Take a

Re: Suggester

2015-04-22 Thread Martin Keller
OK, I found the problem and as so often it was sitting in front of the display. Now the next problem: The suggestions returned consist always of a complete text block where the match was found. I would have expected a single word or a small phrase. Thanks in advance Martin On 22.04.2015 at

Re: Bad contentType for search handler :text/xml; charset=UTF-8

2015-04-22 Thread didier deshommes
A similar problem seems to happen when sending application/json to the search handler. Solr returns a NullPointerException for some reason: vagrant@precise64:~/solr-5.1.0$ curl "http://localhost:8983/solr/gettingstarted/select?wt=json&indent=true&q=foundation" -H "Content-type:application/json" {

Re: Document Created Date

2015-04-22 Thread Erick Erickson
The generic problem with all the semi-structured documents is that the meta-data has no consistent naming. Making up names here, but Word might have created_on, PDF created etc. It's really frustrating, but each type has to be investigated to figure out which field you want to map to created. Tika
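The per-type investigation described above could end up as a small lookup table. The following Python sketch is illustrative only: the key names `created_on` and `created` are the made-up examples from the message, and `normalize_created` is a hypothetical helper, not part of Solr or Tika.

```python
# Hypothetical metadata key per document type, echoing the made-up
# names in the message; real names must be checked for each format.
CREATED_KEYS = {
    "word": "created_on",
    "pdf": "created",
}


def normalize_created(doc_type, metadata):
    """Return {'created': value} if this type's creation key is present."""
    key = CREATED_KEYS.get(doc_type)
    if key is None or key not in metadata:
        return {}  # unknown type or missing key: nothing to map
    return {"created": metadata[key]}


word_doc = normalize_created("word", {"created_on": "2015-04-22T09:00:00Z"})
unknown = normalize_created("odt", {"creation-date": "2015-04-22T09:00:00Z"})
print(word_doc, unknown)
```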

Re: Solr Error Message ShutDown

2015-04-22 Thread Erick Erickson
What version of Solr? And do the Solr logs show anything useful? Or catalina.out? Best, Erick On Wed, Apr 22, 2015 at 7:23 AM, EXTERNAL Taminidi Ravi (ETI, AA-AS/PAS-PTS) external.ravi.tamin...@us.bosch.com wrote: Hi, We are having an issue with our PROD environment, and it shows the message below

Re: Odp.: Suggester

2015-04-22 Thread Erick Erickson
Right, this is what the suggester you're using is built for. Which is actually way cool for certain situations. Try the FreeTextLookupFactory (warning, I'm not too familiar with the nuances here) Or maybe spelling suggestions are more what you're looking for which look at the terms and return a

Re: Checking of Solr Memory and Disk usage

2015-04-22 Thread Zheng Lin Edwin Yeo
Roughly how many collections and how many records do you have in your Solr? I have 8 collections with a total of roughly 227000 records, most of which are CSV records. One of my collections has 142000 records. Regards, Edwin On 22 April 2015 at 13:49, Shawn Heisey apa...@elyograg.org wrote:

Suggestion in Solr Cloud

2015-04-22 Thread Swaraj Kumar
Hi All, I want to use suggest option in solr but my SOLR is in cloud mode hence to get the suggestion every time in query I need to provide shard url with it like below:-

Re: Solr 4.10.x regression in map-reduce contrib

2015-04-22 Thread Shenghua(Daniel) Wan
I got the same issue when using 4.10.2. I suspected this issue would cause trouble when using too many reducers. Then I tried to use fewer reducers, and made it work. I do not think map-reduce contrib in this version is stable... Anyway it is free. On Tue, Apr 21, 2015 at 10:56 PM, ralph tice

Re: Complete list of field type that Solr supports

2015-04-22 Thread Chris Hostetter
: To be clear, here is an example of a type from Solr's schema.xml: : : <field name="weight" type="float" indexed="true" stored="true"/> : : Here, the type is float. I'm looking for the complete list of : out-of-the-box types supported. what you are asking about are just symbolic names that come

Re: Complete list of field type that Solr supports

2015-04-22 Thread Chris Hostetter
: I'm confused. If type="float" is just a symbolic name, how does Solr know : to index the data of field weight as float? What about for date per : this example: : : <field name="last_modified" type="date" indexed="true" stored="true"/> : : How does Solr apply date-range queries such as: because

Complete list of field type that Solr supports

2015-04-22 Thread Steven White
Hi Everyone, I Googled for this with no luck. Where can I find a complete list of field types that Solr supports? In the sample schema.xml that comes with Solr 5 and prior versions, I am able to compile a list such as boolean, float, string, etc. but I cannot find a complete list documented

Re: Complete list of field type that Solr supports

2015-04-22 Thread Steven White
Hi Hoss, I'm confused. If type="float" is just a symbolic name, how does Solr know to index the data of field weight as float? What about for date per this example: <field name="last_modified" type="date" indexed="true" stored="true"/> How does Solr apply date-range queries such as:

Re: Boolean filter query not working as expected

2015-04-22 Thread Chris Hostetter
1) https://lucidworks.com/blog/why-not-and-or-and-not/ 2) use debug=query to understand how your (filter) query is being parsed. : Date: Wed, 22 Apr 2015 14:56:22 + : From: Dhutia, Devansh ddhu...@gannett.com : Reply-To: solr-user@lucene.apache.org : To: solr-user@lucene.apache.org

Re: Complete list of field type that Solr supports

2015-04-22 Thread Steven White
I got it now. I have to start from <fieldType/> to create my <field/> list. If I want a list of supported field-types (used in my schema.xml), I have to look at the class attribute of <fieldType/> to get that list. The out-of-the-box list of field-types is documented in the link you provided:
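The lookup described above, from a field's symbolic type name to the class declared on the matching fieldType, can be sketched in Python. The schema fragment below is a made-up example, not the poster's schema; `solr.TrieFloatField` and `solr.TrieDateField` are classes that ship with Solr 4.x/5.x.

```python
import xml.etree.ElementTree as ET

# Made-up schema fragment for illustration only.
SCHEMA = """
<schema name="example" version="1.5">
  <fieldType name="float" class="solr.TrieFloatField"/>
  <fieldType name="date" class="solr.TrieDateField"/>
  <field name="weight" type="float" indexed="true" stored="true"/>
  <field name="last_modified" type="date" indexed="true" stored="true"/>
</schema>
"""

root = ET.fromstring(SCHEMA)

# Symbolic type name -> implementing class, read from each <fieldType/>.
type_to_class = {ft.get("name"): ft.get("class") for ft in root.iter("fieldType")}

# Field name -> implementing class, resolved through the field's type attribute.
field_to_class = {
    f.get("name"): type_to_class[f.get("type")] for f in root.iter("field")
}

for name, cls in field_to_class.items():
    print(name, "->", cls)
```

The type attribute on a field is only a reference; the behaviour comes from the class on the fieldType it points to.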

Re: rq breaks wildcard search?

2015-04-22 Thread Ryan Josal
Awesome thanks! I was on 4.10.2 Ryan On Apr 22, 2015, at 16:44, Joel Bernstein joels...@gmail.com wrote: For your own implementation you'll need to implement the following methods: public Query rewrite(IndexReader reader) throws IOException public void extractTerms(Set<Term> terms) You

Re: rq breaks wildcard search?

2015-04-22 Thread Joel Bernstein
Just confirmed that wildcard queries work with Re-Ranking following SOLR-6323. Joel Bernstein http://joelsolr.blogspot.com/ On Wed, Apr 22, 2015 at 7:26 PM, Joel Bernstein joels...@gmail.com wrote: This should be resolved in https://issues.apache.org/jira/browse/SOLR-6323 . Solr 4.10.3

Re: rq breaks wildcard search?

2015-04-22 Thread Joel Bernstein
This should be resolved in https://issues.apache.org/jira/browse/SOLR-6323. Solr 4.10.3 Joel Bernstein http://joelsolr.blogspot.com/ On Wed, Apr 15, 2015 at 6:23 PM, Ryan Josal rjo...@gmail.com wrote: Using edismax, supplying a rq= param, like {!rerank ...} is causing an

Re: rq breaks wildcard search?

2015-04-22 Thread Joel Bernstein
For your own implementation you'll need to implement the following methods: public Query rewrite(IndexReader reader) throws IOException public void extractTerms(Set<Term> terms) You can review the 4.10.3 version of the ReRankQParserPlugin to see how it implements these methods. Joel Bernstein

RE: Solr Index data lost

2015-04-22 Thread Vijay Bhoomireddy
Just to close this thread - It looks like it’s working fine now. Not sure what mistake I had made last time. But now, the index data is still persistent on the pen drive even after server shutdown and restarting it on a different machine where the pen drive is plugged in. Thanks for all

TIKA OCR not working

2015-04-22 Thread trung.ht
Hi, I want to use Solr to index some scanned documents. After setting up a Solr document with two fields, content and filename, I tried to upload the attached file, but it seems that the content of the file is only \n \n \n. But if I used tesseract from the command line I got the result correctly.

AW: Odp.: solr issue with pdf forms

2015-04-22 Thread Steve.Scholl
Thanks for your answer. Maybe my English is not good enough, what are you trying to say? Sorry I didn't get the point. :-( -Original Message- From: LAFK [mailto:tomasz.bo...@gmail.com] Sent: Wednesday, 22 April 2015 14:01 To: solr-user@lucene.apache.org;

Re: Bad contentType for search handler :text/xml; charset=UTF-8

2015-04-22 Thread Yonik Seeley
On Wed, Apr 22, 2015 at 11:00 AM, didier deshommes dfdes...@gmail.com wrote: curl "http://localhost:8983/solr/gettingstarted/select?wt=json&indent=true&q=foundation" -H "Content-type:application/json" You're telling Solr the body encoding is JSON, but then you don't send any body. We could catch
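The point about the header and the missing body can be illustrated with a small Python sketch that only declares a content type when a body is actually attached. `build_request` is a hypothetical helper and the JSON payload is a placeholder; the requests are constructed but never sent.

```python
import json
from urllib.request import Request

url = "http://localhost:8983/solr/gettingstarted/select?wt=json&indent=true&q=foundation"


def build_request(url, body=None):
    """Attach a Content-Type header only when a JSON body is actually sent."""
    if body is None:
        # Plain GET: no body, so no Content-Type is declared.
        return Request(url)
    data = json.dumps(body).encode("utf-8")
    return Request(url, data=data, headers={"Content-Type": "application/json"})


get_req = build_request(url)
post_req = build_request(url, {"query": "foundation"})  # placeholder payload

# urllib stores header names capitalized as "Content-type".
print(get_req.get_method(), get_req.has_header("Content-type"))
print(post_req.get_method(), post_req.has_header("Content-type"))
```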

phraseFreq vs sloppyFreq

2015-04-22 Thread Dmitry Kan
Hi guys. I'm executing the following proximity query: leader the~1000. In the debugQuery I see phraseFreq=0.032258064. Is phraseFreq same thing as sloppyFreq from https://lucene.apache.org/core/4_3_0/core/org/apache/lucene/search/similarities/DefaultSimilarity.html ? Do higher phraseFreq
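As a worked check of the reported number: DefaultSimilarity computes sloppyFreq as 1 / (distance + 1). Assuming the phraseFreq shown comes from a single match (an assumption; several matches would sum their contributions), the value 0.032258064 corresponds to a match distance of 30. A minimal Python sketch of that arithmetic:

```python
def sloppy_freq(distance):
    """Mirror of DefaultSimilarity.sloppyFreq: 1 / (distance + 1)."""
    return 1.0 / (distance + 1)


reported = 0.032258064  # phraseFreq from the debugQuery output in the message

# Invert the formula under the single-match assumption.
implied_distance = round(1.0 / reported) - 1

print(implied_distance)
print(sloppy_freq(implied_distance))
```

Under this reading, a smaller distance between the terms gives a larger contribution, which is why closer occurrences rank higher.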

solr issue with pdf forms

2015-04-22 Thread Steve.Scholl
Hi guys, hopefully you can help me with my issue. We are using a solr setup and have the following issue: - usual pdf files are indexed just fine - pdf files with writable form-fields look like this: Ich�bestätige�mit�meiner�Unterschrift,�dass�alle�Angaben�korrekt�und�vollständig�sind Somehow

Re: Bad contentType for search handler :text/xml; charset=UTF-8

2015-04-22 Thread bengates
Looks like Solarium hardcodes a default header Content-Type: text/xml; charset=utf-8 if none provided. Removing it solves the problem. It seems that Solr 5.1 doesn't support this content-type. -- View this message in context:

MLT causing Problems

2015-04-22 Thread Srinivas Rishindra
Hello, I am working on a project in which I have to find similar documents. While implementing it, the following error is occurring. Please let me know what to do. Exception in thread main org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at

Re: Suggester

2015-04-22 Thread Martin Keller
Unfortunately, setting suggestAnalyzerFieldType to text_suggest didn’t change anything. The suggest dictionary is freshly built. As I mentioned before, only words or phrases of the source field „content“ are not matched. When querying the index, the response only contains „suggestions“ field

Re: Bad contentType for search handler :text/xml; charset=UTF-8

2015-04-22 Thread bengates
Hello, I've got the same issue after an upgrade from Solr 5.0 to 5.1, even on GET requests. Actually I'm using the PHP Solarium library to perform my requests. This is the error the library gets now, on a search handler. The request is transported with cURL. What's weird is when I copy/paste the

Exception while using group with timeAllowed on SolrCloud

2015-04-22 Thread forest_soup
We have the same issue as this JIRA. https://issues.apache.org/jira/browse/SOLR-6156 I have posted my query, response and solr logs to the JIRA. Could anyone please take a look? Thanks! -- View this message in context:

Odp.: phraseFreq vs sloppyFreq

2015-04-22 Thread LAFK
Out of curiosity, why proximity 1k? @LAFK_PL Original message From: Dmitry Kan Sent: Wednesday, 22 April 2015 09:26 To: solr-user@lucene.apache.org Reply-To: solr-user@lucene.apache.org Subject: phraseFreq vs sloppyFreq Hi guys. I'm executing the following proximity query: leader

Odp.: solr issue with pdf forms

2015-04-22 Thread LAFK
Off the top of my head, I'd look into how writable PDFs are created and encoded. @LAFK_PL Original message From: steve.sch...@t-systems.com Sent: Wednesday, 22 April 2015 12:41 To: solr-user@lucene.apache.org Reply-To: solr-user@lucene.apache.org Subject: solr issue with pdf forms Hi guys, hopefully