Re: SolrCloud RAM requirements

2014-09-25 Thread Shawn Heisey
On 9/24/2014 2:18 AM, Toke Eskildsen wrote: Norgorn [lsunnyd...@mail.ru] wrote: I have CLOUD with 3 nodes and 16 MB RAM on each. My index is about 1 TB and search speed is awfully bad. We all have different standards with regard to search performance. What is awfully bad and what is good

Using SolrCloud on Amazon EC2

2014-09-25 Thread Timo Schmidt
Hi all, we currently plan to set up a project based on Solr Cloud and Amazon Web Services. Our main search application is deployed using AWS OpsWorks, which works out quite well. Since we also want to provision Solr to EC2, I want to ask for experiences with the different

Help needed in Indexing and Search on xml content

2014-09-25 Thread sangeetha.subraman...@gtnexus.com
Hi Team, I am a newbie to SOLR. I have search fields stored in an XML file, which is stored in MSSQL. I want to index the content of the XML file in SOLR. We need to provide search based on the fields present in the XML file. The reason why we are storing the input details as an XML file is

(auto)suggestions, but only from a filtered set of documents

2014-09-25 Thread Clemens Wyss DEV
What I'd like to do is http://localhost:8983/solr/solrpedia/suggest?q=atm&qf=source:mysource Through qf (or however the parameter shall be called) I'd like to restrict the suggestions to documents which fit the given qf-query. I need this filter if (as posted in a previous thread) I intend to

Re: SolrCloud RAM requirements

2014-09-25 Thread Toke Eskildsen
On Thu, 2014-09-25 at 06:29 +0200, Norgorn wrote: I can't say for sure, cause filter caches are out of the JVM (dat HS), but top shows 5 GB cached and no free RAM. The cached reported from top should be correct, no matter if one used off-heap or not: You have 5GB for (I guess) 300MB index, so

Re: traversing Automaton in lucene 4.10

2014-09-25 Thread Dmitry Kan
case solved, example of traversal found in lucene's source code (pointed to by Mike McCandless): https://github.com/apache/lucene-solr/blob/2836bd99101026860b12233a87e35101769a538f/lucene/core/src/java/org/apache/lucene/util/automaton/Automaton.java#L535 On Fri, Sep 19, 2014 at 5:27 PM, Dmitry

/suggest through SolrJ?

2014-09-25 Thread Clemens Wyss DEV
Am I right that I cannot call /suggest (i.e. the corresponding RequestHandler) through SolrJ? What is the preferred way to call Solr handlers/operations not supported by SolrJ from Java? Through new SolrJ Request-classes?

Turn off suggester

2014-09-25 Thread PeriS
Is there a way to turn off the Solr suggester? I have about 30M records and when Tomcat starts up, it takes a long time (~10 minutes) for the suggester to decompress the data, or it's doing something, as it hangs on SolrSuggester.build(); Any ideas please? Thanks -Peri *** DISCLAIMER *** This

SolrCloud Slow to boot up

2014-09-25 Thread anand.mahajan
Hello all, Hosted a SolrCloud - 6 Nodes - 36 Shards x 3 Replicas each - 108 cores across 6 servers. Moved about 250M documents into this cluster. When I restart this cluster, only the leaders per shard come up live instantly (within a minute) and all the replicas are shown as Recovering on the

Re: Scoring with wildcards

2014-09-25 Thread Jack Krupansky
The wildcard query is “constant score” to make it faster, so unfortunately that means there is no score differentiation between the wildcard matches. You can simply add the wildcard prefix as a separate query term and boost it: q=text:carre* text:carre^1.5 -- Jack Krupansky From: Pigeyre
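The query shape Jack describes can be assembled mechanically. The following is only a minimal sketch using plain string handling, not SolrJ; the class and method names (WildcardBoost, build) are hypothetical, and the field name, prefix and boost are taken from the example in the message.

```java
class WildcardBoost {
    // Builds "<field>:<prefix>* <field>:<prefix>^<boost>": the constant-score
    // wildcard clause plus a boosted clause for the bare prefix term, so that
    // documents containing the prefix itself as a term score higher.
    public static String build(String field, String prefix, double boost) {
        String wildcardClause = field + ":" + prefix + "*";
        String boostedClause = field + ":" + prefix + "^" + boost;
        return wildcardClause + " " + boostedClause;
    }

    public static void main(String[] args) {
        // Reproduces the q value from the message.
        System.out.println(build("text", "carre", 1.5));
    }
}
```

The resulting string would be passed as the q parameter; any escaping of special characters in the prefix is left out of this sketch.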

Re: Help needed in Indexing and Search on xml content

2014-09-25 Thread PeriS
Hi Sangeetha, If you can tell me a little bit more about your setup, I can try and help. If you are on skype, that would be the easiest. Thanks -Peri On Sep 25, 2014, at 3:50 AM, sangeetha.subraman...@gtnexus.com wrote: Hi Team, I am a newbie to SOLR. I have got search fields stored in a

Re: Changed behavior in solr 4 ??

2014-09-25 Thread Jack Krupansky
I am not aware of any such feature! That doesn't mean it doesn't exist, but I don't recall seeing it in the Solr source code. -- Jack Krupansky -Original Message- From: Jorge Luis Betancourt Gonzalez Sent: Wednesday, September 24, 2014 1:31 AM To: solr-user@lucene.apache.org Subject:

point buffer returned as an ellipse, how to configure?

2014-09-25 Thread Mark G
Solr team, I am indexing geographic points in dec degrees lat lon using the location_rpt type in my index. The type is set up like this: <fieldType name="location_rpt" class="solr.SpatialRecursivePrefixTreeFieldType" geo="true" distErrPct="0.025" maxDistErr="0.09" units="degrees" /> my field

Solr stops in between indexing

2014-09-25 Thread madhav bahuguna
Hi, I have Solr configured on a Google Cloud server. Whenever I try to index, it stops partway through and shows an error: connection lost, connection timeout. I have 2200 records; sometimes the full indexing stops at 917, sometimes at 1385, sometimes at 2185. I have apache2 running on Google Cloud on Debian OS.

Re: Help in selecting the appropriate feature to obtain results

2014-09-25 Thread Mikhail Khludnev
I call it 'reverse search' problem (regex indexing). It's almost impossible. You can - do it your own http://blog.mikemccandless.com/2013/06/build-your-own-finite-state-transducer.html - create http://lucene.apache.org/core/4_1_0/memory/org/apache/lucene/index/memory/MemoryIndex.html from the

Setting of Default Boost in Edismax Search Handler

2014-09-25 Thread O. Olson
I have a setup very similar to the /browse handler in the example (http://svn.apache.org/viewvc/lucene/dev/trunk/solr/example/example-DIH/solr/db/conf/solrconfig.xml?view=markup) I am curious if it is possible to set a default boost function (e.g. bf=log(qty)) , so that all query results would

Re: MRIT's morphline mapper doesn't co-locate with data

2014-09-25 Thread Tom Chen
Do you have the solr Jira number for the new ingestion tool? Thanks On Wed, Sep 24, 2014 at 7:57 PM, Wolfgang Hoschek whosc...@cloudera.com wrote: Based on our measurements, Lucene indexing is so CPU intensive that it wouldn’t really help much to exploit data locality on read. The

Solr and hadoop

2014-09-25 Thread Tom Chen
I wonder if Solr has InputFormat and OutputFormat like the EsInputFormat and EsOutputFormat that are provided by Elasticsearch for Hadoop (es-hadoop). Is it possible for Solr to provide such integration with Hadoop? Best, Tom

Re: Changed behavior in solr 4 ??

2014-09-25 Thread Jorge Luis Betancourt Gonzalez
I haven’t used it before this; basically I found out about it in the Solr in Action book, guided by the comment about redefining the default components by defining a new searchComponent with the same name. Anyhow, thanks for your reply! Regards, On Sep 25, 2014, at 8:01 AM, Jack

Re: Solr and hadoop

2014-09-25 Thread Michael Della Bitta
Yes, there's SolrInputDocumentWritable and MapReduceIndexerTool, plus the Morphline stuff (check out https://github.com/markrmiller/solr-map-reduce-example). Michael Della Bitta Applications Developer o: +1 646 532 3062 appinions inc. “The Science of Influence Marketing” 18 East 41st Street

Re: Solr Cloud Default Document Routing

2014-09-25 Thread Erick Erickson
Well, you've picked the absolute worst case for comparison. The increase to double digits is a constant overhead. IOW, let's say your query went from 5ms to 20 ms. That 15 ms is pretty much the additional overhead no matter what the query. This particular query just happens to be very fast in the
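To make the "constant overhead" point concrete, here is a small arithmetic sketch. The 5 ms and 15 ms figures come from Erick's example; the 500 ms query is a hypothetical slower query added purely for contrast, and the class name OverheadShare is made up for illustration.

```java
class OverheadShare {
    // Percentage increase in latency caused by a fixed overhead added
    // to a query that originally took baseMs milliseconds.
    public static double percentIncrease(double baseMs, double overheadMs) {
        return 100.0 * overheadMs / baseMs;
    }

    public static void main(String[] args) {
        double overheadMs = 15.0; // 20 ms - 5 ms, from the example in the message

        // Very fast query: the fixed overhead dominates.
        System.out.println("5 ms query:   +" + percentIncrease(5.0, overheadMs) + "%");

        // Hypothetical slower query: the same overhead is barely visible.
        System.out.println("500 ms query: +" + percentIncrease(500.0, overheadMs) + "%");
    }
}
```

The same 15 ms is a 300% increase on the 5 ms query but only 3% on the hypothetical 500 ms one, which is why a very fast query is the worst case for this comparison.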

Re: SolrCloud Slow to boot up

2014-09-25 Thread Michael Della Bitta
1. What version of Solr are you running? 2. Have you made substantial changes to solrconfig.xml? Michael Della Bitta Applications Developer o: +1 646 532 3062 appinions inc. “The Science of Influence Marketing” 18 East 41st Street New York, NY 10017 t: @appinions

Re: SolrCloud Slow to boot up

2014-09-25 Thread anand.mahajan
1. I've hosted it with Helios v 0.07 that ships with Solr 4.10 2. Change to solrconfig.xml - a. commits every 10 mins b. soft commits every 10 secs c. disabled all caches as the usage is very random (no end users only services doing the searches) and mostly single requests d. use cold

Re: Help needed in Indexing and Search on xml content

2014-09-25 Thread Aman Tandon
Hi, You can retrieve the data in XML format as well as in JSON. You need to learn about schema.xml; in it you define the fields which are present in your XML, which fields you want to search on, etc. So it would be better to take a look at schema.xml; the Solr sample schema could clear most of your doubts.

Re: /suggest through SolrJ?

2014-09-25 Thread Erick Erickson
You can call anything from SolrJ that you can call from a URL. SolrJ has lots of convenience stuff to set particular parameters, parse the response, etc... But in the end it's communicating with Solr via a URL. Take a look at something like SolrQuery for instance. It has a nice command
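As an illustration of the point that every call ends up as a URL plus parameters, here is a minimal sketch that uses only the Java standard library rather than SolrJ. The class and method names (HandlerUrl, build) are hypothetical; the core URL, the /suggest handler and the q=atmo parameter are taken from elsewhere in this thread.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

class HandlerUrl {
    // Joins a core URL, a handler path and URL-encoded request parameters.
    public static String build(String coreUrl, String handler, Map<String, String> params) {
        StringBuilder url = new StringBuilder(coreUrl).append(handler);
        char separator = '?';
        for (Map.Entry<String, String> param : params.entrySet()) {
            url.append(separator)
               .append(encode(param.getKey()))
               .append('=')
               .append(encode(param.getValue()));
            separator = '&'; // every parameter after the first is joined with '&'
        }
        return url.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a charset every Java platform must support.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "atmo");
        System.out.println(build("http://localhost:8983/solr/solrpedia", "/suggest", params));
    }
}
```

SolrJ's convenience classes do the equivalent parameter handling and add response parsing on top, so this sketch is only meant to show what travels over the wire.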

Re: Turn off suggester

2014-09-25 Thread Erick Erickson
Well, tell us more about the suggester configuration, the number of unique terms in the field you're using, what version of Solr, etc. As Hoss says, details matter. Best, Erick On Thu, Sep 25, 2014 at 4:18 AM, PeriS peri.subrahma...@htcinc.com wrote: Is there a way to turn off the solr

Re: Solr stops in between indexing

2014-09-25 Thread Erick Erickson
If it was working fine and suddenly stopped, I have to ask what was the last thing that changed? Frankly it sounds like your network has started having some problems. Best, Erick On Thu, Sep 25, 2014 at 6:29 AM, madhav bahuguna madhav.bahug...@gmail.com wrote: Hi, I have solr configured on

Re: /suggest through SolrJ?

2014-09-25 Thread Shawn Heisey
On 9/25/2014 8:43 AM, Erick Erickson wrote: You can call anything from SolrJ that you can call from a URL. SolrJ has lots of convenience stuff to set particular parameters, parse the response, etc... But in the end it's communicating with Solr via a URL. Take a look at something like

Re: Solr and hadoop

2014-09-25 Thread Tom Chen
I'm aware of the MapReduceIndexerTool (MRIT). That might solve the indexing part -- the OutputFormat part. But what I asked about is more about making Solr index data available to Hadoop MapReduce -- using Solr as a data store, like what HDFS can provide. With a Solr InputFormat, we can make

Solr mapred MTree merge stage ~6x slower in 4.10

2014-09-25 Thread Brett Hoerner
As an update to this thread, it seems my MTree wasn't completely hanging, it was just much slower in 4.10. If I replace 4.9.0 with 4.10 in my jar the MTree merge stage is 6x (or more) slower (in my case, 20 min becomes 2 hours). I hope to bisect this in the future, but the jobs I'm running take a

Re: Solr and hadoop

2014-09-25 Thread Joel Bernstein
Hi Tom, I am not aware of a Solr InputFormat implementation yet. The /export handler, which outputs entire sorted result sets, was designed to support these types of bulk export operations efficiently. I think a Solr InputFormat would be an excellent project to begin working on. Also SOLR-6526 is

Why does the q parameter change?

2014-09-25 Thread eShard
Good afternoon all, I just implemented a phrase search and the parsed query gets changed from "rapid prototyping" to "rapid prototype". I used the Solr analyzer and "prototyping" was unchanged, so I think I ruled out a tokenizer. So can anyone tell me what's going on? Here's the query: q=rapid

Re: Why does the q parameter change?

2014-09-25 Thread eShard
Ok, I think I'm on to something. I omitted this parameter which means it is set to false by default on my text field. I need to set it to true and see what happens... autoGeneratePhraseQueries=true If I'm reading the wiki right, this parameter if true will preserve phrase queries... -- View

Re: Turn off suggester

2014-09-25 Thread Alexandre Rafalovitch
Isn't it one of the Solr components? Can it be just removed from the default chain? Random poking in the dark here. Personal: http://www.outerthoughts.com/ and @arafalov Solr resources and newsletter: http://www.solr-start.com/ and @solrstart Solr popularizers community:

Re: Turn off suggester

2014-09-25 Thread Tomás Fernández Löbbe
The SuggestComponent is not in the default components list. There must be a request handler with this component added explicitly in the solrconfig.xml Tomás On Thu, Sep 25, 2014 at 12:22 PM, Alexandre Rafalovitch arafa...@gmail.com wrote: Isn't it one of the Solr components? Can it be just

Re: Help needed in Indexing and Search on xml content

2014-09-25 Thread Alexandre Rafalovitch
Have a look at data import handler and you'll need to use nested entities. That should get you at least a demo. Then you can decide whether that's good enough. Regards, Alex On 25/09/2014 3:51 am, sangeetha.subraman...@gtnexus.com sangeetha.subraman...@gtnexus.com wrote: Hi Team, I am a

Re: Why does the q parameter change?

2014-09-25 Thread eShard
No, apparently it's the KStemFilter. Should I turn this off at query time? I'll put this in another question... -- View this message in context: http://lucene.472066.n3.nabble.com/Why-does-the-q-parameter-change-tp4161179p4161199.html Sent from the Solr - User mailing list archive at

Best practice for KStemFilter query or index or both?

2014-09-25 Thread eShard
Good afternoon, Here's my configuration for a text field. I have the same configuration for index and query time. Is this valid? What's the best practice for these filters: query, index, or both? For synonyms, I've read conflicting reports on when to use them, but I'm currently changing it over to at

RE: Best practice for KStemFilter query or index or both?

2014-09-25 Thread Markus Jelsma
Hi - most filters should be used on both sides, especially stemmers, accent folding and obviously lowercasing. Synonyms only on one side, depending on how you want to utilize them. Markus -Original message- From: eShard zim...@yahoo.com Sent: Thursday 25th September 2014 22:23 To:
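Why a filter usually belongs on both sides can be shown with a toy model. This is only a sketch under stated assumptions: plain lowercasing stands in for the whole analysis chain (it is not KStemFilter or any Solr class), the "index" is just a set of terms, and all names (BothSidesAnalysis, normalize, index, matches) are hypothetical.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class BothSidesAnalysis {
    // Stand-in for an analysis filter; here it only lowercases the token.
    public static String normalize(String token) {
        return token.toLowerCase(Locale.ROOT);
    }

    // "Index time": every token is normalized before it is stored.
    public static Set<String> index(String... tokens) {
        Set<String> terms = new HashSet<>();
        for (String token : tokens) {
            terms.add(normalize(token));
        }
        return terms;
    }

    // "Query time": the query token is looked up either normalized or raw.
    public static boolean matches(Set<String> indexedTerms, String queryToken, boolean analyzeQuery) {
        String term = analyzeQuery ? normalize(queryToken) : queryToken;
        return indexedTerms.contains(term);
    }

    public static void main(String[] args) {
        Set<String> terms = index("Rapid", "Prototyping");
        // Same filter on both sides: the terms line up.
        System.out.println(matches(terms, "Prototyping", true));
        // Filter only at index time: the raw query token no longer matches.
        System.out.println(matches(terms, "Prototyping", false));
    }
}
```

A stemmer behaves the same way in this respect: if only one side rewrites the token, the stored term and the query term can diverge and the match is lost.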

Re: How does KeywordRepeatFilterFactory help giving a higher score to an original term vs a stemmed term

2014-09-25 Thread Diego Fernandez
The difference comes in the fact that when you query the same form it matches 2 tokens including the less common one. When you query a different form you only match on the more common form. So really you're getting the boost from both the tiny difference in TF*IDF and the extra token that you

Re: point buffer returned as an ellipse, how to configure?

2014-09-25 Thread david.w.smi...@gmail.com
Hi Mark, I asked a follow-up question/observation to your Stackoverflow instantiation of your question. I also wrote the following, which doesn’t yet fit into an answer because I don’t know what problem you are yet experiencing: Some technical details: geo=true|false is an attribute on the

AW: /suggest through SolrJ?

2014-09-25 Thread Clemens Wyss DEV
Thx to you two. Just in case anybody else is trying to do this. The following SolrJ code corresponds to the http request GET http://localhost:8983/solr/solrpedia/suggest?q=atmo of Solr in Action (chapter 10): ... SolrServer server = new HttpSolrServer("http://localhost:8983/solr/solrpedia");