Selective field query

2015-10-09 Thread Colin Hunter
Hi I am working on a complex search utility with an index created via data import from an extensive MySQL database. There are many ways in which the index is searched. One of the utility input fields searches only on a Service Name. However, if I target the query as q=ServiceName:"Searched

Re: Selective field query

2015-10-09 Thread Upayavira
On Fri, Oct 9, 2015, at 09:54 AM, Colin Hunter wrote: > Hi > > I am working on a complex search utility with an index created via data > import from an extensive MySQL database. > There are many ways in which the index is searched. One of the utility > input fields searches only on a Service

Re: Selective field query

2015-10-09 Thread Colin Hunter
Ah ha... the copy field... makes sense. Thank You. On Fri, Oct 9, 2015 at 10:04 AM, Upayavira wrote: > > > On Fri, Oct 9, 2015, at 09:54 AM, Colin Hunter wrote: > > Hi > > > > I am working on a complex search utility with an index created via data > > import from an

java.util.EmptyStackException during SPLITSHARD

2015-10-09 Thread Oliver Schrenk
Hi, trying to experiment with oversharding on our Solr 4.7.2 cluster and called SPLITSHARD command which after ~30 minutes of work failed with curl "http://solrhost:1234/solr/admin/collections?collection=acme=shard1=SPLITSHARD;

Re: Exclude documents having same data in two fields

2015-10-09 Thread Aman Tandon
Hi, I tried to use the same as mentioned in the url . And I used the description field to check because mapping field is multivalued. So I add the fq={!frange%20l=0%20u=1}strdist(title,description,edit)

Re: Solr Pagination

2015-10-09 Thread Shawn Heisey
On 10/9/2015 1:39 PM, Salman Ansari wrote: > INFO - 2015-10-09 18:46:17.953; [c:sabr102 s:shard1 r:core_node2 > x:sabr102_shard1_replica1] org.apache.solr.core.SolrCore; > [sabr102_shard1_replica1] webapp=/solr path=/select > params={start=0=(content_text:Football)=10} hits=24408 status=0 >

Re: Query Keyword Storage

2015-10-09 Thread Erik Hatcher
There’s no built-in query log handling, other than the (jetty) request logs. More and more these days, folks are logging directly or processing log files back into Solr, in a separate collection, and driving analytics from that. You can do a lot with logstash + banana

RE: which one is faster synonym_edismax & edismax faster?

2015-10-09 Thread Markus Jelsma
Hi - if you run a CPU sampler or profiler you will probably see it doesn't matter. Markus -Original message- > From:Aman Tandon > Sent: Friday 9th October 2015 6:52 > To: solr-user@lucene.apache.org > Subject: which one is faster synonym_edismax edismax

Re: Highlighting tag is not showing occasionally

2015-10-09 Thread Zheng Lin Edwin Yeo
I found that it could be due to the EdgeNGramFilterFactory. This issue didn't happen if I did not apply the EdgeNGramFilterFactory filter for my fieldType. But does anyone know why using the EdgeNGramFilterFactory will cause this problem? Regards, Edwin On 7 October 2015 at 17:46, Zheng Lin

Re: Best Indexing Approaches - To max the throughput

2015-10-09 Thread Alessandro Benedetti
For doing what ? We were talking for best approaches for both the single server infrastructure or cloud one. Cheers On 8 October 2015 at 19:45, Susheel Kumar wrote: > The ConcurrentUpdateSolrClient is not cloud aware or takes zkHostString as > input. So only option is

Query Keyword Storage

2015-10-09 Thread Imtiaz Shakil Siddique
Hi, I'd like to know whether there is any built-in feature/plugin in Solr that can store user queries. I know that I can always check the log files of the Jetty server that ships with Solr to collect user queries. But is there any better way? And if I needed to write a plugin for this case, what

Re: Selective field query

2015-10-09 Thread Erick Erickson
Colin: Adding &debug=all to your query is your friend here, the parsed_query.toString will show you exactly what is searched against. Best, Erick On Fri, Oct 9, 2015 at 2:09 AM, Colin Hunter wrote: > Ah ha... the copy field... makes sense. > Thank You. > > On Fri, Oct 9,

How do I set up custom collection cores?

2015-10-09 Thread espeake
We are installing Alfresco One 5.0.1 with solr4 on a server that has an existing instance of tomcat7. I am trying to find some better documentation on how to setup our cores. In the solr4.xml located at /etc/tomcat7/Catalina/localhost has this inside of it. Then at

Re: [SolrJ] Indexing Java Map into Solr

2015-10-09 Thread Erick Erickson
Hmmm, what does the code look like for Java? One of the cardinal sins of indexing with SolrJ is sending docs one at a time rather than as batches of at least 100 (I usually use 1,000). See: https://lucidworks.com/blog/2015/10/05/really-batch-updates-solr-2/ One technique I often use to chase this
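The batching advice above does not depend on any particular client, so it can be sketched without SolrJ. The following is a minimal Python sketch under stated assumptions: `send_batch` is a hypothetical stand-in for whatever call actually posts a list of documents to Solr, and the default batch size of 1,000 is simply the figure quoted in the message.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def batched(docs: Iterable[dict], size: int = 1000) -> Iterator[List[dict]]:
    """Yield successive lists holding at most `size` documents each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(docs)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def index_all(docs: Iterable[dict],
              send_batch: Callable[[List[dict]], None],
              size: int = 1000) -> int:
    """Send documents in batches and return the number of requests made.

    `send_batch` is hypothetical: it represents one update request to Solr.
    """
    requests = 0
    for chunk in batched(docs, size):
        send_batch(chunk)  # one request per batch instead of one per document
        requests += 1
    return requests
```

With 2,500 documents and a batch size of 1,000, this makes three requests instead of 2,500, which is the saving the message is pointing at.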

[SolrJ] Indexing Java Map into Solr

2015-10-09 Thread Alessandro Benedetti
Hi guys, I was evaluating an Indexer application. This application takes in input a Collection of Objects that are basically Java Maps. This is for covering Solr side a big group of dynamic fields basically and avoid that complexity java side. Let's go to the point, currently the indexing

Re: Solr Pagination

2015-10-09 Thread Erick Erickson
bq: 10GB JVM as mentioned here...and they were getting 140 ms response time for 10 Billion documents This simply could _not_ work in a single shard as there's a hard 2B doc limit per shard. On slide 14 it states "both collections are sharded". They are not fitting 10B docs in 10G of JVM on a
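The "hard 2B doc limit per shard" can be turned into a quick lower bound on shard count. This is a small arithmetic sketch, assuming the cap is Lucene's `IndexWriter.MAX_DOCS` value (`Integer.MAX_VALUE - 128`) as defined in recent Lucene versions; the function name is illustrative only.

```python
# Assumed per-shard cap: Lucene's IndexWriter.MAX_DOCS (Integer.MAX_VALUE - 128).
LUCENE_MAX_DOCS = 2**31 - 1 - 128


def min_shards(total_docs: int, max_docs_per_shard: int = LUCENE_MAX_DOCS) -> int:
    """Smallest number of shards that can hold `total_docs` documents."""
    if total_docs < 0 or max_docs_per_shard < 1:
        raise ValueError("invalid document counts")
    # Integer ceiling division avoids floating-point rounding on large counts.
    return -(-total_docs // max_docs_per_shard)
```

This is only a floor on the shard count; memory, query load, and index size on disk usually call for more shards than the document cap alone requires.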

Re: Is solr.StandardDirectoryFactory an MMapDirectory?

2015-10-09 Thread Eric Torti
Ok, thanks Shawn! That makes sense. We'll be experimenting with it. Best, Eric On Wed, Oct 7, 2015 at 5:54 PM, Shawn Heisey wrote: > On 10/7/2015 12:00 PM, Eric Torti wrote: >> Can we read "high reopen rate" as "frequent soft commits"? (In our >> case, hard commits do not

Re: Solr Pagination

2015-10-09 Thread Salman Ansari
Thanks Eric for your response. If you find pagination is not the main culprit, what other factors do you guys suggest I need to tweak to test that? As I mentioned, by navigating to 2 results using start and row I am getting time out from Solr.NET and I need a way to fix that. You suggested

Re: Exclude documents having same data in two fields

2015-10-09 Thread Aman Tandon
okay Thanks With Regards Aman Tandon On Fri, Oct 9, 2015 at 4:25 PM, Upayavira wrote: > Just beware of performance here. This is fine for smaller indexes, but > for larger ones won't work so well. It will need to do this calculation > for every document in your index, thereby

Solr Pagination

2015-10-09 Thread Salman Ansari
Hi guys, I have been working with Solr and Solr.NET for some time for a big project that requires around 300M documents. Consequently, I faced an issue and I am highlighting it here in case you have any comments: As mentioned here (

Re: Exclude documents having same data in two fields

2015-10-09 Thread Upayavira
Just beware of performance here. This is fine for smaller indexes, but for larger ones won't work so well. It will need to do this calculation for every document in your index, thereby undoing all benefits of having an inverted index. If your index (or resultset) is small enough, it can work, but

Re: Solr Pagination

2015-10-09 Thread Toke Eskildsen
Salman Ansari wrote: [Pagination with cursors] > For example, what happens if the user navigates from page 1 to page 2, > does the front end need to store the next cursor at each query? Yes. > What about going to a previous page, do we need to store all cursors >
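The bookkeeping described here, where the front end stores the next cursor at each query, can be sketched as follows. This is a minimal Python sketch, not the thread authors' implementation: `fetch` is a hypothetical callable that takes a cursor string and returns the page's documents plus the next cursor (in Solr these correspond to the `cursorMark` parameter, whose initial value is `*`, and the `nextCursorMark` response field).

```python
from typing import Callable, List, Tuple

# Hypothetical fetch signature: cursor in, (documents, next cursor) out.
Fetch = Callable[[str], Tuple[List[str], str]]


class CursorPager:
    """Remembers the cursor that opens each visited page."""

    def __init__(self, fetch: Fetch, first_cursor: str = "*"):
        self._fetch = fetch
        self._cursors = [first_cursor]  # _cursors[i] opens page i
        self._page = -1                 # no page fetched yet

    def next_page(self) -> List[str]:
        target = self._page + 1
        docs, next_cursor = self._fetch(self._cursors[target])
        if target + 1 == len(self._cursors):
            # First visit to this page: remember how to reach the one after it.
            self._cursors.append(next_cursor)
        self._page = target
        return docs

    def previous_page(self) -> List[str]:
        if self._page <= 0:
            raise IndexError("already at the first page")
        self._page -= 1
        docs, _ = self._fetch(self._cursors[self._page])
        return docs
```

The sketch keeps one cursor per visited page, so going back is a lookup rather than a re-scan. It does not detect the end of the result set; a real client would stop when the returned cursor equals the one it sent.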

OverseerCollectionMessageHandler logging

2015-10-09 Thread Alan Woodward
Hi all, The OverseerCollectionMessageHandler logs all messages that it processes at WARN level, which seems wrong? Particularly as it handles OVERSEERSTATUS messages, which means that monitoring systems can trigger warnings all over the place. Is there a specific reason for this, or should I

Re: Solr Pagination

2015-10-09 Thread Salman Ansari
I agree 10B will not be residing on the same machine :) About the other issue you raised, while submitting the query to Solr I was keeping a close eye on RAM and JVM consumption on Solr Admin and for queries at the beginning that were taking most of the time, neither RAM nor JVM was hitting the

Re: Solr Pagination

2015-10-09 Thread Toke Eskildsen
Salman Ansari wrote: > As for the logs, I searched for "Salman" with rows=10 and start=1000 and it > took about 29 seconds to complete. However, it took less at each shard as > shown in the log file > [...] QTime=91 > [...] QTime=4 > the search in the second shard

Re: Solr Pagination

2015-10-09 Thread Erick Erickson
OK, this makes very little sense. The individual queries are taking < 100ms yet the total response is 29 seconds. I do note that one of your queries has rows=1010, a typo? Anyway, not at all sure what's going on here. If these are gigantic files you're returning, then it could be decompressing
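One possible reading of the rows=1010 value, offered as background rather than as a diagnosis of this thread: in a distributed Solr query, each shard is asked for its top start+rows hits so the coordinating node can merge them and cut out the requested page. The sketch below illustrates that merge in Python; the function names and the `(score, doc_id)` tuples are illustrative assumptions, not Solr APIs.

```python
import heapq
from typing import List, Tuple

Hit = Tuple[float, str]  # (score, document id)


def shard_request_rows(start: int, rows: int) -> int:
    """Rows each shard must return so the merged page is correct."""
    return start + rows


def merge_page(shard_results: List[List[Hit]], start: int, rows: int) -> List[Hit]:
    """Merge per-shard hit lists (each sorted by descending score) and slice the page."""
    merged = heapq.merge(*shard_results, key=lambda hit: -hit[0])
    return list(merged)[start:start + rows]
```

Under this reading, start=1000 with rows=10 gives a per-shard request of 1,010 rows, and the per-shard cost grows with the page offset.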

Re: How do I set up custom collection cores?

2015-10-09 Thread Shawn Heisey
On 10/9/2015 10:03 AM, espe...@oreillyauto.com wrote: > We are installing Alfresco One 5.0.1 with solr4 on a server that has an > existing instance of tomcat7. I am trying to find some better > documentation on how to setup our cores. In the solr4.xml located > Caused by: java.io.IOException:

Re: Solr Pagination

2015-10-09 Thread Toke Eskildsen
Salman Ansari wrote: > Thanks Eric for your response. If you find pagination is not the main > culprit, what other factors do you guys suggest I need to tweak to test > that? Well, is basic search slow? What are your response times for plain un-warmed top-20 searches?

Re: Exclude documents having same data in two fields

2015-10-09 Thread Aman Tandon
Thanks Mikhail for the suggestion. I will try that on Monday and will let you know. @Walter This was just a random requirement to find those fields which are not the same and then reindex only those. I can do a full index, but I was wondering if there might be some function or something. With Regards Aman Tandon

Re: Exclude documents having same data in two fields

2015-10-09 Thread Susheel Kumar
Hi Aman, Did the problem resolved or still having some errors. Thnx On Fri, Oct 9, 2015 at 8:28 AM, Aman Tandon wrote: > okay Thanks > > With Regards > Aman Tandon > > On Fri, Oct 9, 2015 at 4:25 PM, Upayavira wrote: > > > Just beware of performance

Re: Solr Pagination

2015-10-09 Thread Salman Ansari
> Thanks Eric for your response. If you find pagination is not the main > culprit, what other factors do you guys suggest I need to tweak to test > that? Well, is basic search slow? What are your response times for plain un-warmed top-20 searches? I have restarted Solr and I have tried running a

Re: how to deployed another web project into jetty server(solr inbuilt)

2015-10-09 Thread Mugeesh Husain
Thank you Upayavira Clearly understand, now they agree to install another server. -- View this message in context: http://lucene.472066.n3.nabble.com/how-to-deployed-another-web-project-into-jetty-server-solr-inbuilt-tp4233288p4233733.html Sent from the Solr - User mailing list archive

Re: OverseerCollectionMessageHandler logging

2015-10-09 Thread Shalin Shekhar Mangar
Yes, that should be INFO On Fri, Oct 9, 2015 at 8:02 PM, Alan Woodward wrote: > Hi all, > > The OverseerCollectionMessageHandler logs all messages that it processes at > WARN level, which seems wrong? Particularly as it handles OVERSEERSTATUS > messages, which means that

Re: Exclude documents having same data in two fields

2015-10-09 Thread Aman Tandon
No Susheel, As our index size is 62 GB so it seems hard to find those records. With Regards Aman Tandon On Fri, Oct 9, 2015 at 7:30 PM, Susheel Kumar wrote: > Hi Aman, Did the problem resolved or still having some errors. > > Thnx > > On Fri, Oct 9, 2015 at 8:28 AM,

schema.xml field configuration

2015-10-09 Thread Vincenzo D'Amore
Hi, I have this fieldType configuration: Using Solr Field Analysis tool for the string "aaa", in the last step at end I see this: text | aaa | | aaa | aaa position | 1 | 1| 1 | 2 start| 0 | 0| 0 | 4 end | 8 | 4|

Re: Exclude documents having same data in two fields

2015-10-09 Thread Walter Underwood
Please explain why you do not want to use an extra field. That is the only solution that will perform well on your large index. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On Oct 9, 2015, at 7:47 AM, Aman Tandon wrote: >

SolrCloud NoAuth for /unrelatednode error

2015-10-09 Thread Jamie Johnson
I am getting an error that essentially says solr does not have auth for /unrelatednode/... I would be ok with the error being displayed, but I think this may be what is causing my solr instances to be shown as down. Currently I'm issuing the following command

Re: Solr Pagination

2015-10-09 Thread Salman Ansari
Is this a real problem or a worry? Do you have users that page really deep and if so, have you considered other mechanisms for delivering what they need? The issue is that currently I have around 70M documents and some generic queries are resulting in lots of pages. Now if I try deep navigation

Re: schema.xml field configuration

2015-10-09 Thread Erick Erickson
Seems odd to me as well. I suspect you can work around this by either setting catenateAll="0" or preserveOriginal="0" Best, Erick On Fri, Oct 9, 2015 at 7:50 AM, Vincenzo D'Amore wrote: > Hi, > > I have this fieldType configuration: > > positionIncrementGap="100"> > > >

Re: OverseerCollectionMessageHandler logging

2015-10-09 Thread Alan Woodward
I'll raise a Jira, thanks Shalin. Alan Woodward www.flax.co.uk On 9 Oct 2015, at 16:05, Shalin Shekhar Mangar wrote: > Yes, that should be INFO > > On Fri, Oct 9, 2015 at 8:02 PM, Alan Woodward wrote: >> Hi all, >> >> The OverseerCollectionMessageHandler logs all messages

Re: Exclude documents having same data in two fields

2015-10-09 Thread Mikhail Khludnev
Aman, You can invoke Terms Component for the field M, let it return terms: {a,c,d,f} then you invoke it for field T let it return {b,c,f,e}, then you intersect both lists (it's quite romantic if they are kept ordered), you've got {c,f} and then you apply the filter: fq=-((+M:c +T:c) (+M:f +T:f))
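The intersect-then-filter step above can be sketched in a few lines. This is a minimal Python sketch, assuming the two term lists have already been fetched from the Terms Component; the function name is hypothetical, and terms are assumed to be simple tokens that need no query escaping.

```python
from typing import Iterable, Optional


def build_exclusion_fq(terms_m: Iterable[str],
                       terms_t: Iterable[str],
                       field_m: str = "M",
                       field_t: str = "T") -> Optional[str]:
    """Build a negative filter excluding docs that carry the same term in both fields.

    Returns None when the two term lists share nothing, so no filter is needed.
    """
    # Set intersection is used for brevity; ordered lists would allow a merge scan.
    common = sorted(set(terms_m) & set(terms_t))
    if not common:
        return None
    clauses = " ".join(f"(+{field_m}:{term} +{field_t}:{term})" for term in common)
    return f"-({clauses})"
```

For the lists in the message, the shared terms are c and f, and the function produces the same filter string as the one written out there. The filter grows with the number of shared terms, so this approach suits fields with a modest vocabulary.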

Re: SolrCloud NoAuth for /unrelatednode error

2015-10-09 Thread Jamie Johnson
Ah please ignore, it looks like this was totally unrelated and my issue was configuration related On Fri, Oct 9, 2015 at 11:18 AM, Jamie Johnson wrote: > I am getting an error that essentially says solr does not have auth for > /unrelatednode/... I would be ok with the error

Re: Solr Pagination

2015-10-09 Thread Erick Erickson
I think paging is something of a red herring. You say: bq: but still I get delays of around 16 seconds and sometimes even more. Even for a start of 1,000, this is ridiculously long for Solr. All you're really saving here is keeping a record of the id and score for a list 1,000 cells long (or