Re: MatchAllDocsQuery is much slower in solr5.3.1 compare to solr4.7

2015-11-06 Thread Yonik Seeley
On Fri, Nov 6, 2015 at 3:12 PM, Jack Krupansky wrote: > Just to be clear, I was suggesting that the filter query (fq) was slow That's a possibility. Filters were actually removed in Lucene, so it's a very different code path now. In 4.10, filters were first class, and

Re: Is it impossible to update an index that is undergoing an optimize?

2015-11-06 Thread Yonik Seeley
On Wed, Nov 4, 2015 at 3:36 PM, Shawn Heisey wrote: > The specific index update that fails during the optimize is the SolrJ > deleteByQuery call. deleteByQuery may be the outlier here... we have to jump through extra hoops internally because we don't know which documents it

Re: MatchAllDocsQuery is much slower in solr5.3.1 compare to solr4.7

2015-11-06 Thread Yonik Seeley
On Fri, Nov 6, 2015 at 9:30 PM, wei wrote: > in solr 5.3.1, there is actually a boost, and the score is product of boost > & queryNorm. Hmmm, well, it's worth putting on the list of stuff to investigate. Boosting was also changed in lucene. What happens if you try this

Re: data import extremely slow

2015-11-06 Thread Yangrui Guo
Thanks for the reply. I just removed CacheKeyLookUp and CachedKey and used WHERE clause instead. Everything works fine now. Yangrui On Friday, November 6, 2015, Shawn Heisey wrote: > On 11/6/2015 10:32 AM, Yangrui Guo wrote: > > > There's a good chance that JDBC is trying

Re: MatchAllDocsQuery is much slower in solr5.3.1 compare to solr4.7

2015-11-06 Thread wei
Hi Jack, I also ran the test with queries that have query terms (with filter too). Solr5 is faster compared to solr4 in the test. I got the query set from our production log; almost all of our queries have a filter. So that suggests to me that it is not the filter query that is slow. I copy the fq

Re: Is it impossible to update an index that is undergoing an optimize?

2015-11-06 Thread Walter Underwood
It is pretty handy, though. Great for expunging docs that are marked deleted or are expired. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On Nov 6, 2015, at 5:31 PM, Alexandre Rafalovitch wrote: > > Elasticsearch removed

Re: MatchAllDocsQuery is much slower in solr5.3.1 compare to solr4.7

2015-11-06 Thread Yonik Seeley
On Fri, Nov 6, 2015 at 9:56 PM, wei wrote: > Good point! I tried that, on solr5 the query time is around 100-110ms, and > on solr4 it is around 60-63ms(very consistent). Solr5 is slower. When it's something easy, there comes a point when it makes sense to stop asking more

Re: MatchAllDocsQuery is much slower in solr5.3.1 compare to solr4.7

2015-11-06 Thread wei
Hi Shawn, I took care of the warm-up problem during the test. I set up a jmeter project, got a query log from our production (>10 queries), and ran the same query log through jmeter to hit the solr instances with the same qps (about 40). I removed warmup queries in both solr setups, and also set

Re: MatchAllDocsQuery is much slower in solr5.3.1 compare to solr4.7

2015-11-06 Thread wei
Good point! I tried that, on solr5 the query time is around 100-110ms, and on solr4 it is around 60-63ms(very consistent). Solr5 is slower. Thanks, Wei On Fri, Nov 6, 2015 at 6:46 PM, Yonik Seeley wrote: > On Fri, Nov 6, 2015 at 9:30 PM, wei wrote: > > in

Re: Is it impossible to update an index that is undergoing an optimize?

2015-11-06 Thread Alexandre Rafalovitch
Elasticsearch removed deleteByQuery from the core altogether. Definitely an outlier :-) Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter: http://www.solr-start.com/ On 6 November 2015 at 20:18, Yonik Seeley wrote: > On Wed, Nov 4, 2015 at 3:36 PM, Shawn

Re: MatchAllDocsQuery is much slower in solr5.3.1 compare to solr4.7

2015-11-06 Thread wei
The explain parts are different in solr4.7 and solr 5.3.1. In solr 4.7, there is only one line 1.0 = (MATCH) MatchAllDocsQuery, product of: 1.0 = queryNorm 1.0 = (MATCH) MatchAllDocsQuery, product of: 1.0 = queryNorm in solr 5.3.1, there is actually a boost, and

Re: Is it impossible to update an index that is undergoing an optimize?

2015-11-06 Thread Yonik Seeley
On Fri, Nov 6, 2015 at 10:20 PM, Shawn Heisey wrote: > Is there a decent API for getting uniqueKey? Not off the top of my head. I deeply regret making it configurable and not just using "id" ;-) -Yonik

Re: Is it impossible to update an index that is undergoing an optimize?

2015-11-06 Thread Ishan Chattopadhyaya
On Sat, Nov 7, 2015 at 9:09 AM, Yonik Seeley wrote: > On Fri, Nov 6, 2015 at 10:20 PM, Shawn Heisey wrote: > > Is there a decent API for getting uniqueKey? > > Not off the top of my head. > I deeply regret making it configurable and not just using "id"

Re: Is it impossible to update an index that is undergoing an optimize?

2015-11-06 Thread Shawn Heisey
On 11/6/2015 6:18 PM, Yonik Seeley wrote: > On Wed, Nov 4, 2015 at 3:36 PM, Shawn Heisey wrote: >> The specific index update that fails during the optimize is the SolrJ >> deleteByQuery call. > > deleteByQuery may be the outlier here... we have to jump through extra > hoops

Re: MatchAllDocsQuery is much slower in solr5.3.1 compare to solr4.7

2015-11-06 Thread wei
Thanks Yonik. A JIRA bug is opened: https://issues.apache.org/jira/browse/SOLR-8251 Wei On Fri, Nov 6, 2015 at 7:10 PM, Yonik Seeley wrote: > On Fri, Nov 6, 2015 at 9:56 PM, wei wrote: > > Good point! I tried that, on solr5 the query time is around

solr-8983-console.log is huge

2015-11-06 Thread CrazyDiamond
That log file is constantly growing, and it is now ~60GB. What can I change to fix this?

Re: solr-8983-console.log is huge

2015-11-06 Thread sara hajili
You can change the Solr log level. By default Solr logs everything. You can change this by going to the Solr console, in log/level, and editing the levels to log just errors, for example. This is a temporary way. You can also change the Solr config, in solr_home in /log, and change the log4j config. For more info look at:

Adding SanderC to the ContributorsGroup

2015-11-06 Thread Sander Clompen
Hi, Could you please add me to the ContributorsGroup (username: SanderC)? I would like to participate and contribute on the wiki page, I would like to translate the wiki to Dutch (or French, German). Kind regards, SanderC

Re: solr-8983-console.log is huge

2015-11-06 Thread davidphilip cherian
From mail archives https://support.lucidworks.com/hc/en-us/articles/207072137-Solr-5-X-Console-Logging-solr-8983-console-log On Fri, Nov 6, 2015 at

Solr results relevancy / scoring

2015-11-06 Thread Brian Narsi
I have a situation where. User search query q=15% Solr results contain several documents that are 15% 15% 15% 15% 15 (why?) 15% 15% I have debugged the query and can see that the score for 15 is higher than the ones below it. Why is that? Where can I read in detail about how the scoring is

Re: Solr results relevancy / scoring

2015-11-06 Thread Erick Erickson
I'm not sure what question you're asking. You say that you have debugged the query and the score for 15 is higher than the ones below it. What's surprising about that? Are you saying you don't understand how the score is calculated? Or the output when adding =true is inconsistent or what?

Re: [SolrJ Clients] RequestWriter VS BinaryRequestWriter

2015-11-06 Thread Alessandro Benedetti
Hi Vincenzo, according to our discoveries I would say the CloudSolrClient to be the most efficient way to interact with a Solr Cloud cluster. ConcurrentUpdateSolrServer will be efficient for a single Solr instance, but using under the hood the XML Response Writer. Even if you prefer to use the

Re: Trying to apply patch for SOLR-7036

2015-11-06 Thread Shawn Heisey
On 11/5/2015 7:04 PM, r b wrote: > I just wanted to double check that my steps were not too off base. > > I am trying to apply the patch from 8/May/15 and it seems to be > slightly off. Inside the working revision is 1658487 so I checked that > out from svn. This is what I did. > > svn

Re: [SolrJ Clients] RequestWriter VS BinaryRequestWriter

2015-11-06 Thread Shawn Heisey
On 11/6/2015 7:15 AM, Vincenzo D'Amore wrote: > I have followed your same path, having a look at java source. I inherited > an installation with CloudSolrServer (I still had solrcloud 4.8) but I was > not sure it was the right choice instead of the (apparently) more appealing >

Re: solr-8983-console.log is huge

2015-11-06 Thread Shawn Heisey
On 11/6/2015 6:17 AM, Upayavira wrote: > On Fri, Nov 6, 2015, at 10:12 AM, sara hajili wrote: >> You can change solr loglevel.bydefault solr logs for every thing. >> You can change this by go in solrconsole.inlog/level and edit levels for >> just error for example. >> And this is temporary way. >>

Re: Securing field level access permission by filtering the query itself

2015-11-06 Thread Douglas McGilvray
You know what guys, I have had a change in perspective… I previously thought: do I want to index all these documents multiple times just to protect 3 fields I am now thinking: do I really want to try to parse all the fields in a query when there are only 3 roles. I have only 4k documents and

Re: MatchAllDocsQuery is much slower in solr5.3.1 compare to solr4.7

2015-11-06 Thread Shawn Heisey
On 11/5/2015 10:25 PM, Jack Krupansky wrote: > I vaguely recall some discussion concerning removal of the field cache in > Lucene. The FieldCache wasn't exactly *removed* ... it's more like it was renamed, improved, and sort of hidden in a miscellaneous package. Some things still require this

Re: Adding SanderC to the ContributorsGroup

2015-11-06 Thread Shawn Heisey
On 11/6/2015 2:58 AM, Sander Clompen wrote: > Could you please add me to the ContributorsGroup (username: SanderC)? > > I would like to participate and contribute on the wiki page, I would like to > translate the wiki to Dutch (or French, German). Done.

Re: Securing field level access permission by filtering the query itself

2015-11-06 Thread Alessandro Benedetti
Are you basically saying that you are going to model 3 collections, 1 per role? Each collection schema will contain only the sensitive field. When you query you simply search in the related collection and retrieve all the fields. That's it? Cheers On 6 November 2015 at 15:05, Douglas McGilvray

RE: tikaparser docx file fails with exception

2015-11-06 Thread Allison, Timothy B.
Agree with all below, and don't hesitate to open a ticket on Tika's Jira and/or POI's bugzilla...especially if you can share the triggering document. -Original Message- From: Alexandre Rafalovitch [mailto:arafa...@gmail.com] Sent: Thursday, November 05, 2015 6:05 PM To: solr-user

Re: solr-8983-console.log is huge

2015-11-06 Thread Erick Erickson
How do you start solr? If you pipe console output to a file it'll grow forever. Either pipe the output to /dev/null or follow Sara's link and take the CONSOLE appender out of log4j.properties. Best, Erick On Fri, Nov 6, 2015 at 2:12 AM, sara hajili wrote: > You can change
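
[Editor's sketch] As a rough illustration of the "take the CONSOLE appender out of log4j.properties" suggestion above, here is a small Python helper that removes CONSOLE from the rootLogger line of a properties file's text. The sample line `log4j.rootLogger=INFO, file, CONSOLE` is an assumption about what a typical Solr 5 log4j.properties contains; check your own file before relying on it. The helper only edits the rootLogger line and leaves the now-unused `log4j.appender.CONSOLE...` definitions in place.

```python
def drop_console_appender(properties_text: str) -> str:
    """Return log4j.properties text with CONSOLE removed from the rootLogger line."""
    out = []
    for line in properties_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("log4j.rootLogger"):
            key, _, value = stripped.partition("=")
            # First element is the level (e.g. INFO); the rest are appender names.
            parts = [p.strip() for p in value.split(",")]
            kept = [p for p in parts if p and p != "CONSOLE"]
            line = f"{key.strip()}={', '.join(kept)}"
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    # Assumed sample content, not copied from any particular Solr release.
    sample = (
        "log4j.rootLogger=INFO, file, CONSOLE\n"
        "log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender"
    )
    print(drop_console_appender(sample))
```

Solr would need a restart to pick up the edited file; the existing solr-8983-console.log can then be truncated or deleted separately.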

Re: [SolrJ Clients] RequestWriter VS BinaryRequestWriter

2015-11-06 Thread Erick Erickson
And the other large benefit of CloudSolrClient is that it routes documents directly to the correct leader, i.e. does the routing on the client rather than have the Solr instances forward docs to the routing. Using CloudSolrClient should scale more nearly linearly with increasing shards. Best,

Re: No live SolrServers available to handle this request

2015-11-06 Thread Erick Erickson
The host may be running well, but my bet is that you have an error in the schema.xml file so it's no longer valid XML and the core did not load. So while the solr instance is up and running, no core using that schema is running, thus no live servers. Look at the admin UI, cloud>>graph view and

Re: solr-8983-console.log is huge

2015-11-06 Thread Upayavira
Erick, bin/start pipes stdout to solr-$PORT-console.log or such. With no rotation. So we are setting people up to fail right from the get-go. That's what I'm hoping the attached ticket will resolve. Upayavira On Fri, Nov 6, 2015, at 03:52 PM, Erick Erickson wrote: > How do you start solr? If

Re: solr-8983-console.log is huge

2015-11-06 Thread Alexandre Rafalovitch
What about the Garbage Collection output? I think we have the same issue there. Frankly, I don't know how many people know what to do with that in the first place. Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter: http://www.solr-start.com/ On 6 November 2015 at 11:11,

Boost query at search time according set of roles with least performance impact

2015-11-06 Thread Andrea Roggerone
Hi all, I am working on a mechanism that applies additional boosts to documents according to the role covered by the author. For instance we have CEO|5 Architect|3 Developer|1 TeamLeader|2 keeping in mind that an author could cover multiple roles (e.g. for a design document, a Team Leader could
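
[Editor's sketch] One possible way to express the role weights listed above at query time is a boost query (`bq`) with the dismax/edismax parser. The Python below only builds the parameter string; the field name `author_role` and the example query text are hypothetical, and this is not necessarily the approach the poster settled on, nor a claim about which option has the least performance impact.

```python
from urllib.parse import urlencode

# Role weights taken from the message above.
ROLE_BOOSTS = {"CEO": 5, "Architect": 3, "Developer": 1, "TeamLeader": 2}


def build_boost_query(role_boosts, field="author_role"):
    """Build a bq value such as 'author_role:CEO^5 author_role:Architect^3'.

    `field` is a hypothetical field holding the author's role(s).
    """
    clauses = [f"{field}:{role}^{weight}" for role, weight in role_boosts.items()]
    return " ".join(clauses)


if __name__ == "__main__":
    params = {
        "q": "design document",  # example user query, assumed
        "defType": "edismax",
        "bq": build_boost_query(ROLE_BOOSTS),
    }
    print(urlencode(params))
```

Because an author can hold several roles, a multi-valued role field would let more than one clause match, so the boosts would add up for such documents.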

Re: Child document and parent document with same key

2015-11-06 Thread Jamie Johnson
Thanks that's what I suspected given what I'm seeing but wanted to make sure. Again thanks On Nov 5, 2015 1:08 PM, "Mikhail Khludnev" wrote: > On Fri, Oct 16, 2015 at 10:41 PM, Jamie Johnson wrote: > > > Is this expected to work? > > > I think it

Re: solr-8983-console.log is huge

2015-11-06 Thread Upayavira
On Fri, Nov 6, 2015, at 10:12 AM, sara hajili wrote: > You can change solr loglevel.bydefault solr logs for every thing. > You can change this by go in solrconsole.inlog/level and edit levels for > just error for example. > And this is temporary way. > You can also change solrconfig.insolr_home

Re: [SolrJ Clients] RequestWriter VS BinaryRequestWriter

2015-11-06 Thread Vincenzo D'Amore
Hi Alessandro, I have followed your same path, having a look at java source. I inherited an installation with CloudSolrServer (I still had solrcloud 4.8) but I was not sure it was the right choice instead of the (apparently) more appealing ConcurrentUpdateSolrClient. As far as I understood,

data import extremely slow

2015-11-06 Thread Yangrui Guo
Hi, I'm using Solr's data import handler and MySQL 5.5 to index the imdb database. However, the data import takes a few minutes to process one document, while there are over 3 million movies. This is going to take forever, yet I can select the rows in MySQL in no time. What am I doing wrong? My

Re: solr-8983-console.log is huge

2015-11-06 Thread Erick Erickson
Yep, I looked at the new JIRA and finally figured out what the problem is. It should be changed, but in the meantime one can go in and take the CONSOLE appender out of the logging properties file. Or restart Solr periodically. Ugly but it would work. On Fri, Nov 6, 2015 at 8:13 AM, Alexandre

Re: solr-8983-console.log is huge

2015-11-06 Thread Shawn Heisey
On 11/6/2015 9:13 AM, Alexandre Rafalovitch wrote: > What about the Garbage Collection output? I think we have the same > issue there. Frankly, I don't know how many people know what to do > with that in a first place. Turns out that Java has rotation capability built in to GC logging:

Re: MatchAllDocsQuery is much slower in solr5.3.1 compare to solr4.7

2015-11-06 Thread Jack Krupansky
Just to be clear, I was suggesting that the filter query (fq) was slow, not the MatchAllDocsQuery, which should be just as speedy as before. You can test for yourself whether the MADQ by itself is any slower. You could also test using the fq as the main query (q) - with no fq parameter, and see

Re: Solr results relevancy / scoring

2015-11-06 Thread Doug Turnbull
You might paste your URL into http://splainer.io and it will explain your results ranking to you in a perhaps more helpful way -Doug On Fri, Nov 6, 2015 at 2:04 PM, Brian Narsi wrote: > I have a situation where. > > User search query > > q=15% > > Solr results contain

Re: MatchAllDocsQuery is much slower in solr5.3.1 compare to solr4.7

2015-11-06 Thread wei
Thanks Jack and Shawn. I checked these Jira tickets, but I am not sure if the slowness of MatchAllDocsQuery is also caused by the removal of fieldcache. Can someone please explain a little bit? Thanks, Wei On Fri, Nov 6, 2015 at 7:15 AM, Shawn Heisey wrote: > On 11/5/2015

Re: Is it impossible to update an index that is undergoing an optimize?

2015-11-06 Thread Shawn Heisey
On 11/6/2015 2:23 PM, Pushkar Raste wrote: > I may be wrong but I think 'delete' and 'optimize' can not be executed > concurrently on a Lucene index It certainly is looking that way. After discussing it with Hoss on IRC, I tried a manual test where I started an optimize and then did some "add"

Re: data import extremely slow

2015-11-06 Thread Shawn Heisey
On 11/6/2015 10:32 AM, Yangrui Guo wrote: >

Re: MatchAllDocsQuery is much slower in solr5.3.1 compare to solr4.7

2015-11-06 Thread Shawn Heisey
On 11/6/2015 1:01 PM, wei wrote: > Thanks Jack and Shawn. I checked these Jira tickets, but I am not sure if > the slowness of MatchAllDocsQuery is also caused by the removal of > fieldcache. Can someone please explain a little bit? I only glanced at your full output in the message at the start

Re: Is it impossible to update an index that is undergoing an optimize?

2015-11-06 Thread Pushkar Raste
I may be wrong but I think 'delete' and 'optimize' can not be executed concurrently on a Lucene index On 4 November 2015 at 15:36, Shawn Heisey wrote: > On 11/4/2015 1:17 PM, Yonik Seeley wrote: > > On Wed, Nov 4, 2015 at 3:06 PM, Shawn Heisey > wrote:

Re: Trying to apply patch for SOLR-7036

2015-11-06 Thread r b
Ah, thanks for that. The 4.10 branch was it. If I have time, I'll study up on what this patch is doing and see if I can't port it to 5x. On Fri, Nov 6, 2015 at 6:24 AM, Shawn Heisey wrote: > On 11/5/2015 7:04 PM, r b wrote: >> I just wanted to double check that my steps were