Re: Get this committed

2015-10-23 Thread William Bell
OK I added the test case. On Fri, Oct 23, 2015 at 5:05 AM, Shalin Shekhar Mangar < shalinman...@gmail.com> wrote: > I can review and commit if you add a test. > > On Fri, Oct 23, 2015 at 9:45 AM, William Bell wrote: > > I can confirm this is working in PROD at 100M hits a day. > > > > Can we com

Re: Should I install 4.x or 5.x? Book recommendations?

2015-10-23 Thread Ray Niu
I would also suggest using Solr 5, as there are lots of new features. We are using 5.2.1 now, which is pretty stable. 2015-10-23 16:32 GMT-07:00 Shawn Heisey : > On 10/23/2015 12:22 PM, Robert Hume wrote: > > I'm investigating installing a new Solr deployment to be able to search > > about two mi

Re: Should I install 4.x or 5.x? Book recommendations?

2015-10-23 Thread Shawn Heisey
On 10/23/2015 12:22 PM, Robert Hume wrote: > I'm investigating installing a new Solr deployment to be able to search > about two million documents (mostly HTML and PDF). > > QUESTIONS: > > A. Should I use Solr 4.x or 5.x? My concerns are mostly to do with > support. Is 5.x too new to be able to g

Possible bug with searchers and core swapping

2015-10-23 Thread Shawn Heisey
Today I noticed this on my dev server running Solr 5.2.1: https://www.dropbox.com/s/bt81sv35acb7q2n/searcher-from-old-corename.png?dl=0 The name of this core is spark0live, but before the last index rebuild, it was named spark0build. My full rebuild process indexes data into build cores, then wh

Re: Does docValues impact termfreq ?

2015-10-23 Thread Jack Krupansky
Term frequency applies only to the indexed terms of a tokenized field. DocValues is really just a copy of the original source text and is not tokenized into terms. Maybe you could explain how exactly you are using term frequency in function queries. More importantly, what is so "heavy" about your
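As a rough illustration of the point in the message above, the toy sketch below counts term frequencies over a tokenized copy of a field and compares that with the same text kept as a single untokenized value. It is not Lucene's implementation; the whitespace tokenizer and the function name term_frequencies are assumptions made for this example only.

```python
from collections import Counter


def term_frequencies(text):
    """Toy analyzer (hypothetical): lowercase, split on whitespace, count terms."""
    return Counter(text.lower().split())


source_text = "Solr search and more Solr"

# Tokenized view: each term can be counted individually.
tokenized = term_frequencies(source_text)

# Untokenized view: the whole value is one entry, so a per-term
# lookup finds nothing to count.
untokenized = Counter([source_text])

print(tokenized["solr"], untokenized["solr"])
```

The per-term lookup only has something to count in the tokenized view, which is why the reply asks how term frequency is actually being used.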

Re: Should I install 4.x or 5.x? Book recommendations?

2015-10-23 Thread Upayavira
To add: 5.x is NOT a hugely different thing from 4.x. The version update was because of Lucene index versioning issues, nothing to do with core functionality within Solr. So, there really is no reason to hold back from using a 5.x release (any more than there is from holding back from using any la

Re: Should I install 4.x or 5.x? Book recommendations?

2015-10-23 Thread Alexandre Rafalovitch
http://www.amazon.com/gp/product/B00D85K9XC (kindle version) I love the mixed reviews on that edition :-) https://www.packtpub.com/big-data-and-business-intelligence/instant-apache-solr-indexing-data-how-instant (digital version from publisher). If you join my Solr Resources mailing list at: http:

Re: Should I install 4.x or 5.x? Book recommendations?

2015-10-23 Thread Robert Hume
Hi Alex, What's the title of your book? An amazon link would be useful too. Thanks! Rob On Fri, Oct 23, 2015 at 2:50 PM, Alexandre Rafalovitch wrote: > Definitely 5.x. Lots of new goodies. It is true that some of the > startup scripts are different and the example schemas could be > slightly

Re: Should I install 4.x or 5.x? Book recommendations?

2015-10-23 Thread Alexandre Rafalovitch
Definitely 5.x. Lots of new goodies. It is true that some of the startup scripts are different and the example schemas could be slightly confusing if following a book, but I think it is well worth starting on a good foot. Just remember, no "collection1" anymore, all cores/collections are explicit.

Should I install 4.x or 5.x? Book recommendations?

2015-10-23 Thread Robert Hume
Hi, I'm investigating installing a new Solr deployment to be able to search about two million documents (mostly HTML and PDF). QUESTIONS: A. Should I use Solr 4.x or 5.x? My concerns are mostly to do with support. Is 5.x too new to be able to get good answers and advice from the community? Or

RE: NPE in CloudSolrClient via AbstractFullDistribZkTestBase

2015-10-23 Thread Markus Jelsma
Sure: https://issues.apache.org/jira/browse/SOLR-8194 M. -Original message- > From:Alan Woodward > Sent: Friday 23rd October 2015 18:17 > To: solr-user@lucene.apache.org > Subject: Re: NPE in CloudSolrClient via AbstractFullDistribZkTestBase > > No worries :-) Actually it would proba

Re: NPE in CloudSolrClient via AbstractFullDistribZkTestBase

2015-10-23 Thread Alan Woodward
No worries :-) Actually it would probably be worth improving the error reporting here to throw NPE when the documents are added to the UpdateRequest in the first place - do you want to open a JIRA? Alan Woodward www.flax.co.uk On 23 Oct 2015, at 17:00, Markus Jelsma wrote: > Ah crap, indeed!
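The fail-fast idea suggested above can be sketched in a few lines. This is not SolrJ code: UpdateBatch, add and route_keys are hypothetical names used only to show the pattern of rejecting a null document when it is added, rather than failing later while iterating over the documents for routing.

```python
class UpdateBatch:
    """Hypothetical stand-in for an update request; not the SolrJ class."""

    def __init__(self):
        self._docs = []

    def add(self, doc):
        # Fail at the point of insertion so the stack trace points at the caller
        # that supplied the bad document.
        if doc is None:
            raise ValueError("cannot add a null document to an update batch")
        self._docs.append(doc)
        return self

    def route_keys(self, id_field="id"):
        # Later iteration can now assume every entry is a real document.
        return [doc[id_field] for doc in self._docs]


batch = UpdateBatch().add({"id": "1"}).add({"id": "2"})
print(batch.route_keys())
```

With this shape, a null document is reported where it enters the batch, which is usually easier to debug than an error raised during routing.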

RE: NPE in CloudSolrClient via AbstractFullDistribZkTestBase

2015-10-23 Thread Markus Jelsma
Ah crap, indeed! A few items slipped through some checks that i thought were correct. Sorry to have bothered the list with this nonsense, but i didn't 'see' it anymore :P Thanks! Markus -Original message- > From:Alan Woodward > Sent: Friday 23rd October 2015 17:30 > To: solr-user@l

solrj UpdateRequest Delete ReplicationFactor

2015-10-23 Thread Troy Collinsworth
- Why doesn't a solrj UpdateRequest delete return any shard replication factor data? - Is there a way to know if/when a solrj UpdateRequest delete has achieved replication factor > 1? When executing a UpdateRequest deleteByQuery with route, the minAchievedReplicationFactor is always

Re: NPE in CloudSolrClient via AbstractFullDistribZkTestBase

2015-10-23 Thread Alan Woodward
It looks as though you're adding a null SolrInputDocument to your UpdateRequest somehow? The bit that's throwing a NPE is iterating through the documents in order to route things correctly (UpdateRequest.java:204). Alan Woodward www.flax.co.uk On 23 Oct 2015, at 13:53, Markus Jelsma wrote: >

Re: missing in json facet does not work for stream?

2015-10-23 Thread Shalin Shekhar Mangar
I see, thanks! On Fri, Oct 23, 2015 at 8:08 PM, Yonik Seeley wrote: > On Fri, Oct 23, 2015 at 10:24 AM, Shalin Shekhar Mangar > wrote: >> Now I am curious, what does it do! > > It's basically like facet.method=enum, but it truly streams > (calculates each facet bucket on-the-fly and writes it to

Lucene/Solr Git Mirrors 5 day lag behind SVN?

2015-10-23 Thread Kevin Risden
It looks like both Apache Git mirror (git://git.apache.org/lucene-solr.git) and GitHub mirror (https://github.com/apache/lucene-solr.git) are 5 days behind SVN. This seems to have happened before: https://issues.apache.org/jira/browse/INFRA-9182 Is this a known issue? Kevin Risden

Re: Solr fails to start with log file not found error

2015-10-23 Thread awhosit
Thanks for the reply, Shawn. But it seems to give me only 1.7. #sudo rpm -qa | egrep "(java|jdk)" java-1_7_0-openjdk-1.7.0.85-18.2.x86_64 libjavascriptcoregtk-3_0-0-2.4.8-16.2.x86_64 java-1_7_0-openjdk-headless-1.7.0.85-18.2.x86_64 timezone-java-2015g-0.26.1.noarch # update-alternatives --config

Re: missing in json facet does not work for stream?

2015-10-23 Thread Yonik Seeley
On Fri, Oct 23, 2015 at 10:24 AM, Shalin Shekhar Mangar wrote: > Now I am curious, what does it do! It's basically like facet.method=enum, but it truly streams (calculates each facet bucket on-the-fly and writes it to the response). Since it is streaming, it only supports sorting by term index or
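For readers who want to try the facet method discussed above, here is a minimal sketch of a JSON facet request body built in Python. The field name "category" and the facet name "by_field" are assumptions for illustration, and since the thread describes method=stream as undocumented and lightly tested at the time, the exact behaviour may differ between Solr versions.

```python
import json


def stream_terms_facet(field, name="by_field"):
    """Build a JSON request with a terms facet that asks for the stream method."""
    return {
        "query": "*:*",
        "limit": 0,  # no documents needed, only facet buckets
        "facet": {
            name: {
                "type": "terms",
                "field": field,
                "method": "stream",
                "sort": "index asc",  # streaming implies index-order sorting
            }
        },
    }


request = stream_terms_facet("category")  # "category" is a hypothetical field
body = json.dumps(request, indent=2)
print(body)
```

The resulting body could be posted to a core's query handler; how a "missing" bucket is reported under this method is the open question in the thread.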

Re: missing in json facet does not work for stream?

2015-10-23 Thread Shalin Shekhar Mangar
Now I am curious, what does it do! On Fri, Oct 23, 2015 at 7:40 PM, Yonik Seeley wrote: > On Fri, Oct 23, 2015 at 5:55 AM, hao jin wrote: >> Hi >> I found when the method of json facet is set to stream, the "missing" is not >> added to the result. >> Is it designed or a known issue? > > You foun

Re: locks and high CPU

2015-10-23 Thread Erick Erickson
Thanks Shalin, I'd forgotten about that one On Thu, Oct 22, 2015 at 11:22 PM, Shalin Shekhar Mangar wrote: > I think you running into > https://issues.apache.org/jira/browse/SOLR-6136 where a spin lock in > ConcurrentUpdateSolrServer blocks with high cpu usage. Try upgrading > to 4.10.x but f

Re: Is it possible to specify only one-character term synonym for 2-gram tokenizer?

2015-10-23 Thread Erick Erickson
Scott: The Apache spam filters are quite aggressive and sometimes reject e-mails that are formatted any way other than "plain text" so that may have been what happened to your e-mails. Best, Erick On Fri, Oct 23, 2015 at 3:23 AM, Emir Arnautovic wrote: > Hi Scott, > This replacement will only b

Re: missing in json facet does not work for stream?

2015-10-23 Thread Yonik Seeley
On Fri, Oct 23, 2015 at 5:55 AM, hao jin wrote: > Hi > I found when the method of json facet is set to stream, the "missing" is not > added to the result. > Is it designed or a known issue? You found an undocumented feature (method=stream) ;-) That facet method doesn't have adequate testing yet,

RE: NPE in CloudSolrClient via AbstractFullDistribZkTestBase

2015-10-23 Thread Markus Jelsma
Ah yes, i think i overlooked that one. Here it is: org.apache.solr.client.solrj.SolrServerException: java.lang.NullPointerException at __randomizedtesting.SeedInfo.seed([C5A84EC72B29125E:BA7A28521E031EEB]:0) at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRet

Does docValues impact termfreq ?

2015-10-23 Thread Aki Balogh
Hello, In our solr application, we use a Function Query (termfreq) very heavily. Index time and disk space are not important, but we're looking to improve performance on termfreq at query time. I've been reading up on docValues. Would this be a way to improve performance? I had read that Lucene

Re: NPE in CloudSolrClient via AbstractFullDistribZkTestBase

2015-10-23 Thread Alan Woodward
The NPE is from another server (hence being wrapped in a SolrServerException), so the original issue *should* be being logged elsewhere - are there no errors earlier on in the log? Alan Woodward www.flax.co.uk On 23 Oct 2015, at 12:44, Markus Jelsma wrote: > Hi - anyone here to shed some lig

RE: NPE in CloudSolrClient via AbstractFullDistribZkTestBase

2015-10-23 Thread Markus Jelsma
Hi - anyone here to shed some light on the issue? Markus -Original message- > From:Markus Jelsma > Sent: Tuesday 20th October 2015 13:39 > To: solr-user > Subject: NPE in CloudSolrClient via AbstractFullDistribZkTestBase > > Hi - we have some code inside a unit test, extending >

Re: Select sibling data via XPathEntityProcessor

2015-10-23 Thread Alexandre Rafalovitch
Sounds like maybe you have an invalid XML fragment with several elements next to each other without common parent. I am starting to think that perhaps you are better off doing a hacky regular-expression at least to get you through your first iteration. Or a custom-coded pre-processor that will do

Re: getting cached terms inside UpdateRequestProcessor...

2015-10-23 Thread Erik Hatcher
Roxana - please share your full configuration (minus passwords of course) so we can all see what the dilemma is. I can’t make sense of what you’re trying and why you’re trying it that way rather than intra-update-script-analysis. Erik > On Oct 23, 2015, at 2:49 AM, Roxana Danger >

RE: Select sibling data via XPathEntityProcessor

2015-10-23 Thread Routley, Alan
Thanks Alex for getting back to me. As per your suggestion I've gone down the XSL route. I've created a transformation that works fine in various test tools, but Solr is throwing errors such as: Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.solr.handler.d

Re: Get this committed

2015-10-23 Thread Shalin Shekhar Mangar
I can review and commit if you add a test. On Fri, Oct 23, 2015 at 9:45 AM, William Bell wrote: > I can confirm this is working in PROD at 100M hits a day. > > Can we commit it please? Begging here. > > https://issues.apache.org/jira/browse/SOLR-7993 > > -- > Bill Bell > billnb...@gmail.com > cel

Re: Get this committed

2015-10-23 Thread Alexandre Rafalovitch
Begging at the Dev list is probably more efficient, though I am sure most of them are hanging around here as well. Regards, Alex. P.s. Sorry, I wish I could help. Not a committer. Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter: http://www.solr-start.com/ On 23 October 20

Re: Select sibling data via XPathEntityProcessor

2015-10-23 Thread Alexandre Rafalovitch
If you are stuck with DIH, it looks like you can specify the xsl attribute on the XPathEntityProcessor and it will be used as a pre-processor. I would probably use it to convert the outer NamedAuthority tag into a corresponding Author or Subject tag. Looks easiest. If you are not sure how to generate good

Re: Is it possible to specify only one-character term synonym for 2-gram tokenizer?

2015-10-23 Thread Emir Arnautovic
Hi Scott, This replacement will only be in index terms and not in stored field so you are fine - problem you mention is related to case when you do replacement in raw text. However, this would be part of analysis chain (both index and query) so has no effect on presentation (unless you are us

Re: Solr Full text search

2015-10-23 Thread Upayavira
add debugQuery=true to your query, and look at the parsed query - it'll show you a lot about what's going on. Also, try the phrase "good building constructor" in the admin UI analysis tab, for your full_text field. It'll help you understand what's happening in terms of tokenisation. Upayavira On
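The debugQuery suggestion above can be tried with a request URL like the one built below. This is a minimal sketch: the host, port and core name ("mycore") are assumptions, while full_text and the phrase come from the message itself.

```python
from urllib.parse import urlencode


def build_debug_query_url(base_url, core, field, phrase):
    """Return a select URL that searches a phrase and asks Solr for debug output."""
    params = {
        "q": f'{field}:"{phrase}"',
        "debugQuery": "true",  # adds the parsed query and scoring details to the response
        "wt": "json",
    }
    return f"{base_url}/{core}/select?{urlencode(params)}"


# Hypothetical host and core name; adjust to the actual deployment.
url = build_debug_query_url(
    "http://localhost:8983/solr", "mycore", "full_text", "good building constructor"
)
print(url)
```

Opening the URL in a browser or with an HTTP client should show the parsed query in the debug section of the response, which is the part the message recommends inspecting.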

missing in json facet does not work for stream?

2015-10-23 Thread hao jin
Hi I found when the method of json facet is set to stream, the "missing" is not added to the result. Is it designed or a known issue? Thanks