date:20140316

[jira] [Updated] (LUCENE-5513) Binary DocValues Updates

2014-03-16 Thread Shai Erera (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-5513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-5513:
---

Attachment: LUCENE-5513.patch

Patch makes the following refactoring changes (all internal API):

* DocValuesUpdate abstract class w/ common implementation for 
NumericDocValuesUpdate and BinaryDocValuesUpdate.

* DocValuesFieldUpdates holds the doc+updates for a single field. It mostly 
defines the API for the Numeric* and Binary* implementations.

* DocValuesFieldUpdates.Container holds numeric+binary updates for a set of 
fields. It is as its name says -- a container of updates used by 
ReadersAndUpdates.
** It helps avoid bloating the API w/ more maps being passed around, and it 
simplifies BufferedUpdatesStream and IndexWriter.commitMergedDeletes.
** It also serves as a factory, based on the update's Type.

* Finished TestBinaryDVUpdates

* Added TestMixedDVUpdates which ports some of the 'big' tests from both 
TestNDV/BDVUpdates and mixes some NDV and BDV updates.
** I'll beast it some to make sure all edge cases are covered.
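
As a rough illustration of the shape described in the bullets above (not the code in the attached patch), the sketch below assumes an abstract update class carrying the common state, two concrete subclasses, a per-field holder, and a container that doubles as a factory keyed on the update type. Every class body, field name, and method signature here is hypothetical, and the per-field holder is collapsed into a single class for brevity.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: class names echo the comment above, bodies are invented.
abstract class DocValuesUpdateSketch {
  enum Type { NUMERIC, BINARY }

  final Type type;
  final String field;
  final Object value;

  DocValuesUpdateSketch(Type type, String field, Object value) {
    this.type = type;
    this.field = field;
    this.value = value;
  }

  // Common implementation shared by the numeric and binary variants.
  @Override
  public String toString() {
    return "field=" + field + ",type=" + type + ",value=" + value;
  }
}

final class NumericUpdateSketch extends DocValuesUpdateSketch {
  NumericUpdateSketch(String field, long value) {
    super(Type.NUMERIC, field, Long.valueOf(value));
  }
}

final class BinaryUpdateSketch extends DocValuesUpdateSketch {
  BinaryUpdateSketch(String field, byte[] value) {
    super(Type.BINARY, field, value);
  }
}

// Holds the doc -> value updates for a single field.
final class FieldUpdatesSketch {
  final String field;
  final DocValuesUpdateSketch.Type type;
  final Map<Integer, Object> docToValue = new HashMap<>();

  FieldUpdatesSketch(String field, DocValuesUpdateSketch.Type type) {
    this.field = field;
    this.type = type;
  }

  void add(int doc, DocValuesUpdateSketch update) {
    if (update.type != type || !update.field.equals(field)) {
      throw new IllegalArgumentException("wrong field or type: " + update);
    }
    docToValue.put(doc, update.value);
  }
}

// One object to pass around instead of a separate map per update type.
final class UpdatesContainerSketch {
  private final Map<String, FieldUpdatesSketch> numeric = new HashMap<>();
  private final Map<String, FieldUpdatesSketch> binary = new HashMap<>();

  boolean any() {
    return !numeric.isEmpty() || !binary.isEmpty();
  }

  private Map<String, FieldUpdatesSketch> mapFor(DocValuesUpdateSketch.Type type) {
    switch (type) {
      case NUMERIC: return numeric;
      case BINARY: return binary;
      default: throw new IllegalArgumentException("unsupported type: " + type);
    }
  }

  // Returns the updates registered for the field, or null if there are none.
  FieldUpdatesSketch getUpdates(String field, DocValuesUpdateSketch.Type type) {
    return mapFor(type).get(field);
  }

  // Factory keyed on the update type; a field may be registered once per type.
  FieldUpdatesSketch newUpdates(String field, DocValuesUpdateSketch.Type type) {
    FieldUpdatesSketch updates = new FieldUpdatesSketch(field, type);
    if (mapFor(type).put(field, updates) != null) {
      throw new IllegalStateException("updates already exist for field " + field);
    }
    return updates;
  }
}
```

With a container like this, a caller hands one object to the code that applies updates rather than one map per update type, which is the API simplification the comment refers to.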

I may take a crack at simplifying IW.commitMergedDeletes even more by pulling a 
lot of duplicate code into a method. This is impossible now because those 
sections modify more than one state variable, but I'll try to stuff these 
variables in a container to make this method more sane to read.
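
The idea in the paragraph above can be sketched generically: when duplicated sections each update several local variables, moving those variables into a small holder object lets one helper method do the update. The names `MergeStateSketch`, `advance`, and the two counters are made up for illustration and are not taken from IndexWriter.

```java
// Hypothetical illustration of the "stuff the variables in a container" idea.
final class MergeStateSketch {
  int docUpto;   // live docs seen so far
  int delCount;  // deleted docs seen so far
}

final class CommitMergedDeletesSketch {
  // A single helper replaces the copies that each updated both counters inline.
  static void advance(MergeStateSketch state, boolean deleted) {
    if (deleted) {
      state.delCount++;
    } else {
      state.docUpto++;
    }
  }

  static MergeStateSketch run(boolean[] deletedFlags) {
    MergeStateSketch state = new MergeStateSketch();
    for (boolean deleted : deletedFlags) {
      advance(state, deleted);
    }
    return state;
  }
}
```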

Otherwise, I think it's ready.

> Binary DocValues Updates
> 
>
> Key: LUCENE-5513
> URL: https://issues.apache.org/jira/browse/LUCENE-5513
> Project: Lucene - Core
>  Issue Type: Wish
>  Components: core/index
>Reporter: Mikhail Khludnev
>Priority: Minor
> Attachments: LUCENE-5513.patch, LUCENE-5513.patch
>
>
> LUCENE-5189 was a great move forward, and I wish to continue it. The reason for 
> having this feature is to have a "join-index" - to write children docnums into 
> the parent's binaryDV. I can try to proceed with the implementation, but I'm not so 
> experienced in such deep Lucene internals. [~shaie], any hint to begin with 
> is much appreciated. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Updated] (LUCENE-5532) AutomatonQuery.hashCode is not thread safe

2014-03-16 Thread Robert Muir (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-5532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-5532:


Attachment: LUCENE-5532.patch

Same patch, just with some reordering of things in RunAutomaton.equals to 
make it faster.

> AutomatonQuery.hashCode is not thread safe
> --
>
> Key: LUCENE-5532
> URL: https://issues.apache.org/jira/browse/LUCENE-5532
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
> Attachments: LUCENE-5532.patch, LUCENE-5532.patch
>
>
> This hashCode is implemented based on  #states and #transitions.
> These methods use getNumberedStates() though, which may oversize itself 
> during construction and then "size down" when it's done. But numberedStates is 
> prematurely set (before it's "ready"), which can cause a hashCode call from 
> another thread to see a corrupt state... causing things like NPEs from null 
> states and other strangeness. I don't think we should set this variable until 
> it's "finished".




[jira] [Updated] (LUCENE-5532) AutomatonQuery.hashCode is not thread safe

2014-03-16 Thread Robert Muir (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-5532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-5532:


Attachment: LUCENE-5532.patch

Here's a test for the thread-safety bug; it fails like this:
{noformat}
Caused by: java.lang.NullPointerException
at org.apache.lucene.util.automaton.Automaton.getNumberOfTransitions(Automaton.java:543)
at org.apache.lucene.search.AutomatonQuery.hashCode(AutomatonQuery.java:84)
at org.apache.lucene.search.TestAutomatonQuery$1.run(TestAutomatonQuery.java:228)
{noformat}
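
A test of this kind can be sketched roughly as follows: several threads are released at once, each calls hashCode() on the same shared object, and any exception or mismatching value is recorded. This is a generic harness and not the test in the patch; `ConcurrentHashCodeCheck` and its parameters are hypothetical names.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

// Generic harness; hypothetical, not the test from the patch.
final class ConcurrentHashCodeCheck {
  // Calls shared.hashCode() from numThreads threads at once.
  // Returns null if every call succeeded with the same value,
  // otherwise the first failure that was recorded.
  static Throwable check(final Object shared, int numThreads) {
    final CountDownLatch start = new CountDownLatch(1);
    final AtomicReference<Throwable> failure = new AtomicReference<>();
    final int[] hashes = new int[numThreads];
    Thread[] threads = new Thread[numThreads];
    for (int i = 0; i < numThreads; i++) {
      final int slot = i;
      threads[i] = new Thread(() -> {
        try {
          start.await();                   // line all threads up first
          hashes[slot] = shared.hashCode();
        } catch (Throwable t) {
          failure.compareAndSet(null, t);  // keep only the first failure
        }
      });
      threads[i].start();
    }
    start.countDown();                     // release every thread together
    for (Thread t : threads) {
      try {
        t.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        failure.compareAndSet(null, e);
      }
    }
    for (int h : hashes) {
      if (h != hashes[0]) {
        failure.compareAndSet(null, new AssertionError("hashCode differed between threads"));
      }
    }
    return failure.get();
  }
}
```

A race like the one in this issue is timing dependent, so a harness of this shape would normally be run many times, or with many threads, before a failure shows up.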

This patch takes a different approach: it doesn't assert "same language" at all. 
That's expensive, and I don't think it's nearly as important as things like being 
thread-safe, hashCode being consistent with equals, etc.

So we just impl hashCode/equals with the compiled form. If the automata have 
the same structure (e.g. same regex or wildcard), equals will return true. The 
previous stuff was overkill anyway, because e.g. foo* would not equate to 
foo** since the "term" is different!
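
Equality on the compiled structure, rather than on the language accepted, might look roughly like the sketch below. The class is a stand-in with a flat transition table; it is not Lucene's CompiledAutomaton or RunAutomaton, and the field names and the order of the comparisons are assumptions.

```java
import java.util.Arrays;

// Hypothetical stand-in for an immutable compiled form.
final class CompiledFormSketch {
  private final int initialState;
  private final boolean[] accept;
  private final int[] transitions;
  private final int hash;  // computed once in the constructor, never recomputed

  CompiledFormSketch(int initialState, boolean[] accept, int[] transitions) {
    this.initialState = initialState;
    this.accept = accept.clone();            // defensive copies keep the object immutable
    this.transitions = transitions.clone();
    int h = 31 * initialState + Arrays.hashCode(this.accept);
    this.hash = 31 * h + Arrays.hashCode(this.transitions);
  }

  @Override
  public int hashCode() {
    return hash;
  }

  @Override
  public boolean equals(Object obj) {
    if (this == obj) return true;
    if (!(obj instanceof CompiledFormSketch)) return false;
    CompiledFormSketch other = (CompiledFormSketch) obj;
    // Cheap comparisons first, full array comparisons last.
    return hash == other.hash
        && initialState == other.initialState
        && Arrays.equals(accept, other.accept)
        && Arrays.equals(transitions, other.transitions);
  }
}
```

Because nothing is mutated after construction and all fields are final, hashCode() has no intermediate state for another thread to observe. The trade-off is the one accepted above: two automata that accept the same language but were built differently compare as unequal.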

I also made getNumberedStates a little less trappy, even though it's no longer 
used here by this stuff.

> AutomatonQuery.hashCode is not thread safe
> --
>
> Key: LUCENE-5532
> URL: https://issues.apache.org/jira/browse/LUCENE-5532
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
> Attachments: LUCENE-5532.patch
>
>
> This hashCode is implemented based on  #states and #transitions.
> These methods use getNumberedStates() though, which may oversize itself 
> during construction and then "size down" when it's done. But numberedStates is 
> prematurely set (before it's "ready"), which can cause a hashCode call from 
> another thread to see a corrupt state... causing things like NPEs from null 
> states and other strangeness. I don't think we should set this variable until 
> it's "finished".




[jira] [Commented] (LUCENE-2508) Consolidate Highlighter implementations and a major refactor of the non-termvector highlighter

2014-03-16 Thread Scott Stults (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-2508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937464#comment-13937464
 ] 

Scott Stults commented on LUCENE-2508:
--

This is great! IIRC, one of the todo's in the FVH was to properly integrate it 
with the existing highlighter. One thing I'm wondering is whether this should 
be expanded to take in the postings highlighter, or make that integration a 
follow-on issue. (For one minor example, DefaultPassageFormatter has an HTML 
escape function that can be shared.)
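
A shared escape helper along those lines could be as small as the sketch below. It is written from scratch for illustration and is not copied from DefaultPassageFormatter; the class name `HtmlEscapeSketch` and the exact set of escaped characters are assumptions.

```java
// Hypothetical shared helper; not the Lucene implementation.
final class HtmlEscapeSketch {
  private HtmlEscapeSketch() {}

  // Replaces the characters that are unsafe in HTML text and attribute values.
  static String escape(CharSequence text) {
    StringBuilder out = new StringBuilder(text.length() + 16);
    for (int i = 0; i < text.length(); i++) {
      char ch = text.charAt(i);
      switch (ch) {
        case '&': out.append("&amp;"); break;
        case '<': out.append("&lt;"); break;
        case '>': out.append("&gt;"); break;
        case '"': out.append("&quot;"); break;
        case '\'': out.append("&#x27;"); break;
        default: out.append(ch);
      }
    }
    return out.toString();
  }
}
```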


> Consolidate Highlighter implementations and a major refactor of the 
> non-termvector highlighter
> --
>
> Key: LUCENE-2508
> URL: https://issues.apache.org/jira/browse/LUCENE-2508
> Project: Lucene - Core
>  Issue Type: New Feature
>  Components: modules/highlighter
> Environment: irrelevant
>Reporter: Edward Drapkin
>Priority: Minor
>  Labels: highlight, search
> Fix For: 4.8
>
> Attachments: LUCENE-2508.patch
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Originally, I had planned to create a contrib module to allow people to 
> highlight multiple documents in parallel, but after talking to Uwe in IRC 
> about it, I realized that it was pretty useless.  However, I was already 
> sitting on an iterative highlighting algorithm that was much faster (my tests 
> show 20% - 40%) and more accurate and, based on that same IRC conversation, I 
> decided to not let all the work that I had done go to waste and try to 
> contribute it back again.  Uwe had mentioned that "More like this" detected 
> term vectors when called and use the term vector implementation when 
> possible, if I recall correctly, so I decided to do that.
> The patch that I've attached is my first stab at this.  It's not nearly 
> complete and full disclosure dictates that I say that it's not fully 
> documented and there are not any unit tests written.  I wanted to go ahead 
> and open an issue to get some feedback on the approach that I've taken as 
> well as the fact that it exists will be a proverbial kick in my pants to 
> continue working on it.
> In short, what I've changed:
> * Completely rewritten the non-tv highlighter to be faster and cleaner.  
> There is some small loss in functionality for now, namely the loss of the 
> GradientHighlighter (I just haven't done this yet) and the lack of exposure 
> of TermFragments and their scores (I can expose this if it is deemed 
> necessary, this is one of the things I'd like feedback on). 
> * Moved org.apache.lucene.search.vectorhighlight and 
> org.apache.lucene.search.highlight to a single package with a unified 
> interface, search.highlight (with two sub-packages: 
> search.highlight.termvector and search.highlight.iterative, respectively).
> * Unified the highlighted term formatting into a single interface: 
> highlighter/Formatter and both highlighters use this now.  
> What I need to do before I personally would consider this finished:
> * Finish documentation, most specifically on TermVectorHighlighter.  I 
> haven't done this now as I expect things to change up quite a bit before 
> they're finalized and I really hate writing documentation that goes to waste, 
> but I do intend to complete this bullet :)
> * "Flesh out" the API of search.highlight.Highlighter as it's very barebones 
> right now
> * Continue removing and consolidating duplicate functionality, like I've done 
> with the highlighted word tag generation.
> What I think I need feedback on, before I can proceed:
> * FastTermVectorHighlighter and the iterative highlighters need completely 
> different sets of information in order to work.  The approach I've taken is 
> exposing a vectorHighlight method in the unified interface and an 
> iterativeHighlight method, as well as a single highlight method that takes 
> all the information needed for either of them and I'm unsure if this is the 
> best way to do this.
> * The naming of things; I'm not sure if this is a big issue, or even an issue 
> at all, but I'd like to not break any conventions that may exist that I'm 
> unaware of.
> * How big of a deal is exposing the particular score of a segment from the 
> highlighting interface and does this need to be extended into the term vector 
> highlighting as well?
> * There are a lot of methods in the tv implementation that are marked 
> deprecated; since this release will almost definitely break backwards 
> compatibility anyway, are these safe to remove?
> * Any other input anyone else may have :)
> I'm going to continue to work on things that I can work on, at least unless 
> someone tells me I'm wasting my time and will look forward to hearing you 
> guys' feedback! :)
> As a sidenote because it does

[jira] [Commented] (LUCENE-5532) AutomatonQuery.hashCode is not thread safe

2014-03-16 Thread Robert Muir (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937460#comment-13937460
 ] 

Robert Muir commented on LUCENE-5532:
-

I am also unhappy about this:
{code}
  // we already minimized the automaton in the ctor, so
  // this hash code will be the same for automata that
  // are the same:
  int automatonHashCode = automaton.getNumberOfStates() * 3
      + automaton.getNumberOfTransitions() * 2;
{code}

This comment is out of date! So the whole algorithm is broken in some cases I 
think.
I would really prefer if we don't mess with mutable stuff in equals/hashcode. 
I think it would be better if this stuff was impl'ed in CompiledAutomaton? I'll 
give it a stab.

> AutomatonQuery.hashCode is not thread safe
> --
>
> Key: LUCENE-5532
> URL: https://issues.apache.org/jira/browse/LUCENE-5532
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>
> This hashCode is implemented based on  #states and #transitions.
> These methods use getNumberedStates() though, which may oversize itself 
> during construction and then "size down" when it's done. But numberedStates is 
> prematurely set (before it's "ready"), which can cause a hashCode call from 
> another thread to see a corrupt state... causing things like NPEs from null 
> states and other strangeness. I don't think we should set this variable until 
> it's "finished".




[jira] [Created] (LUCENE-5532) AutomatonQuery.hashCode is not thread safe

2014-03-16 Thread Robert Muir (JIRA)

Robert Muir created LUCENE-5532:
---

 Summary: AutomatonQuery.hashCode is not thread safe
 Key: LUCENE-5532
 URL: https://issues.apache.org/jira/browse/LUCENE-5532
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir


This hashCode is implemented based on  #states and #transitions.

These methods use getNumberedStates() though, which may oversize itself during 
construction and then "size down" when it's done. But numberedStates is 
prematurely set (before it's "ready"), which can cause a hashCode call from 
another thread to see a corrupt state... causing things like NPEs from null 
states and other strangeness. I don't think we should set this variable until 
it's "finished".
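
The fix suggested in the last sentence amounts to building the array in a local variable and assigning the field only once the array is complete. A minimal sketch of that pattern follows; `NumberedStatesSketch` and its fields are hypothetical and far simpler than Automaton, and the use of `volatile` is an assumption of this sketch rather than something stated in the issue.

```java
import java.util.Arrays;

// Hypothetical illustration of "do not publish until finished".
final class NumberedStatesSketch {
  private final int stateCount;
  private volatile String[] numberedStates;  // stays null until fully built

  NumberedStatesSketch(int stateCount) {
    this.stateCount = stateCount;
  }

  String[] getNumberedStates() {
    String[] result = numberedStates;
    if (result == null) {
      // Oversize a local scratch array while collecting states.
      String[] scratch = new String[Math.max(4, stateCount * 2)];
      int upto = 0;
      for (int i = 0; i < stateCount; i++) {
        scratch[upto++] = "s" + i;
      }
      // "Size down" on the local copy, so the trailing nulls never escape.
      result = Arrays.copyOf(scratch, upto);
      // Publish only the finished array.
      numberedStates = result;
    }
    return result;
  }
}
```

Two threads may still both build the array on a first concurrent call, but each one only ever sees either null or a complete array, never the oversized intermediate one.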




[JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.8.0-fcs-b132) - Build # 9820 - Still Failing!

2014-03-16 Thread Policeman Jenkins Server

Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/9820/
Java: 32bit/jdk1.8.0-fcs-b132 -client -XX:+UseParallelGC

1 tests failed.
FAILED:  org.apache.solr.client.solrj.impl.CloudSolrServerTest.testDistribSearch

Error Message:
java.util.concurrent.TimeoutException: Could not connect to ZooKeeper 
127.0.0.1:55729 within 45000 ms

Stack Trace:
org.apache.solr.common.SolrException: java.util.concurrent.TimeoutException: Could not connect to ZooKeeper 127.0.0.1:55729 within 45000 ms
at __randomizedtesting.SeedInfo.seed([4BF5A51B47F29610:CA132B0330ADF62C]:0)
at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:150)
at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:101)
at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:91)
at org.apache.solr.cloud.AbstractZkTestCase.buildZooKeeper(AbstractZkTestCase.java:89)
at org.apache.solr.cloud.AbstractZkTestCase.buildZooKeeper(AbstractZkTestCase.java:83)
at org.apache.solr.cloud.AbstractDistribZkTestBase.setUp(AbstractDistribZkTestBase.java:70)
at org.apache.solr.cloud.AbstractFullDistribZkTestBase.setUp(AbstractFullDistribZkTestBase.java:201)
at org.apache.solr.client.solrj.impl.CloudSolrServerTest.setUp(CloudSolrServerTest.java:78)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1617)
at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:860)
at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:876)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:359)
at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:783)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:443)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:835)
at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:737)
at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:771)
at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:782)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at org.apache.l

[jira] [Resolved] (SOLR-1604) Wildcards, ORs etc inside Phrase Queries

2014-03-16 Thread Erick Erickson (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved SOLR-1604.
--

   Resolution: Fixed
Fix Version/s: 5.0
   4.8

OK, quite a bit has been checked in for this, but I think it's all done. Let's 
see what happens now!

> Wildcards, ORs etc inside Phrase Queries
> 
>
> Key: SOLR-1604
> URL: https://issues.apache.org/jira/browse/SOLR-1604
> Project: Solr
>  Issue Type: Improvement
>  Components: query parsers, search
>Affects Versions: 1.4
>Reporter: Ahmet Arslan
>Assignee: Erick Erickson
>Priority: Minor
> Fix For: 4.8, 5.0
>
> Attachments: ASF.LICENSE.NOT.GRANTED--ComplexPhrase.zip, 
> ComplexPhrase-4.2.1.zip, ComplexPhrase-4.7.zip, ComplexPhrase.zip, 
> ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, 
> ComplexPhrase.zip, ComplexPhraseQueryParser.java, ComplexPhrase_solr_3.4.zip, 
> SOLR-1604-alternative.patch, SOLR-1604.patch, SOLR-1604.patch, 
> SOLR-1604.patch, SOLR-1604.patch, SOLR-1604.patch, SOLR-1604.patch, 
> SOLR1604.patch
>
>
> Solr Plugin for ComplexPhraseQueryParser (LUCENE-1486) which supports 
> wildcards, ORs, ranges, fuzzies inside phrase queries.




[jira] [Commented] (SOLR-1604) Wildcards, ORs etc inside Phrase Queries

2014-03-16 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937428#comment-13937428
 ] 

ASF subversion and git services commented on SOLR-1604:
---

Commit 1578218 from [~erickoerickson] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1578218 ]

SOLR-1604: Wildcards, ORs etc inside Phrase Queries or 
'ComplexPhraseQueryParser support in Solr'

> Wildcards, ORs etc inside Phrase Queries
> 
>
> Key: SOLR-1604
> URL: https://issues.apache.org/jira/browse/SOLR-1604
> Project: Solr
>  Issue Type: Improvement
>  Components: query parsers, search
>Affects Versions: 1.4
>Reporter: Ahmet Arslan
>Assignee: Erick Erickson
>Priority: Minor
> Attachments: ASF.LICENSE.NOT.GRANTED--ComplexPhrase.zip, 
> ComplexPhrase-4.2.1.zip, ComplexPhrase-4.7.zip, ComplexPhrase.zip, 
> ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, 
> ComplexPhrase.zip, ComplexPhraseQueryParser.java, ComplexPhrase_solr_3.4.zip, 
> SOLR-1604-alternative.patch, SOLR-1604.patch, SOLR-1604.patch, 
> SOLR-1604.patch, SOLR-1604.patch, SOLR-1604.patch, SOLR-1604.patch, 
> SOLR1604.patch
>
>
> Solr Plugin for ComplexPhraseQueryParser (LUCENE-1486) which supports 
> wildcards, ORs, ranges, fuzzies inside phrase queries.




[jira] [Commented] (LUCENE-4396) BooleanScorer should sometimes be used for MUST clauses

2014-03-16 Thread Da Huang (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937412#comment-13937412
 ] 

Da Huang commented on LUCENE-4396:
--

I'm revising and polishing my proposal these days, and I have discovered an 
interesting thing. That is: if BooleanScorer supports required scorers in the 
way I have proposed, docIDs would be in ascending order in the bucket table. I 
think this can make BooleanScorer be a Not-Top Scorer, as .advance() .docID() 
.nextDoc() etc. can be implemented. However, I'm not sure how it would affect 
the performance when it acts as a Not-Top Scorer. This is because when 
.nextDoc() or .advance() is called, BooleanScorer may calculate a 2K window 
whose data may not be all useful.
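
To make the trade-off concrete, here is a simplified sketch of window-at-a-time matching with required clauses. It is not Lucene's BooleanScorer and not the proposal itself: clauses are plain sorted int arrays, scores are ignored, and the class name, fields, and the `NO_MORE_DOCS` sentinel value are assumptions. It only shows that scanning the bucket table by index yields ascending docIDs, so nextDoc() and advance() can be layered on top, at the cost of filling a whole window even when few of its buckets are needed.

```java
import java.util.Arrays;

// Simplified stand-in for window-at-a-time matching; not Lucene's BooleanScorer.
final class WindowedScorerSketch {
  static final int WINDOW = 2048;                  // the "2K window"
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  private final int[][] required;                  // MUST clauses: ascending doc IDs
  private final int[][] optional;                  // SHOULD clauses: ascending doc IDs
  private final int[] reqPos;                      // read cursor per MUST clause
  private final int[] optPos;                      // read cursor per SHOULD clause
  private final int[] reqCount = new int[WINDOW];  // per bucket: MUST clauses matched
  private final int[] optCount = new int[WINDOW];  // per bucket: SHOULD clauses matched
  private final int maxDoc;                        // one past the largest doc ID
  private int windowBase = -WINDOW;                // first doc ID of the loaded window
  private int slot = WINDOW;                       // next bucket to inspect
  private int doc = -1;

  WindowedScorerSketch(int[][] required, int[][] optional) {
    this.required = required;
    this.optional = optional;
    this.reqPos = new int[required.length];
    this.optPos = new int[optional.length];
    int max = 0;
    for (int[] docs : required) if (docs.length > 0) max = Math.max(max, docs[docs.length - 1] + 1);
    for (int[] docs : optional) if (docs.length > 0) max = Math.max(max, docs[docs.length - 1] + 1);
    this.maxDoc = max;
  }

  int docID() {
    return doc;
  }

  int nextDoc() {
    while (true) {
      while (slot < WINDOW) {                      // buckets are visited in ascending order
        int s = slot++;
        boolean allRequired = reqCount[s] == required.length;
        if (allRequired && (required.length > 0 || optCount[s] > 0)) {
          return doc = windowBase + s;
        }
      }
      int nextBase = windowBase + WINDOW;
      if (nextBase >= maxDoc) return doc = NO_MORE_DOCS;
      fillWindow(nextBase);                        // fills all 2048 buckets, useful or not
    }
  }

  int advance(int target) {
    if (target >= maxDoc) {
      windowBase = maxDoc;                         // park past the end so nextDoc() stays exhausted
      slot = WINDOW;
      return doc = NO_MORE_DOCS;
    }
    int base = target - (target % WINDOW);         // start of the window holding target
    if (base > windowBase) fillWindow(base);
    slot = Math.max(slot, target - windowBase);    // never move backwards
    return nextDoc();
  }

  private void fillWindow(int base) {
    Arrays.fill(reqCount, 0);
    Arrays.fill(optCount, 0);
    fill(required, reqPos, reqCount, base);
    fill(optional, optPos, optCount, base);
    windowBase = base;
    slot = 0;
  }

  private static void fill(int[][] clauses, int[] pos, int[] counts, int base) {
    long end = (long) base + WINDOW;
    for (int c = 0; c < clauses.length; c++) {
      int[] docs = clauses[c];
      int p = pos[c];
      while (p < docs.length && docs[p] < base) p++;                       // skip docs before the window
      while (p < docs.length && docs[p] < end) counts[docs[p++] - base]++; // mark docs inside it
      pos[c] = p;
    }
  }
}
```

In this sketch a call such as advance(2000) on a fresh instance still fills the whole first window before moving on, which is the kind of possibly wasted work the comment is worried about.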

I hope I have made my idea clear.

> BooleanScorer should sometimes be used for MUST clauses
> ---
>
> Key: LUCENE-4396
> URL: https://issues.apache.org/jira/browse/LUCENE-4396
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>
> Today we only use BooleanScorer if the query consists of SHOULD and MUST_NOT.
> If there is one or more MUST clauses we always use BooleanScorer2.
> But I suspect that unless the MUST clauses have very low hit count compared 
> to the other clauses, that BooleanScorer would perform better than 
> BooleanScorer2.  BooleanScorer still has some vestiges from when it used to 
> handle MUST so it shouldn't be hard to bring back this capability ... I think 
> the challenging part might be the heuristics on when to use which (likely we 
> would have to use firstDocID as proxy for total hit count).
> Likely we should also have BooleanScorer sometimes use .advance() on the subs 
> in this case, eg if suddenly the MUST clause skips 100 docs then you want 
> to .advance() all the SHOULD clauses.
> I won't have near term time to work on this so feel free to take it if you 
> are inspired!




[jira] [Comment Edited] (LUCENE-4396) BooleanScorer should sometimes be used for MUST clauses

2014-03-16 Thread Da Huang (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937412#comment-13937412
 ] 

Da Huang edited comment on LUCENE-4396 at 3/17/14 2:14 AM:
---

I'm revising and polishing my proposal these days, and I have discovered an 
interesting thing. That is: if BooleanScorer supports required scorers in the 
way I have proposed, docIDs would be in ascending order in the bucket table. I 
think this can make BooleanScorer be a Not-Top Scorer, as .advance() .docID() 
.nextDoc() etc. can be implemented. However, I'm not sure how it would affect 
the performance when it acts as a Not-Top Scorer. This is because when 
.nextDoc() or .advance() is called, BooleanScorer may calculate a 2K window 
whose data may not be all useful.

I hope I have made my idea clear.


was (Author: dhuang):
I'm revising and polishing my proposal these days, and I have discovered a 
interesting thing. That is: if BooleanScorer supports required scorers in the 
way I have proposed, docIDs would be in acsending order in the bucket table. I 
think this can make BooleanScorer be a Not-Top Scorer, as .advance() .docID() 
.nextDoc() etc. can be implemented. However, I'm not sure how it would affect 
the performance when it acts as a Not-Top Scorer. This is because when 
.nextDoc() or .advance() is called, BooleanScorer may calculate a 2K window 
whose data may not be all useful.

I hope I have made my idea clear.

> BooleanScorer should sometimes be used for MUST clauses
> ---
>
> Key: LUCENE-4396
> URL: https://issues.apache.org/jira/browse/LUCENE-4396
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>
> Today we only use BooleanScorer if the query consists of SHOULD and MUST_NOT.
> If there is one or more MUST clauses we always use BooleanScorer2.
> But I suspect that unless the MUST clauses have very low hit count compared 
> to the other clauses, that BooleanScorer would perform better than 
> BooleanScorer2.  BooleanScorer still has some vestiges from when it used to 
> handle MUST so it shouldn't be hard to bring back this capability ... I think 
> the challenging part might be the heuristics on when to use which (likely we 
> would have to use firstDocID as proxy for total hit count).
> Likely we should also have BooleanScorer sometimes use .advance() on the subs 
> in this case, eg if suddenly the MUST clause skips 100 docs then you want 
> to .advance() all the SHOULD clauses.
> I won't have near term time to work on this so feel free to take it if you 
> are inspired!




[jira] [Commented] (SOLR-1604) Wildcards, ORs etc inside Phrase Queries

2014-03-16 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937398#comment-13937398
 ] 

ASF subversion and git services commented on SOLR-1604:
---

Commit 1578200 from [~erickoerickson] in branch 'dev/trunk'
[ https://svn.apache.org/r1578200 ]

SOLR-1604: Wildcards, ORs etc inside Phrase Queries or 
'ComplexPhraseQueryParser support in Solr'

> Wildcards, ORs etc inside Phrase Queries
> 
>
> Key: SOLR-1604
> URL: https://issues.apache.org/jira/browse/SOLR-1604
> Project: Solr
>  Issue Type: Improvement
>  Components: query parsers, search
>Affects Versions: 1.4
>Reporter: Ahmet Arslan
>Assignee: Erick Erickson
>Priority: Minor
> Attachments: ASF.LICENSE.NOT.GRANTED--ComplexPhrase.zip, 
> ComplexPhrase-4.2.1.zip, ComplexPhrase-4.7.zip, ComplexPhrase.zip, 
> ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, 
> ComplexPhrase.zip, ComplexPhraseQueryParser.java, ComplexPhrase_solr_3.4.zip, 
> SOLR-1604-alternative.patch, SOLR-1604.patch, SOLR-1604.patch, 
> SOLR-1604.patch, SOLR-1604.patch, SOLR-1604.patch, SOLR-1604.patch, 
> SOLR1604.patch
>
>
> Solr Plugin for ComplexPhraseQueryParser (LUCENE-1486) which supports 
> wildcards, ORs, ranges, fuzzies inside phrase queries.




[jira] [Resolved] (SOLR-4356) SOLR 4.1 Out Of Memory error After commit of a few thousand Solr Docs

2014-03-16 Thread Erick Erickson (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-4356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved SOLR-4356.
--

Resolution: Invalid

> SOLR 4.1 Out Of Memory error After commit of a few thousand Solr Docs
> -
>
> Key: SOLR-4356
> URL: https://issues.apache.org/jira/browse/SOLR-4356
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - java, Schema and Analysis, Tests
>Affects Versions: 4.1
> Environment: OS = Ubuntu 12.04
> Sun JAVA 7
> Max Java Heap Space = 2GB
> Apache Tomcat 7
> Hardware = {Intel core i3, 2GB RAM}
> Average no of fields in a Solr Doc = 100
>Reporter: Harish Verma
>  Labels: performance, test
> Fix For: 4.8
>
> Attachments: memorydump1.png, memorydump2.png
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> we are testing solr 4.1 running inside tomcat 7 and java 7 with  following 
> options
> JAVA_OPTS="-Xms256m -Xmx2048m -XX:MaxPermSize=1024m -XX:+UseConcMarkSweepGC 
> -XX:+CMSIncrementalMode -XX:+ParallelRefProcEnabled 
> -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/home/ubuntu/OOM_HeapDump"
> our source code looks like following:
> /* START */
> int noOfSolrDocumentsInBatch = 0;
> for (int i = 0; i < 5000; i++) {
>     SolrInputDocument solrInputDocument = getNextSolrInputDocument();
>     server.add(solrInputDocument);
>     noOfSolrDocumentsInBatch += 1;
>     if (noOfSolrDocumentsInBatch == 10) {
>         server.commit();
>         noOfSolrDocumentsInBatch = 0;
>     }
> }
> /* END */
> The method "getNextSolrInputDocument()" generates a Solr document with 100 
> fields (on average). Around 50 of the fields are of "text_general" type.
> Some of the "text_general" fields consist of approx 1000 words; the rest consist 
> of a few words. Out of the total fields, there are around 35-40 multivalued fields 
> (not of type "text_general").
> We are indexing all the fields but storing only 8 fields. Out of these 8 
> fields two are string type, five are long and one is boolean. So our index 
> size is only 394 MB. But the RAM occupied at the time of OOM is around 2.5 GB. 
> Why is the memory so high even though the index size is small?
> What is being stored in the memory? Our understanding is that after every 
> commit documents are flushed to the disk. So nothing should remain in RAM 
> after commit.
> We are using the following settings:
> server.commit() set waitForSearcher=true and waitForFlush=true
> solrConfig.xml has following properties set:
> directoryFactory = solr.MMapDirectoryFactory
> maxWarmingSearchers = 1
> text_general data type is being used as supplied in the schema.xml with the 
> solr setup.
> maxIndexingThreads = 8(default)
> autoCommit: maxTime = 15000, openSearcher = false
> We get a Java heap Out Of Memory Error after committing around 3990 Solr 
> documents. Some of the snapshots of the memory dump from the profiler are attached.
> Can somebody please suggest what we should do to minimize/optimize the memory 
> consumption in our case, with the reasons?
> Also, please suggest the optimal values, with reasons, for the following 
> parameters of solrConfig.xml: 
> useColdSearcher - true/false?
> maxwarmingsearchers- number
> spellcheck-on/off?
> omitNorms=true/false?
> omitTermFreqAndPositions?
> mergefactor? we are using default value 10
> java garbage collection tuning parameters ?




[JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.7.0_51) - Build # 9817 - Failure!

2014-03-16 Thread Policeman Jenkins Server

Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/9817/
Java: 64bit/jdk1.7.0_51 -XX:+UseCompressedOops -XX:+UseG1GC -XX:-UseSuperWord

All tests passed

Build Log:
[...truncated 43715 lines...]
BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:467: The following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:63: The following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build.xml:208: The following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build.xml:543: The following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/common-build.xml:2342: java.net.UnknownHostException: issues.apache.org
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:178)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:579)
    at sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:618)
    at sun.security.ssl.BaseSSLSocketImpl.connect(BaseSSLSocketImpl.java:160)
    at sun.net.NetworkClient.doConnect(NetworkClient.java:180)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
    at sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:275)
    at sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:371)
    at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:191)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:932)
    at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:177)
    at sun.net.www.protocol.https.HttpsURLConnectionImpl.connect(HttpsURLConnectionImpl.java:153)
    at org.apache.tools.ant.taskdefs.Get$GetThread.openConnection(Get.java:660)
    at org.apache.tools.ant.taskdefs.Get$GetThread.get(Get.java:579)
    at org.apache.tools.ant.taskdefs.Get$GetThread.run(Get.java:569)

Total time: 65 minutes 5 seconds
Build step 'Invoke Ant' marked build as failure
Description set: Java: 64bit/jdk1.7.0_51 -XX:+UseCompressedOops -XX:+UseG1GC 
-XX:-UseSuperWord
Archiving artifacts
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure




[jira] [Resolved] (SOLR-5434) Create minimal solrcloud example directory

2014-03-16 Thread Alan Woodward (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-5434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward resolved SOLR-5434.
-

Resolution: Won't Fix

I think I agree with Jan here.  An extra example dir is going to end up as just 
more noise.

> Create minimal solrcloud example directory
> --
>
> Key: SOLR-5434
> URL: https://issues.apache.org/jira/browse/SOLR-5434
> Project: Solr
>  Issue Type: Improvement
>Reporter: Alan Woodward
>Assignee: Alan Woodward
>Priority: Minor
> Fix For: 4.8
>
>
> The various "intro to solr cloud" pages (for example 
> https://cwiki.apache.org/confluence/display/solr/Getting+Started+with+SolrCloud)
>  currently tell new users to use the example/ directory as a basis for 
> setting up new cloud instances.  These directories contain, under the default 
> solr/ solr home directory, a single core, defined to point to the collection1 
> collection.
> It's not at all obvious that, to change the name of your collection, you have 
> to go and edit the core.properties file underneath the solr/ directory.  A 
> lot of users on the mailing list also seem to get confused by having to 
> include bootstrap_confdir and numShards the first time they run solr, but not 
> afterwards.  So here's a suggestion:
> * Have a new solrcloud/ directory in the example webapp that just contains a 
> solr.xml file
> * Change the startup example code to just include -Dsolr.solr.home and -DzkRun
> * Tell the user to then run zkcli to bootstrap their configuration (solr 
> startup and configuration loading are kept separate)
> * Tell the users to use the collections API to create a new collection, 
> naming it however they want (confignames, collection names and core names are 
> all kept separate)
> This way, there's a lot less 'magic' and hidden defaults involved, and all 
> the steps to get a cloud up and running (start processes, upload 
> configuration, create collection) are made distinguishable.




RE: Reducing the number of warnings in the codebase

2014-03-16 Thread Robert Muir

I agree with Uwe. Start with javac, use the -Werror flag, and fix those first.
Stupid warnings like serialization are already disabled for javac in the
build.
On Mar 16, 2014 4:48 PM, "Uwe Schindler"  wrote:

> Hi,
>
> > Just because some tool expresses distaste, doesn't imply that everyone
> here
> > agrees that it's a problem we should fix.
>
> Yes that is my biggest problem. Lots of warnings by Eclipse are just
> bullshit because of the code style in Lucene and for example the way we do
> things - e.g., it complains about missing close() all the time, just
> because we use IOUtils.closeWhileHandlingException() for that.
>
> > In my experience, the default Sonar rulesets contain many things that
> people
> > here are prone to disagree with. Start with serialVersionUID:
> > do we care? Why would we care? In what cases do we really believe that a
> > sane person would be using Java serialization with a Lucene/Solr class?
>
> We officially don't support serialization, so all warnings are useless.
> It's just Eclipse that complains for no reason.
>
> > Sonar can also be a bit cranky; it arranges for various tools to run via
> > mechanisms that sometimes conflict with the ways you might run them
> > yourself.
> >
> > So I'd suggest a process like:
> >
> > 1. Someone proposes a set of (e.g.) checkstyle rules to live by.
> > 2. That ruleset is refined by experiment.
> > 3. We make violations fail the build.
> >
> > Then lather, rinse, repeat for other tools.
>
> Yes I agree. I am strongly against PMD or CheckStyle without our own
> rules. Forbidden-apis was invented because PMD and CheckStyle are too
> broken to detect default Locale/Charset/Timezone violations (and also
> because those tools are slow).
> We should instead fix our Eclipse project generation to hide the warnings
> that are just wrong.
>
> I would prefer: Before we fix warnings by 3rd party tools like Eclipse, we
> should first fix only the warnings emitted by Javac. The others are just
> unimportant to me and I don't want to fix those which are just wrong for
> our code style.
>
> We already have ECJ in our build (to lint javadocs), so we can make some
> Eclipse warnings fatal through the ecj config file in our SVN, to fail the
> build on some warnings. I disagree with using PMD or Checkstyle; those
> tools are incomplete and broken, sorry.
>
> Uwe
>
> > Once we have rulesets we agree are worth enforcing, we can look to Sonar
> > for a pretty way to visualize their results if we like.
> >

[jira] [Commented] (LUCENE-5376) Add a demo search server

2014-03-16 Thread Chris Male (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937343#comment-13937343
 ] 

Chris Male commented on LUCENE-5376:


Sweet!

> Add a demo search server
> 
>
> Key: LUCENE-5376
> URL: https://issues.apache.org/jira/browse/LUCENE-5376
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Attachments: lucene-demo-server.tgz
>
>
> I think it'd be useful to have a "demo" search server for Lucene.
> Rather than being fully featured, like Solr, it would be minimal, just 
> wrapping the existing Lucene modules to show how you can make use of these 
> features in a server setting.
> The purpose is to demonstrate how one can build a minimal search server on 
> top of APIs like SearcherManager, SearcherLifetimeManager, etc.
> This is also useful for finding rough edges / issues in Lucene's APIs that 
> make building a server unnecessarily hard.
> I don't think it should have back compatibility promises (except Lucene's 
> index back compatibility), so it's free to improve as Lucene's APIs change.
> As a starting point, I'll post what I built for the "eating your own dog 
> food" search app for Lucene's & Solr's jira issues 
> http://jirasearch.mikemccandless.com (blog: 
> http://blog.mikemccandless.com/2013/05/eating-dog-food-with-lucene.html ). It 
> uses Netty to expose basic indexing & searching APIs via JSON, but it's very 
> rough (lots of nocommits).




Re: Reducing the number of warnings in the codebase

2014-03-16 Thread Benson Margulies

I think we avoid bikeshedding by making incremental changes. If you offer
a commit to turn off serial version UID whining, I'll +1 it. And then
we iterate, in small doses, agreeing to either spike the warning or
change the code.


In passing, I will warn you that the IDEs can be very stubborn; in
some cases, there is no way to avoid some amount of whining. Eclipse
used to insist on warning on every @SuppressWarnings that it didn't
understand. It might still.

On Sun, Mar 16, 2014 at 5:29 PM, Shawn Heisey  wrote:
> A starting comment: We could bikeshed for *years*.
>
> General thought: The more I think about it, the more I like the notion
> of confining most of the cleanup to trunk.  Actual bug fixes and changes
> that are relatively non-invasive should be backported.
>
> On 3/16/2014 2:48 PM, Uwe Schindler wrote:
>>> Just because some tool expresses distaste, doesn't imply that everyone here
>>> agrees that it's a problem we should fix.
>>
>> Yes that is my biggest problem. Lots of warnings by Eclipse are just 
>> bullshit because of the code style in Lucene and for example the way we do 
>> things - e.g., it complains about missing close() all the time, just because 
>> we use IOUtils.closeWhileHandlingException() for that.
>
> My original thought on this was that we should use a combination of
> SuppressWarnings and actual code changes to eliminate most of the
> warnings that show up in the well-supported IDEs when they are
> configured with *default* settings.
>
> Uwe brings up a really good point that there are a number of completely
> useless warnings, but I think there's still value in looking through
> EVERY default IDE warning and evaluating each one on a case-by-case
> basis to decide whether that specific warning should be fixed or
> ignored.  It could be a sort of background task with an open Jira for
> tracking commits.  It could also be something that we decide isn't worth
> the effort.
>
>>> In my experience, the default Sonar rulesets contain many things that people
>>> here are prone to disagree with. Start with serialVersionUID:
>>> do we care? Why would we care? In what cases do we really believe that a
>>> sane person would be using Java serialization with a Lucene/Solr class?
>>
>> We officially don't support serialization, so all warnings are useless. It's 
>> just Eclipse that complains for no reason.
>
> Project-specific IDE settings for errors/warnings (set by the ant build
> target) will go a long way towards making the whole situation better.
> For the current stable branch, we should include settings for anything
> that we want to ignore on trunk, but only a subset of those problems
> that get elevated to error status.
>
>>> Sonar can also be a bit cranky; it arranges for various tools to run via
>>> mechanisms that sometimes conflict with the ways you might run them
>>> yourself.
>>>
>>> So I'd suggest a process like:
>>>
>>> 1. Someone proposes a set of (e.g.) checkstyle rules to live by.
>>> 2. That ruleset is refined by experiment.
>>> 3. We make violations fail the build.
>>>
>>> Then lather, rinse, repeat for other tools.
>>
>> Yes I agree. I am strongly against PMD or CheckStyle without our own rules. 
>> Forbidden-apis was invented because PMD and CheckStyle are too broken 
>> to detect default Locale/Charset/Timezone violations (and also because those 
>> tools are slow).
>> We should instead fix our Eclipse project generation to hide the warnings that 
>> are just wrong.
>>
>> I would prefer: Before we fix warnings by 3rd party tools like Eclipse, we 
>> should first fix only the warnings emitted by Javac. The others are just 
>> unimportant to me and I don't want to fix those which are just wrong for our 
>> code style.
>>
>> We already have ECJ in our build (to lint javadocs), so we can make some 
>> Eclipse warnings fatal through the ecj config file in our SVN, to fail the build 
>> on some warnings. I disagree with using PMD or Checkstyle; those tools are 
>> incomplete and broken, sorry.
>
> +1 all around.  I want to eliminate all the noise.  If we had IDE
> warnings measured in dozens instead of thousands, it would be a useful
> data point that wouldn't get ignored.
>
> Thanks,
> Shawn
>
>

Re: Reducing the number of warnings in the codebase

2014-03-16 Thread Shawn Heisey

A starting comment: We could bikeshed for *years*.

General thought: The more I think about it, the more I like the notion
of confining most of the cleanup to trunk.  Actual bug fixes and changes
that are relatively non-invasive should be backported.

On 3/16/2014 2:48 PM, Uwe Schindler wrote:
>> Just because some tool expresses distaste, doesn't imply that everyone here
>> agrees that it's a problem we should fix.
> 
> Yes that is my biggest problem. Lots of warnings by Eclipse are just bullshit 
> because of the code style in Lucene and for example the way we do things - 
> e.g., it complains about missing close() all the time, just because we use 
> IOUtils.closeWhileHandlingException() for that.

My original thought on this was that we should use a combination of
SuppressWarnings and actual code changes to eliminate most of the
warnings that show up in the well-supported IDEs when they are
configured with *default* settings.
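
To make the SuppressWarnings idea concrete, here is a minimal sketch of the close() case Uwe mentions. It assumes a hypothetical helper, closeQuietly(), standing in for a close-everything utility (it is not Lucene's IOUtils), and it assumes the "resource" token, which to my knowledge is what Eclipse's compiler uses for resource-leak warnings. All class and method names are made up for illustration.

```java
import java.io.Closeable;
import java.io.IOException;

class SuppressResourceWarningSketch {

    /** Closeable that records whether close() ran and can be told to fail on close. */
    static final class Tracked implements Closeable {
        private final boolean failOnClose;
        boolean closed;

        Tracked(boolean failOnClose) {
            this.failOnClose = failOnClose;
        }

        @Override
        public void close() throws IOException {
            closed = true;
            if (failOnClose) {
                throw new IOException("simulated close failure");
            }
        }
    }

    /**
     * Hypothetical helper: closes every non-null resource and swallows
     * IOExceptions so that one failure does not prevent the other closes.
     * Returns the number of resources whose close() threw.
     */
    static int closeQuietly(Closeable... resources) {
        int failures = 0;
        for (Closeable c : resources) {
            if (c == null) {
                continue;
            }
            try {
                c.close();
            } catch (IOException e) {
                failures++;
            }
        }
        return failures;
    }

    /**
     * A static analyzer may not see that closeQuietly() closes 'a' and 'b'
     * and can report a leak here. The suppression is scoped to this one
     * method instead of being disabled project-wide.
     */
    @SuppressWarnings("resource")
    static int useAndClose() {
        Tracked a = new Tracked(false);
        Tracked b = new Tracked(true);
        return closeQuietly(a, b);
    }

    public static void main(String[] args) {
        System.out.println("close failures: " + useAndClose());
    }
}
```

Whether a given warning gets an annotation like this, a code change, or a project-wide ignore is exactly the case-by-case decision described in this thread.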

Uwe brings up a really good point that there are a number of completely
useless warnings, but I think there's still value in looking through
EVERY default IDE warning and evaluating each one on a case-by-case
basis to decide whether that specific warning should be fixed or
ignored.  It could be a sort of background task with an open Jira for
tracking commits.  It could also be something that we decide isn't worth
the effort.

>> In my experience, the default Sonar rulesets contain many things that people
>> here are prone to disagree with. Start with serialVersionUID:
>> do we care? Why would we care? In what cases do we really believe that a
>> sane person would be using Java serialization with a Lucene/Solr class?
> 
> We officially don't support serialization, so all warnings are useless. It's 
> just Eclipse that complains for no reason.

Project-specific IDE settings for errors/warnings (set by the ant build
target) will go a long way towards making the whole situation better.
For the current stable branch, we should include settings for anything
that we want to ignore on trunk, but only a subset of those problems
that get elevated to error status.

>> Sonar can also be a bit cranky; it arranges for various tools to run via
>> mechanisms that sometimes conflict with the ways you might run them
>> yourself.
>>
>> So I'd suggest a process like:
>>
>> 1. Someone proposes a set of (e.g.) checkstyle rules to live by.
>> 2. That ruleset is refined by experiment.
>> 3. We make violations fail the build.
>>
>> Then lather, rinse, repeat for other tools.
> 
> Yes I agree. I am strongly against PMD or CheckStyle without our own rules. 
> Forbidden-apis was invented because PMD and CheckStyle are too broken 
> to detect default Locale/Charset/Timezone violations (and also because those 
> tools are slow).
> We should instead fix our Eclipse project generation to hide the warnings that 
> are just wrong.
> 
> I would prefer: Before we fix warnings by 3rd party tools like Eclipse, we 
> should first fix only the warnings emitted by Javac. The others are just 
> unimportant to me and I don't want to fix those which are just wrong for our 
> code style.
> 
> We already have ECJ in our build (to lint javadocs), so we can make some Eclipse 
> warnings fatal through the ecj config file in our SVN, to fail the build on some 
> warnings. I disagree with using PMD or Checkstyle; those tools are incomplete 
> and broken, sorry.

+1 all around.  I want to eliminate all the noise.  If we had IDE
warnings measured in dozens instead of thousands, it would be a useful
data point that wouldn't get ignored.

Thanks,
Shawn


[jira] [Comment Edited] (SOLR-5488) Fix up test failures for Analytics Component

2014-03-16 Thread Houston Putman (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937298#comment-13937298
 ] 

Houston Putman edited comment on SOLR-5488 at 3/16/14 9:13 PM:
---

*** First Problem ***
In fieldFacets.txt, did you change it like this?
turning 'o.min.ff=string_sd' into 'o.min.ff=int_id' and 'o.min.ff=date_dtd' 
into 'o.min.ff=long_ld'

If so, that doesn't really make sense, because these are calculating the minimum 
integer value faceting over the string values, and the minimum long value 
faceting over the date values. If you made the changes above, then you would be 
calculating the minimum int value faceting over all of the int values, etc. So 
the change isn't a good test even if it passes.
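
A small sketch of why the changed facet is a weak test, under the assumption that a field facet groups documents by the distinct values of the facet field and computes the statistic per group. The Doc class and both method names are hypothetical and are not part of the analytics component.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

class MinFacetSketch {

    /** Hypothetical document with one int field and one string field. */
    static final class Doc {
        final int intVal;
        final String strVal;

        Doc(int intVal, String strVal) {
            this.intVal = intVal;
            this.strVal = strVal;
        }
    }

    /** Minimum int value per distinct string value; the TreeMap keeps buckets sorted by facet value. */
    static SortedMap<String, Integer> minIntByString(List<Doc> docs) {
        SortedMap<String, Integer> out = new TreeMap<>();
        for (Doc d : docs) {
            Integer cur = out.get(d.strVal);
            if (cur == null || d.intVal < cur) {
                out.put(d.strVal, d.intVal);
            }
        }
        return out;
    }

    /** Minimum int value per distinct int value: each bucket's minimum is just its own key. */
    static SortedMap<Integer, Integer> minIntByInt(List<Doc> docs) {
        SortedMap<Integer, Integer> out = new TreeMap<>();
        for (Doc d : docs) {
            Integer cur = out.get(d.intVal);
            if (cur == null || d.intVal < cur) {
                out.put(d.intVal, d.intVal);
            }
        }
        return out;
    }

    static List<Doc> sampleDocs() {
        List<Doc> docs = new ArrayList<>();
        docs.add(new Doc(3, "b"));
        docs.add(new Doc(1, "a"));
        docs.add(new Doc(2, "b"));
        return docs;
    }

    public static void main(String[] args) {
        System.out.println(minIntByString(sampleDocs())); // {a=1, b=2}
        System.out.println(minIntByInt(sampleDocs()));    // {1=1, 2=2, 3=3}
    }
}
```

In the second method every bucket trivially returns its own key, so a test built on it could pass without exercising the minimum calculation at all.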
*

*** Second Problem ***
So the ordering is somewhat important. Solr, and the analytics component, 
automatically sort the results of a field facet by the field being faceted on. I 
assumed this when writing the tests, so if the results are coming back 
misordered, then either the test or the analytics component is splitting up 
the values incorrectly, or the sorting gets messed up somewhere in the 
test. I don't think the sorting is an issue in the component.

Looking through the test code, a lot has been changed from what I wrote. The 
part that I don't recognize at all is the parsing of the response through 
methods like getDoubleList() etc. in AbstractAnalyticsFacetTest.java. Since the 
tests worked before those methods were changed, I would suggest looking at 
those parsing methods first. 

(Side note: the methods in FieldFacetTest.java that still have FacetAsc, like 
medianFacetAscTest and sumOfSquaresFacetAscTest, should be renamed without the 
FacetAsc part so that they are named like the rest of the methods. The 
FacetAsc functionality was taken out a while ago. So the methods mentioned 
above should be renamed to medianTest and sumOfSquaresTest, respectively, 
along with the similarly named methods.)
*

I would highly discourage you from changing the patterns in the 
fieldFacets.txt. I had trouble keeping all of that stuff straight while writing 
it and I don't think that is where the issue is. 

I'm not able to run the tests right now, so that's all of the help I can give.


was (Author: houstonputman):
*** First Problem ***
In fieldFacets.txt, did you change it like this?
turning 'o.min.ff=string_sd' into 'o.min.ff=int_id' and 'o.min.ff=date_dtd' 
into 'o.min.ff=long_ld'

If so, that doesn't really make sense, because these are calculating the minimum 
integer value faceting over the string values, and the minimum long value 
faceting over the date values. If you made the changes above, then you would be 
calculating the minimum int value faceting over all of the int values, etc. So 
the change isn't a good test even if it passes.
*

*** Second Problem ***
So the ordering is somewhat important. Solr, and the analytics component, 
automatically sort the results of a field facet by the field being faceted on. I 
assumed this when writing the tests, so if the results are coming back 
misordered, then either the test or the analytics component is splitting up 
the values incorrectly, or the sorting gets messed up somewhere in the 
test. I don't think the sorting is an issue in the component.

Looking through the test code, a lot has been changed from what I wrote. The 
part that I don't recognize at all is the parsing of the response through 
methods like getDoubleList() etc. in AbstractAnalyticsFacetTest.java. Since the 
tests worked before those methods were changed, I would suggest looking at 
those parsing methods first. 

(Side note: the methods in FieldFacetTest.java that still have FacetAsc, like 
medianFacetAscTest and sumOfSquaresFacetAscTest, should be renamed without the 
FacetAsc part so that they are named like the rest of the methods. The 
FacetAsc functionality was taken out a while ago. So the methods mentioned 
above should be renamed to medianTest and sumOfSquaresTest, respectively, 
along with the similarly named methods.)
*

I would highly discourage you from changing the patterns in the 
fieldFacets.txt. I had trouble keeping all of that stuff straight while writing 
it and I don't think that is where the issue is. 

I'm not able to run the tests right now, so that's all of the help I can give.

> Fix up test failures for Analytics Component
> 
>
> Key: SOLR-5488
> URL: https://issues.apache.org/jira/browse/SOLR-5488
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.7, 5.0
>Reporter: Erick Erickson
>Assignee: Erick Erickson
> Attachments: SOLR-5488.patch, SOLR-5488.patch, SOLR-5488.p

[jira] [Commented] (LUCENE-5376) Add a demo search server

2014-03-16 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937336#comment-13937336
 ] 

Michael McCandless commented on LUCENE-5376:


bq. What's motivated the new branch?

Oh, sorry, something went wrong with the merge props on the old branch, such 
that when I tried to merge as I always do ("svn merge ../trunk") it hit strange 
conflicts in files never changed on the branch and then stopped merging and 
asked me to resolve the conflicts and re-run the merge, which caused further 
conflicts in files that shouldn't have conflicted...

I figured it was easier to just rebranch.


[jira] [Updated] (LUCENE-5528) Add context to AnalyzingInfixSuggester

2014-03-16 Thread Michael McCandless (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-5528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-5528:
---

Attachment: LUCENE-5528.patch

New patch, fixing one issue I hit when I was folding this into 
http://jirasearch.mikemccandless.com: the contexts should all be OR'd 
together, not AND'd (i.e., if a suggestion has any of the contexts, it's accepted).
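
As a rough illustration of the OR versus AND distinction only (this is not the code in the patch, and the class and method names are hypothetical), the acceptance rule can be sketched with plain sets:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class ContextFilterSketch {

    /** OR semantics: accept if the suggestion carries at least one requested context. */
    static boolean acceptsAny(Set<String> suggestionContexts, Set<String> requested) {
        return !Collections.disjoint(suggestionContexts, requested);
    }

    /** AND semantics, shown only for contrast: every requested context must be present. */
    static boolean acceptsAll(Set<String> suggestionContexts, Set<String> requested) {
        return suggestionContexts.containsAll(requested);
    }

    public static void main(String[] args) {
        Set<String> suggestion = new HashSet<>(Arrays.asList("lucene", "solr"));
        Set<String> requested = new HashSet<>(Arrays.asList("solr", "infra"));
        System.out.println("OR accepts:  " + acceptsAny(suggestion, requested));
        System.out.println("AND accepts: " + acceptsAll(suggestion, requested));
    }
}
```

With the sample data, the suggestion shares only "solr" with the request, so the OR rule accepts it and the AND rule rejects it. How an empty set of requested contexts should behave is left open in this sketch.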

I think it's ready.

> Add context to AnalyzingInfixSuggester
> --
>
> Key: LUCENE-5528
> URL: https://issues.apache.org/jira/browse/LUCENE-5528
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 4.8, 5.0
>
> Attachments: LUCENE-5528.patch, LUCENE-5528.patch
>
>
> Spinoff from LUCENE-5350.




[jira] [Commented] (SOLR-1604) Wildcards, ORs etc inside Phrase Queries

2014-03-16 Thread Ahmet Arslan (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937334#comment-13937334
 ] 

Ahmet Arslan commented on SOLR-1604:


The following commands pass for me with the latest patch:
* ant -Dtests.disableHdfs=true -Dtests.badapples=false test
* ant -Dtestcase=QueryEqualityTest test
* ant -Dtestcase=TestComplexPhraseQuery test
* ant -Dtestcase=TestComplexPhraseQParserPlugin test

> Wildcards, ORs etc inside Phrase Queries
> 
>
> Key: SOLR-1604
> URL: https://issues.apache.org/jira/browse/SOLR-1604
> Project: Solr
>  Issue Type: Improvement
>  Components: query parsers, search
>Affects Versions: 1.4
>Reporter: Ahmet Arslan
>Assignee: Erick Erickson
>Priority: Minor
> Attachments: ASF.LICENSE.NOT.GRANTED--ComplexPhrase.zip, 
> ComplexPhrase-4.2.1.zip, ComplexPhrase-4.7.zip, ComplexPhrase.zip, 
> ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, 
> ComplexPhrase.zip, ComplexPhraseQueryParser.java, ComplexPhrase_solr_3.4.zip, 
> SOLR-1604-alternative.patch, SOLR-1604.patch, SOLR-1604.patch, 
> SOLR-1604.patch, SOLR-1604.patch, SOLR-1604.patch, SOLR-1604.patch, 
> SOLR1604.patch
>
>
> Solr Plugin for ComplexPhraseQueryParser (LUCENE-1486) which supports 
> wildcards, ORs, ranges, fuzzies inside phrase queries.




Re: Reducing the number of warnings in the codebase

2014-03-16 Thread Uwe Schindler

Those are the default rules in the jar file: all the so-called unsafe methods 
and classes. The list is immense. 

http://code.google.com/p/forbidden-apis/source/browse/trunk#trunk%2Fsrc%2Fmain%2Fresources%2Fde%2Fthetaphi%2Fforbiddenapis%2Fsignatures

See also the tool's homepage and its jar file.
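
For readers wondering why String.toUpperCase() and String.toLowerCase() without a Locale (the case raised in the quoted thread below) count as unsafe, here is a minimal sketch using only java.util.Locale. The class and method names are made up for illustration, and the Turkish locale is just one well-known example of a default locale that changes the result.

```java
import java.util.Locale;

class LocaleCaseSketch {

    /** Locale-independent upper-casing: the same result on every machine. */
    static String upperRoot(String s) {
        return s.toUpperCase(Locale.ROOT);
    }

    /** Upper-casing under an explicit locale, simulating what a default locale would do. */
    static String upperIn(String s, Locale locale) {
        return s.toUpperCase(locale);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");
        // Under Locale.ROOT, 'i' becomes the plain ASCII 'I'.
        System.out.println(upperRoot("title"));
        // Under a Turkish locale, 'i' becomes the dotted capital I (U+0130),
        // so the result is no longer equal to "TITLE".
        System.out.println(upperIn("title", turkish).equals("TITLE")); // false
    }
}
```

The no-argument overloads use the JVM's default locale, so the second behaviour can appear on a user's machine without any change to the code.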

Uwe

On 16 March 2014 21:53:46 CET, Ahmet Arslan  wrote:
>Hi Uwe,
>
>I looked for the definitions in the lucene/tools/forbiddenApis/*.txt files
>but couldn't find them.
>Where are those rules defined? I am wondering about the syntax; can
>you point me to it?
>
>Thanks,
>Ahmet
>
>
>
>On Sunday, March 16, 2014 10:40 PM, Uwe Schindler 
>wrote:
>> String.toUpperCase() and String.toLowerCase() without Locale. see :
>SOLR-
>> 2281 and LUCENE-2466
>
>Those are already detected by forbidden-apis.
>
>> Can ant precommit/forbidden-apis be used to detect above?
>> 
>> Ahmet
>> 
>> 
>> 
>> On Sunday, March 16, 2014 9:53 PM, Benson Margulies
>>  wrote:
>> Just because some tool expresses distaste, doesn't imply that
>everyone here
>> agrees that it's a problem we should fix.
>> 
>> In my experience, the default Sonar rulesets contain many things that
>people
>> here are prone to disagree with. Start with serialVersionUID:
>> do we care? Why would we care? In what cases do we really believe
>that a
>> sane person would be using Java serialization with a Lucene/Solr
>class?
>> 
>> Sonar can also be a bit cranky; it arranges for various tools to run
>via
>> mechanisms that sometimes conflict with the ways you might run them
>> yourself.
>> 
>> So I'd suggest a process like:
>> 
>> 1. Someone proposes a set of (e.g.) checkstyle rules to live by.
>> 2. That ruleset is refined by experiment.
>> 3. We make violations fail the build.
>> 
>> Then lather, rinse, repeat for other tools.
>> 
>> Once we have rulesets we agree are worth enforcing, we can look to
>Sonar
>> for a pretty way to visualize their results if we like.
>> 
>> 

--
Uwe Schindler
H.-H.-Meier-Allee 63, 28213 Bremen
http://www.thetaphi.de

Re: Reducing the number of warnings in the codebase

2014-03-16 Thread Shawn Heisey

On 3/16/2014 1:52 PM, Benson Margulies wrote:
> Just because some tool expresses distaste, doesn't imply that everyone
> here agrees that it's a problem we should fix.

I had most of a reply written before I saw Uwe's response.  He brings up
some good points that made me re-think important parts of what I was
saying.  So, I'll start over. :)

Thanks,
Shawn


[jira] [Comment Edited] (LUCENE-5376) Add a demo search server

2014-03-16 Thread Chris Male (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937331#comment-13937331
 ] 

Chris Male edited comment on LUCENE-5376 at 3/16/14 8:59 PM:
-

What's motivated the new branch?


was (Author: cmale):
Why's motivated the new branch?


[jira] [Commented] (LUCENE-5376) Add a demo search server

2014-03-16 Thread Chris Male (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937331#comment-13937331
 ] 

Chris Male commented on LUCENE-5376:


Why's motivated the new branch?

> Add a demo search server
> 
>
> Key: LUCENE-5376
> URL: https://issues.apache.org/jira/browse/LUCENE-5376
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Attachments: lucene-demo-server.tgz
>
>
> I think it'd be useful to have a "demo" search server for Lucene.
> Rather than being fully featured, like Solr, it would be minimal, just 
> wrapping the existing Lucene modules to show how you can make use of these 
> features in a server setting.
> The purpose is to demonstrate how one can build a minimal search server on 
> top of APIs like SearchManager, SearcherLifetimeManager, etc.
> This is also useful for finding rough edges / issues in Lucene's APIs that 
> make building a server unnecessarily hard.
> I don't think it should have back compatibility promises (except Lucene's 
> index back compatibility), so it's free to improve as Lucene's APIs change.
> As a starting point, I'll post what I built for the "eating your own dog 
> food" search app for Lucene's & Solr's jira issues 
> http://jirasearch.mikemccandless.com (blog: 
> http://blog.mikemccandless.com/2013/05/eating-dog-food-with-lucene.html ). It 
> uses Netty to expose basic indexing & searching APIs via JSON, but it's very 
> rough (lots nocommits).



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Re: Reducing the number of warnings in the codebase

2014-03-16 Thread Ahmet Arslan

Hi Uwe,

I looked for the definitions in the lucene/tools/forbiddenApis/*.txt files but 
couldn't find them.
Where are those rules defined? I am wondering about the syntax; can you point 
me to it?

Thanks,
Ahmet



On Sunday, March 16, 2014 10:40 PM, Uwe Schindler  wrote:
> String.toUpperCase() and String.toLowerCase() without Locale. see : SOLR-
> 2281 and LUCENE-2466

Those are already detected by forbidden-apis.

> Can ant precommit/forbidden-apis be used to detect above?
> 
> Ahmet
> 
> 
> 
> On Sunday, March 16, 2014 9:53 PM, Benson Margulies
>  wrote:
> Just because some tool expresses distaste, doesn't imply that everyone here
> agrees that it's a problem we should fix.
> 
> In my experience, the default Sonar rulesets contain many things that people
> here are prone to disagree with. Start with serialVersionUID:
> do we care? Why would we care? In what cases do we really believe that a
> sane person would be using Java serialization with a Lucene/Solr class?
> 
> Sonar can also be a bit cranky; it arranges for various tools to run via
> mechanisms that sometimes conflict with the ways you might run them
> yourself.
> 
> So I'd suggest a process like:
> 
> 1. Someone proposes a set of (e.g.) checkstyle rules to live by.
> 2. That ruleset is refined by experiment.
> 3. We make violations fail the build.
> 
> Then lather, rinse, repeat for other tools.
> 
> Once we have rulesets we agree are worth enforcing, we can look to Sonar
> for a pretty way to visualize their results if we like.
> 
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional
> commands, e-mail: dev-h...@lucene.apache.org

> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional
> commands, e-mail: dev-h...@lucene.apache.org


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



RE: Reducing the number of warnings in the codebase

2014-03-16 Thread Uwe Schindler

Hi,

> Just because some tool expresses distaste, doesn't imply that everyone here
> agrees that it's a problem we should fix.

Yes, that is my biggest problem. Lots of the Eclipse warnings are just bullshit 
given Lucene's code style and the way we do things. For example, it complains 
about a missing close() all the time, just because we use 
IOUtils.closeWhileHandlingException() for that.
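
To illustrate the pattern behind the close() warning, here is a minimal, self-contained sketch. The helper closeSuppressingExceptions and the Tracked class are hypothetical names made up for this example; they only imitate the idea of an IOUtils-style helper and are not Lucene's code.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class QuietCloseSketch {

    /**
     * Hypothetical helper (not Lucene's code): closes every resource and
     * swallows IOExceptions. Returns how many close() calls failed.
     */
    public static int closeSuppressingExceptions(Closeable... resources) {
        int failures = 0;
        for (Closeable resource : resources) {
            if (resource == null) {
                continue;
            }
            try {
                resource.close();
            } catch (IOException e) {
                failures++; // keep going so later resources are still closed
            }
        }
        return failures;
    }

    /** Test resource that records its name when closed and may fail on close. */
    static final class Tracked implements Closeable {
        private final String name;
        private final boolean failOnClose;
        private final List<String> log;

        Tracked(String name, boolean failOnClose, List<String> log) {
            this.name = name;
            this.failOnClose = failOnClose;
            this.log = log;
        }

        @Override
        public void close() throws IOException {
            log.add(name);
            if (failOnClose) {
                throw new IOException("cannot close " + name);
            }
        }
    }

    /** Opens two resources and closes them only through the helper. */
    public static List<String> demo() {
        List<String> log = new ArrayList<>();
        Closeable first = new Tracked("first", true, log);
        Closeable second = new Tracked("second", false, log);
        try {
            // ... work with the resources here ...
        } finally {
            // No direct first.close() or second.close() call is visible here,
            // which is the kind of code a static analyzer tends to flag.
            closeSuppressingExceptions(first, second);
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Both resources end up closed even though the first close() throws, yet a tool that only looks for a direct close() call on each variable cannot see that.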

> In my experience, the default Sonar rulesets contain many things that people
> here are prone to disagree with. Start with serialVersionUID:
> do we care? Why would we care? In what cases do we really believe that a
> sane person would be using Java serialization with a Lucene/Solr class?

We officially don't support serialization, so all warnings are useless. It's 
just Eclipse that complains for no reason.
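
For context on what the serialVersionUID warning is about, a small sketch in plain Java follows. The class names are invented for the example and have nothing to do with Lucene/Solr classes.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class SerialUidSketch {

    /** Declares the version explicitly, which silences the warning. */
    static class WithExplicitUid implements Serializable {
        private static final long serialVersionUID = 1L;
        int value;
    }

    /** No declaration: the runtime derives a value from the class shape. */
    static class WithoutExplicitUid implements Serializable {
        int value;
    }

    /** Returns the declared version of WithExplicitUid. */
    public static long explicitUid() {
        return ObjectStreamClass.lookup(WithExplicitUid.class).getSerialVersionUID();
    }

    /** Returns the version computed by the runtime for WithoutExplicitUid. */
    public static long computedUid() {
        return ObjectStreamClass.lookup(WithoutExplicitUid.class).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("explicit: " + explicitUid());
        System.out.println("computed: " + computedUid());
    }
}
```

The computed value only matters when serialized instances are exchanged between different versions of a class, which is why the warning is moot for code that does not support serialization at all.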

> Sonar can also be a bit cranky; it arranges for various tools to run via
> mechanisms that sometimes conflict with the ways you might run them
> yourself.
> 
> So I'd suggest a process like:
> 
> 1. Someone proposes a set of (e.g.) checkstyle rules to live by.
> 2. That ruleset is refined by experiment.
> 3. We make violations fail the build.
> 
> Then lather, rinse, repeat for other tools.

Yes, I agree. I am strongly against PMD or CheckStyle without our own rules. 
Forbidden-apis was invented because PMD and CheckStyle were too broken to 
detect default Locale/Charset/Timezone violations (and also because those tools 
are slow).
We should rather fix our Eclipse project generation to hide the warnings that 
are just wrong.

I would prefer that, before we fix warnings from 3rd party tools like Eclipse, 
we first fix only the warnings emitted by javac. The others are unimportant to 
me, and I don't want to fix those that are just wrong for our code style.

We already have ECJ in our build (to lint javadocs), so we can make some Eclipse 
warnings fatal through the ecj config file in our SVN and fail the build on 
them. I disagree with using PMD or Checkstyle; those tools are incomplete 
and broken, sorry.

Uwe

> Once we have rulesets we agree are worth enforcing, we can look to Sonar
> for a pretty way to visualize their results if we like.
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional
> commands, e-mail: dev-h...@lucene.apache.org


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Resolved] (LUCENE-5530) ComplexPhraseQueryParser throws ParseException for fielded queries

2014-03-16 Thread Erick Erickson (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-5530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved LUCENE-5530.


   Resolution: Fixed
Fix Version/s: 5.0

Thanks Ahmet!

>  ComplexPhraseQueryParser throws ParseException for fielded queries
> ---
>
> Key: LUCENE-5530
> URL: https://issues.apache.org/jira/browse/LUCENE-5530
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/queryparser
>Affects Versions: 4.7
>Reporter: Ahmet Arslan
>Assignee: Erick Erickson
>  Labels: complexPhrase
> Fix For: 4.8, 5.0
>
> Attachments: LUCENE-5530.patch, LUCENE-5530.patch
>
>
> Queries using QueryParser's non-default field e.g.
> author:"j* smith" are not supported by ComplexPhraseQueryParser. For example 
> following code snippet 
> {code}
> ComplexPhraseQueryParser qp = new 
> ComplexPhraseQueryParser(TEST_VERSION_CURRENT, "defaultField", new 
> MockAnalyzer(new Random()));
>   qp.parse("author:\"fred* smith\"") ;
> {code}
> yields 
> {noformat}
> Caused by: org.apache.lucene.queryparser.classic.ParseException: Cannot have 
> clause for field "defaultField" nested in phrase  for field "author"
>   at 
> org.apache.lucene.queryparser.complexPhrase.ComplexPhraseQueryParser.checkPhraseClauseIsForSameField(ComplexPhraseQueryParser.java:147)
>   at 
> org.apache.lucene.queryparser.complexPhrase.ComplexPhraseQueryParser.newTermQuery(ComplexPhraseQueryParser.java:135)
>   ... 49 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

RE: Reducing the number of warnings in the codebase

2014-03-16 Thread Uwe Schindler

> String.toUpperCase() and String.toLowerCase() without Locale. see : SOLR-
> 2281 and LUCENE-2466

Those are already detected by forbidden-apis.

> Can ant precommit/forbidden-apis be used to detect above?
> 
> Ahmet
> 
> 
> 
> On Sunday, March 16, 2014 9:53 PM, Benson Margulies
>  wrote:
> Just because some tool expresses distaste, doesn't imply that everyone here
> agrees that it's a problem we should fix.
> 
> In my experience, the default Sonar rulesets contain many things that people
> here are prone to disagree with. Start with serialVersionUID:
> do we care? Why would we care? In what cases do we really believe that a
> sane person would be using Java serialization with a Lucene/Solr class?
> 
> Sonar can also be a bit cranky; it arranges for various tools to run via
> mechanisms that sometimes conflict with the ways you might run them
> yourself.
> 
> So I'd suggest a process like:
> 
> 1. Someone proposes a set of (e.g.) checkstyle rules to live by.
> 2. That ruleset is refined by experiment.
> 3. We make violations fail the build.
> 
> Then lather, rinse, repeat for other tools.
> 
> Once we have rulesets we agree are worth enforcing, we can look to Sonar
> for a pretty way to visualize their results if we like.
> 
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional
> commands, e-mail: dev-h...@lucene.apache.org
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional
> commands, e-mail: dev-h...@lucene.apache.org


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-5530) ComplexPhraseQueryParser throws ParseException for fielded queries

2014-03-16 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937327#comment-13937327
 ] 

ASF subversion and git services commented on LUCENE-5530:
-

Commit 1578158 from [~erickoerickson] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1578158 ]

LUCENE-5530 Allow the ComplexPhraseQueryParser to search order or un-order 
proximity queries.

>  ComplexPhraseQueryParser throws ParseException for fielded queries
> ---
>
> Key: LUCENE-5530
> URL: https://issues.apache.org/jira/browse/LUCENE-5530
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/queryparser
>Affects Versions: 4.7
>Reporter: Ahmet Arslan
>Assignee: Erick Erickson
>  Labels: complexPhrase
> Fix For: 4.8
>
> Attachments: LUCENE-5530.patch, LUCENE-5530.patch
>
>
> Queries using QueryParser's non-default field e.g.
> author:"j* smith" are not supported by ComplexPhraseQueryParser. For example 
> following code snippet 
> {code}
> ComplexPhraseQueryParser qp = new 
> ComplexPhraseQueryParser(TEST_VERSION_CURRENT, "defaultField", new 
> MockAnalyzer(new Random()));
>   qp.parse("author:\"fred* smith\"") ;
> {code}
> yields 
> {noformat}
> Caused by: org.apache.lucene.queryparser.classic.ParseException: Cannot have 
> clause for field "defaultField" nested in phrase  for field "author"
>   at 
> org.apache.lucene.queryparser.complexPhrase.ComplexPhraseQueryParser.checkPhraseClauseIsForSameField(ComplexPhraseQueryParser.java:147)
>   at 
> org.apache.lucene.queryparser.complexPhrase.ComplexPhraseQueryParser.newTermQuery(ComplexPhraseQueryParser.java:135)
>   ... 49 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-5488) Fix up test failures for Analytics Component

2014-03-16 Thread Erick Erickson (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937326#comment-13937326
 ] 

Erick Erickson commented on SOLR-5488:
--

bq: I would highly discourage you from changing the patterns in the 
fieldFacets.txt. I had trouble keeping all of that stuff straight while writing 
it and I don't think that is where the issue is.

That's what I was worried about, although it seemed like interesting 
information that may shed light on what the _real_ issue is.

I disagree with the statement that these tests worked before. The whole point 
of the changes that have been made was because the tests _never_ worked 
consistently across all the test machine environments, which is why they were 
never merged into 4x. There would be random failures that have been going on 
for over three months. There haven't been any recently reported because the 
tests are ignored (see @BadApple in ExpressionTest and @Ignore in 
FieldFacetTest).

Ah well, I've got to go out now and won't get back to this for a bit.



> Fix up test failures for Analytics Component
> 
>
> Key: SOLR-5488
> URL: https://issues.apache.org/jira/browse/SOLR-5488
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.7, 5.0
>Reporter: Erick Erickson
>Assignee: Erick Erickson
> Attachments: SOLR-5488.patch, SOLR-5488.patch, SOLR-5488.patch, 
> SOLR-5488.patch, SOLR-5488.patch, SOLR-5488.patch, SOLR-5488.patch, eoe.errors
>
>
> The analytics component has a few test failures, perhaps 
> environment-dependent. This is just to collect the test fixes in one place 
> for convenience when we merge back into 4.x



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Re: Reducing the number of warnings in the codebase

2014-03-16 Thread Ahmet Arslan

Hi,

Here are some rules :

Calls to the following String methods where the left hand side is empty, i.e. the returned String is never assigned or used:
  String.replace()
  String.toUpperCase()
  String.toLowerCase()

  String.replaceFirst()
  String.trim()
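
A minimal sketch of the first rule, assuming the intent is to catch calls whose result is silently dropped. The class and method names are made up for the illustration.

```java
import java.util.Locale;

public class IgnoredResultSketch {

    /** Buggy: Strings are immutable, so both calls have no effect on input. */
    public static String buggy(String input) {
        input.trim();                    // result thrown away
        input.toUpperCase(Locale.ROOT);  // result thrown away
        return input;
    }

    /** Fixed: each result is assigned and used. */
    public static String fixed(String input) {
        String trimmed = input.trim();
        return trimmed.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println("[" + buggy("  ab ") + "]");
        System.out.println("[" + fixed("  ab ") + "]");
    }
}
```

The buggy variant compiles without complaint and returns its argument untouched, which is why a tool-enforced rule is attractive here.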

In test cases (subclasses of SolrTestCaseJ4), calls to the following methods 
that are not wrapped in assertU(). see : SOLR-5685 
  adoc()
  optimize()
  commit()

String.toUpperCase() and String.toLowerCase() without Locale. see : SOLR-2281 
and LUCENE-2466
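
A short sketch of why the Locale-less variants are a problem: the result depends on the default locale of the JVM. The example passes a Turkish locale explicitly to imitate a machine configured that way; the class name is invented for the illustration.

```java
import java.util.Locale;

public class LocaleCaseSketch {

    /** Lower-cases the way a JVM with a Turkish default locale would. */
    public static String lowerTurkish(String s) {
        return s.toLowerCase(Locale.forLanguageTag("tr-TR"));
    }

    /** Lower-cases independently of the machine's default locale. */
    public static String lowerRoot(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // In Turkish, 'I' lower-cases to the dotless 'ı' (U+0131).
        System.out.println(lowerTurkish("TITLE"));
        System.out.println(lowerRoot("TITLE"));
    }
}
```

A comparison such as lowerTurkish("TITLE").equals("title") is false, so code that lower-cases keywords or field names without a Locale can break on some users' machines only.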


Can ant precommit/forbidden-apis be used to detect above?

Ahmet



On Sunday, March 16, 2014 9:53 PM, Benson Margulies  
wrote:
Just because some tool expresses distaste, doesn't imply that everyone
here agrees that it's a problem we should fix.

In my experience, the default Sonar rulesets contain many things that
people here are prone to disagree with. Start with serialVersionUID:
do we care? Why would we care? In what cases do we really believe that
a sane person would be using Java serialization with a Lucene/Solr
class?

Sonar can also be a bit cranky; it arranges for various tools to run
via mechanisms that sometimes conflict with the ways you might run
them yourself.

So I'd suggest a process like:

1. Someone proposes a set of (e.g.) checkstyle rules to live by.
2. That ruleset is refined by experiment.
3. We make violations fail the build.

Then lather, rinse, repeat for other tools.

Once we have rulesets we agree are worth enforcing, we can look to
Sonar for a pretty way to visualize their results if we like.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org


[jira] [Created] (SOLR-5870) Admin UI - Reload on Core Admin doesn't show errors

2014-03-16 Thread Stefan Matheis (steffkes) (JIRA)

Stefan Matheis (steffkes) created SOLR-5870:
---

 Summary: Admin UI - Reload on Core Admin doesn't show errors
 Key: SOLR-5870
 URL: https://issues.apache.org/jira/browse/SOLR-5870
 Project: Solr
  Issue Type: Bug
  Components: web gui
Reporter: Stefan Matheis (steffkes)
Assignee: Stefan Matheis (steffkes)
Priority: Minor
 Fix For: 4.8, 5.0
 Attachments: SOLR-5870.patch

Christopher, a friend of mine, made me realize that the 'Reload' button on the 
Core-Admin screen doesn't show any errors caused by a core 
reload.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Updated] (SOLR-5870) Admin UI - Reload on Core Admin doesn't show errors

2014-03-16 Thread Stefan Matheis (steffkes) (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Matheis (steffkes) updated SOLR-5870:


Attachment: SOLR-5870.patch

> Admin UI - Reload on Core Admin doesn't show errors
> ---
>
> Key: SOLR-5870
> URL: https://issues.apache.org/jira/browse/SOLR-5870
> Project: Solr
>  Issue Type: Bug
>  Components: web gui
>Reporter: Stefan Matheis (steffkes)
>Assignee: Stefan Matheis (steffkes)
>Priority: Minor
> Fix For: 4.8, 5.0
>
> Attachments: SOLR-5870.patch
>
>
> Christopher, a friend of mine, made me realize that the 'Reload' button on the 
> Core-Admin screen doesn't show any errors caused by a core 
> reload.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Re: Reducing the number of warnings in the codebase

2014-03-16 Thread Benson Margulies

Just because some tool expresses distaste, doesn't imply that everyone
here agrees that it's a problem we should fix.

In my experience, the default Sonar rulesets contain many things that
people here are prone to disagree with. Start with serialVersionUID:
do we care? Why would we care? In what cases do we really believe that
a sane person would be using Java serialization with a Lucene/Solr
class?

Sonar can also be a bit cranky; it arranges for various tools to run
via mechanisms that sometimes conflict with the ways you might run
them yourself.

So I'd suggest a process like:

1. Someone proposes a set of (e.g.) checkstyle rules to live by.
2. That ruleset is refined by experiment.
3. We make violations fail the build.

Then lather, rinse, repeat for other tools.

Once we have rulesets we agree are worth enforcing, we can look to
Sonar for a pretty way to visualize their results if we like.

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-5530) ComplexPhraseQueryParser throws ParseException for fielded queries

2014-03-16 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937312#comment-13937312
 ] 

ASF subversion and git services commented on LUCENE-5530:
-

Commit 1578148 from [~erickoerickson] in branch 'dev/trunk'
[ https://svn.apache.org/r1578148 ]

LUCENE-5530 Allow the ComplexPhraseQueryParser to search order or un-order 
proximity queries.

>  ComplexPhraseQueryParser throws ParseException for fielded queries
> ---
>
> Key: LUCENE-5530
> URL: https://issues.apache.org/jira/browse/LUCENE-5530
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/queryparser
>Affects Versions: 4.7
>Reporter: Ahmet Arslan
>Assignee: Erick Erickson
>  Labels: complexPhrase
> Fix For: 4.8
>
> Attachments: LUCENE-5530.patch, LUCENE-5530.patch
>
>
> Queries using QueryParser's non-default field e.g.
> author:"j* smith" are not supported by ComplexPhraseQueryParser. For example 
> following code snippet 
> {code}
> ComplexPhraseQueryParser qp = new 
> ComplexPhraseQueryParser(TEST_VERSION_CURRENT, "defaultField", new 
> MockAnalyzer(new Random()));
>   qp.parse("author:\"fred* smith\"") ;
> {code}
> yields 
> {noformat}
> Caused by: org.apache.lucene.queryparser.classic.ParseException: Cannot have 
> clause for field "defaultField" nested in phrase  for field "author"
>   at 
> org.apache.lucene.queryparser.complexPhrase.ComplexPhraseQueryParser.checkPhraseClauseIsForSameField(ComplexPhraseQueryParser.java:147)
>   at 
> org.apache.lucene.queryparser.complexPhrase.ComplexPhraseQueryParser.newTermQuery(ComplexPhraseQueryParser.java:135)
>   ... 49 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Re: Reducing the number of warnings in the codebase

2014-03-16 Thread Shawn Heisey

On 3/16/2014 12:26 PM, Shawn Heisey wrote:
> Would it be too much administrative @#!* to create an umbrella issue?
> I'd suggest LUCENE-5130 for this purpose, except that I'm not 100%
> positive that failing the build is the right answer.  I fully understand
> the motivation ... it would certainly force us to face the issue!
> 
> A bunch of smaller issues could be created to tackle subsections of the
> code, or perhaps to tackle a particular type of warning.  This really
> doesn't change how invasive the patches would be, but if they come in
> smaller chunks, it might be easier to work around them.
> 
> When it comes to warnings about things like missing serialVersionUID,
> should we generate a random number for each class, or use a default value?

A further idea:  We could limit this cleanup to trunk.  I foresee three
main effects, none of which seems like a bad thing to me:

* We don't risk breaking the stable branch.
* The cleanup might reveal actual bugs or clearly broken code.
* Backporting gets harder, pushing us closer to the 5.0 release.

Thanks,
Shawn


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-5376) Add a demo search server

2014-03-16 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937302#comment-13937302
 ] 

ASF subversion and git services commented on LUCENE-5376:
-

Commit 1578144 from [~mikemccand] in branch 'dev/branches/lucene5376_2'
[ https://svn.apache.org/r1578144 ]

LUCENE-5376: merge trunk

> Add a demo search server
> 
>
> Key: LUCENE-5376
> URL: https://issues.apache.org/jira/browse/LUCENE-5376
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Attachments: lucene-demo-server.tgz
>
>
> I think it'd be useful to have a "demo" search server for Lucene.
> Rather than being fully featured, like Solr, it would be minimal, just 
> wrapping the existing Lucene modules to show how you can make use of these 
> features in a server setting.
> The purpose is to demonstrate how one can build a minimal search server on 
> top of APIs like SearchManager, SearcherLifetimeManager, etc.
> This is also useful for finding rough edges / issues in Lucene's APIs that 
> make building a server unnecessarily hard.
> I don't think it should have back compatibility promises (except Lucene's 
> index back compatibility), so it's free to improve as Lucene's APIs change.
> As a starting point, I'll post what I built for the "eating your own dog 
> food" search app for Lucene's & Solr's jira issues 
> http://jirasearch.mikemccandless.com (blog: 
> http://blog.mikemccandless.com/2013/05/eating-dog-food-with-lucene.html ). It 
> uses Netty to expose basic indexing & searching APIs via JSON, but it's very 
> rough (lots nocommits).



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #1130: POMs out of sync

2014-03-16 Thread Apache Jenkins Server

Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/1130/

1 tests failed.
REGRESSION:  org.apache.solr.cloud.OverseerTest.testOverseerFailure

Error Message:
Could not register as the leader because creating the ephemeral registration 
node in ZooKeeper failed

Stack Trace:
org.apache.solr.common.SolrException: Could not register as the leader because 
creating the ephemeral registration node in ZooKeeper failed
at org.apache.zookeeper.KeeperException.create(KeeperException.java:119)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
at 
org.apache.solr.common.cloud.SolrZkClient$10.execute(SolrZkClient.java:431)
at 
org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:73)
at 
org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:428)
at 
org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:385)
at 
org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:372)
at 
org.apache.solr.cloud.ShardLeaderElectionContextBase$1.execute(ElectionContext.java:127)
at 
org.apache.solr.common.util.RetryUtil.retryOnThrowable(RetryUtil.java:31)
at 
org.apache.solr.cloud.ShardLeaderElectionContextBase.runLeaderProcess(ElectionContext.java:122)
at 
org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:164)
at 
org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:108)
at 
org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:156)
at 
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:289)
at 
org.apache.solr.cloud.OverseerTest$MockZKController.publishState(OverseerTest.java:155)
at 
org.apache.solr.cloud.OverseerTest.testOverseerFailure(OverseerTest.java:666)




Build Log:
[...truncated 53247 lines...]
BUILD FAILED
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/build.xml:490: 
The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/build.xml:182: 
The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/extra-targets.xml:77:
 Java returned: 1

Total time: 146 minutes 39 seconds
Build step 'Invoke Ant' marked build as failure
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-5488) Fix up test failures for Analytics Component

2014-03-16 Thread Houston Putman (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937298#comment-13937298
 ] 

Houston Putman commented on SOLR-5488:
--

*** First Problem ***
In fieldFacets.txt, did you change it like this:
turning 'o.min.ff=string_sd' into 'o.min.ff=int_id' and 'o.min.ff=date_dtd' 
into 'o.min.ff=long_ld'?

If so, that doesn't really make sense, because these are calculating the minimum 
integer value faceting over the string values, and the minimum long value 
faceting over the date values. If you made the changes above, then you would be 
calculating the minimum int value faceting over all of the int values, etc. So 
the change isn't a good test even if it passes.
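
To make "minimum integer value faceting over the string values" concrete, here is a rough sketch in plain Java. It is only an illustration of the idea and not the analytics component's code; the class, method, and sample data are made up.

```java
import java.util.Map;
import java.util.TreeMap;

public class MinFacetSketch {

    /**
     * Groups documents by their string field (the facet) and keeps the
     * minimum int field seen in each group. A TreeMap keeps the groups
     * sorted by the facet value.
     */
    public static Map<String, Integer> minIntByString(String[] stringField, int[] intField) {
        Map<String, Integer> minima = new TreeMap<>();
        for (int doc = 0; doc < stringField.length; doc++) {
            Integer current = minima.get(stringField[doc]);
            if (current == null || intField[doc] < current) {
                minima.put(stringField[doc], intField[doc]);
            }
        }
        return minima;
    }

    public static void main(String[] args) {
        String[] strings = {"b", "a", "b", "a"};
        int[] ints = {5, 7, 3, 9};
        System.out.println(minIntByString(strings, ints));
    }
}
```

Faceting the int field over itself would instead put every distinct int into its own group, so each "minimum" would simply be the group key, and the statistic would no longer be exercised.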
*

*** Second Problem ***
So the ordering is somewhat important. Solr, and the analytics component, 
automatically sort the results of a field facet by the field being faceted on. I 
assumed this when writing the tests, so if the results are coming back 
misordered, then either the test or the analytics component is splitting up the 
values incorrectly, or the sorting gets messed up somewhere in the 
test. I don't think the sorting is an issue in the component.

Looking through the test code, a lot has been changed from what I wrote. The 
part that I don't recognize at all is the parsing of the response through 
methods like getDoubleList() etc. in AbstractAnalyticsFacetTest.java. Since the 
tests worked before those methods were changed, I would suggest looking at 
those parsing methods first. 

(Side note: the methods in FieldFacetTest.java that still have FacetAsc, like 
medianFacetAscTest and sumOfSquaresFacetAscTest, should be renamed without the 
FacetAsc part so that they are named like the rest of the methods. The 
FacetAsc functionality was taken out a while ago. So the methods mentioned 
above should be renamed to medianTest and sumOfSquaresTest, respectively, along 
with the similarly named methods.)
*

I would highly discourage you from changing the patterns in the 
fieldFacets.txt. I had trouble keeping all of that stuff straight while writing 
it and I don't think that is where the issue is. 

I'm not able to run the tests right now, so that's all of the help I can give.

> Fix up test failures for Analytics Component
> 
>
> Key: SOLR-5488
> URL: https://issues.apache.org/jira/browse/SOLR-5488
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.7, 5.0
>Reporter: Erick Erickson
>Assignee: Erick Erickson
> Attachments: SOLR-5488.patch, SOLR-5488.patch, SOLR-5488.patch, 
> SOLR-5488.patch, SOLR-5488.patch, SOLR-5488.patch, SOLR-5488.patch, eoe.errors
>
>
> The analytics component has a few test failures, perhaps 
> environment-dependent. This is just to collect the test fixes in one place 
> for convenience when we merge back into 4.x



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Resolved] (LUCENE-1486) Wildcards, ORs etc inside Phrase queries

2014-03-16 Thread Erick Erickson (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved LUCENE-1486.


Resolution: Fixed

The re-opens were from 2009. This stuff has been in Lucene for some time, and 
the comment "Reopening so we don't forget to do this one" makes me think this 
should have been closed a long time ago.

NOTE: we're also doing more work with this in the 4.8 time frame, thus it's 
getting some attention now.

> Wildcards, ORs etc inside Phrase queries
> 
>
> Key: LUCENE-1486
> URL: https://issues.apache.org/jira/browse/LUCENE-1486
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/queryparser
>Affects Versions: 2.4
>Reporter: Mark Harwood
>Assignee: Erick Erickson
>Priority: Minor
> Fix For: 4.8
>
> Attachments: ComplexPhraseQueryParser.java, LUCENE-1486.patch, 
> LUCENE-1486.patch, LUCENE-1486.patch, LUCENE-1486.patch, LUCENE-1486.patch, 
> LUCENE-1486.patch, LUCENE-1486.patch, Lucene-1486 non default field.patch, 
> TestComplexPhraseQuery.java, junit_complex_phrase_qp_07_21_2009.patch, 
> junit_complex_phrase_qp_07_22_2009.patch
>
>
> An extension to the default QueryParser that overrides the parsing of 
> PhraseQueries to allow more complex syntax e.g. wildcards in phrase queries.
> The implementation feels a little hacky - this is arguably better handled in 
> QueryParser itself. This works as a proof of concept  for much of the query 
> parser syntax. Examples from the Junit test include:
>   checkMatches("\"j*   smyth~\"", "1,2"); // wildcards and fuzzies are OK in phrases
>   checkMatches("\"(jo* -john)  smith\"", "2"); // boolean logic works
>   checkMatches("\"jo*  smith\"~2", "1,2,3"); // position logic works
>
>   checkBadQuery("\"jo*  id:1 smith\""); // mixing fields in a phrase is bad
>   checkBadQuery("\"jo* \"smith\" \""); // phrases inside phrases are bad
>   checkBadQuery("\"jo* [sma TO smZ]\" \""); // range queries inside phrases are not supported
> Code plus Junit test to follow...



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Resolved] (SOLR-1480) SpellCheck in the same index

2014-03-16 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-1480.
-

   Resolution: Won't Fix
Fix Version/s: (was: 4.8)

We have DirectSolrSpellChecker now via LUCENE-2507

> SpellCheck in the same index
> 
>
> Key: SOLR-1480
> URL: https://issues.apache.org/jira/browse/SOLR-1480
> Project: Solr
>  Issue Type: New Feature
>  Components: spellchecker
>Reporter: Shalin Shekhar Mangar
>
> There is really no reason why spell check has to be done through a separate 
> index. In most cases the spell check index is built from a Solr field. With a 
> few configured dynamic fields, an UpdateRequestProcessor and a 
> SearchComponent, spell checking can be done from the main Solr index.
> This eliminates the spellcheck build phases and spellcheck can get to use the 
> Java replication for free.




[jira] [Resolved] (SOLR-828) A RequestProcessor to support updates

2014-03-16 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-828.


   Resolution: Won't Fix
Fix Version/s: (was: 4.8)

I think this is redundant now that we have atomic updates via stored fields and 
transaction logs.

> A RequestProcessor to support updates
> -
>
> Key: SOLR-828
> URL: https://issues.apache.org/jira/browse/SOLR-828
> Project: Solr
>  Issue Type: New Feature
>Reporter: Noble Paul
>
> This is the same as SOLR-139. A new issue is opened so that the UpdateProcessor 
> approach is highlighted and we can easily focus on that solution.




[jira] [Assigned] (LUCENE-5530) ComplexPhraseQueryParser throws ParseException for fielded queries

2014-03-16 Thread Erick Erickson (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-5530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson reassigned LUCENE-5530:
--

Assignee: Erick Erickson

>  ComplexPhraseQueryParser throws ParseException for fielded queries
> ---
>
> Key: LUCENE-5530
> URL: https://issues.apache.org/jira/browse/LUCENE-5530
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/queryparser
>Affects Versions: 4.7
>Reporter: Ahmet Arslan
>Assignee: Erick Erickson
>  Labels: complexPhrase
> Fix For: 4.8
>
> Attachments: LUCENE-5530.patch, LUCENE-5530.patch
>
>
> Queries using QueryParser's non-default field, e.g.
> author:"j* smith", are not supported by ComplexPhraseQueryParser. For example, 
> the following code snippet 
> {code}
> ComplexPhraseQueryParser qp = new ComplexPhraseQueryParser(
>     TEST_VERSION_CURRENT, "defaultField", new MockAnalyzer(new Random()));
> qp.parse("author:\"fred* smith\"");
> {code}
> yields 
> {noformat}
> Caused by: org.apache.lucene.queryparser.classic.ParseException: Cannot have 
> clause for field "defaultField" nested in phrase  for field "author"
>   at 
> org.apache.lucene.queryparser.complexPhrase.ComplexPhraseQueryParser.checkPhraseClauseIsForSameField(ComplexPhraseQueryParser.java:147)
>   at 
> org.apache.lucene.queryparser.complexPhrase.ComplexPhraseQueryParser.newTermQuery(ComplexPhraseQueryParser.java:135)
>   ... 49 more
> {noformat}




[jira] [Commented] (LUCENE-3758) Allow the ComplexPhraseQueryParser to search order or un-order proximity queries.

2014-03-16 Thread Dmitry Kan (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937293#comment-13937293
 ] 

Dmitry Kan commented on LUCENE-3758:


[~erickerickson] right, agree, this should be handled in another jira as a 
local param. We have implemented this as an operator as we allow mixing ordered 
and unordered clauses in the same query.

> Allow the ComplexPhraseQueryParser to search order or un-order proximity 
> queries.
> -
>
> Key: LUCENE-3758
> URL: https://issues.apache.org/jira/browse/LUCENE-3758
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/queryparser
>Affects Versions: 4.0-ALPHA
>Reporter: Tomás Fernández Löbbe
>Assignee: Erick Erickson
>Priority: Minor
> Fix For: 4.8, 5.0
>
> Attachments: LUCENE-3758.patch, LUCENE-3758.patch, LUCENE-3758.patch
>
>
> The ComplexPhraseQueryParser uses SpanNearQuery, but always hardcodes the 
> "inOrder" value to "true". This could be configurable.




[jira] [Updated] (SOLR-1604) Wildcards, ORs etc inside Phrase Queries

2014-03-16 Thread Ahmet Arslan (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Arslan updated SOLR-1604:
---

Attachment: SOLR1604.patch

QueryEqualityTest added.

> Wildcards, ORs etc inside Phrase Queries
> 
>
> Key: SOLR-1604
> URL: https://issues.apache.org/jira/browse/SOLR-1604
> Project: Solr
>  Issue Type: Improvement
>  Components: query parsers, search
>Affects Versions: 1.4
>Reporter: Ahmet Arslan
>Assignee: Erick Erickson
>Priority: Minor
> Attachments: ASF.LICENSE.NOT.GRANTED--ComplexPhrase.zip, 
> ComplexPhrase-4.2.1.zip, ComplexPhrase-4.7.zip, ComplexPhrase.zip, 
> ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, 
> ComplexPhrase.zip, ComplexPhraseQueryParser.java, ComplexPhrase_solr_3.4.zip, 
> SOLR-1604-alternative.patch, SOLR-1604.patch, SOLR-1604.patch, 
> SOLR-1604.patch, SOLR-1604.patch, SOLR-1604.patch, SOLR-1604.patch, 
> SOLR1604.patch
>
>
> Solr Plugin for ComplexPhraseQueryParser (LUCENE-1486) which supports 
> wildcards, ORs, ranges, fuzzies inside phrase queries.




[jira] [Updated] (LUCENE-5530) ComplexPhraseQueryParser throws ParseException for fielded queries

2014-03-16 Thread Erick Erickson (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-5530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated LUCENE-5530:
---

Attachment: (was: LUCENE-5530.patch)

>  ComplexPhraseQueryParser throws ParseException for fielded queries
> ---
>
> Key: LUCENE-5530
> URL: https://issues.apache.org/jira/browse/LUCENE-5530
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/queryparser
>Affects Versions: 4.7
>Reporter: Ahmet Arslan
>  Labels: complexPhrase
> Fix For: 4.8
>
> Attachments: LUCENE-5530.patch, LUCENE-5530.patch
>
>
> Queries using QueryParser's non-default field, e.g.
> author:"j* smith", are not supported by ComplexPhraseQueryParser. For example, 
> the following code snippet 
> {code}
> ComplexPhraseQueryParser qp = new ComplexPhraseQueryParser(
>     TEST_VERSION_CURRENT, "defaultField", new MockAnalyzer(new Random()));
> qp.parse("author:\"fred* smith\"");
> {code}
> yields 
> {noformat}
> Caused by: org.apache.lucene.queryparser.classic.ParseException: Cannot have 
> clause for field "defaultField" nested in phrase  for field "author"
>   at 
> org.apache.lucene.queryparser.complexPhrase.ComplexPhraseQueryParser.checkPhraseClauseIsForSameField(ComplexPhraseQueryParser.java:147)
>   at 
> org.apache.lucene.queryparser.complexPhrase.ComplexPhraseQueryParser.newTermQuery(ComplexPhraseQueryParser.java:135)
>   ... 49 more
> {noformat}




[jira] [Updated] (LUCENE-5530) ComplexPhraseQueryParser throws ParseException for fielded queries

2014-03-16 Thread Erick Erickson (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-5530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated LUCENE-5530:
---

Attachment: LUCENE-5530.patch

Tomas' patch from LUCENE-1486 so we can start iterating.

>  ComplexPhraseQueryParser throws ParseException for fielded queries
> ---
>
> Key: LUCENE-5530
> URL: https://issues.apache.org/jira/browse/LUCENE-5530
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/queryparser
>Affects Versions: 4.7
>Reporter: Ahmet Arslan
>  Labels: complexPhrase
> Fix For: 4.8
>
> Attachments: LUCENE-5530.patch, LUCENE-5530.patch
>
>
> Queries using QueryParser's non-default field, e.g.
> author:"j* smith", are not supported by ComplexPhraseQueryParser. For example, 
> the following code snippet 
> {code}
> ComplexPhraseQueryParser qp = new ComplexPhraseQueryParser(
>     TEST_VERSION_CURRENT, "defaultField", new MockAnalyzer(new Random()));
> qp.parse("author:\"fred* smith\"");
> {code}
> yields 
> {noformat}
> Caused by: org.apache.lucene.queryparser.classic.ParseException: Cannot have 
> clause for field "defaultField" nested in phrase  for field "author"
>   at 
> org.apache.lucene.queryparser.complexPhrase.ComplexPhraseQueryParser.checkPhraseClauseIsForSameField(ComplexPhraseQueryParser.java:147)
>   at 
> org.apache.lucene.queryparser.complexPhrase.ComplexPhraseQueryParser.newTermQuery(ComplexPhraseQueryParser.java:135)
>   ... 49 more
> {noformat}




[jira] [Commented] (LUCENE-5530) ComplexPhraseQueryParser throws ParseException for fielded queries

2014-03-16 Thread Erick Erickson (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937270#comment-13937270
 ] 

Erick Erickson commented on LUCENE-5530:


Bah! Ahmet is way ahead of me! Removed the patch I just uploaded to reduce 
confusion.


>  ComplexPhraseQueryParser throws ParseException for fielded queries
> ---
>
> Key: LUCENE-5530
> URL: https://issues.apache.org/jira/browse/LUCENE-5530
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/queryparser
>Affects Versions: 4.7
>Reporter: Ahmet Arslan
>  Labels: complexPhrase
> Fix For: 4.8
>
> Attachments: LUCENE-5530.patch, LUCENE-5530.patch
>
>
> Queries using QueryParser's non-default field, e.g.
> author:"j* smith", are not supported by ComplexPhraseQueryParser. For example, 
> the following code snippet 
> {code}
> ComplexPhraseQueryParser qp = new ComplexPhraseQueryParser(
>     TEST_VERSION_CURRENT, "defaultField", new MockAnalyzer(new Random()));
> qp.parse("author:\"fred* smith\"");
> {code}
> yields 
> {noformat}
> Caused by: org.apache.lucene.queryparser.classic.ParseException: Cannot have 
> clause for field "defaultField" nested in phrase  for field "author"
>   at 
> org.apache.lucene.queryparser.complexPhrase.ComplexPhraseQueryParser.checkPhraseClauseIsForSameField(ComplexPhraseQueryParser.java:147)
>   at 
> org.apache.lucene.queryparser.complexPhrase.ComplexPhraseQueryParser.newTermQuery(ComplexPhraseQueryParser.java:135)
>   ... 49 more
> {noformat}




[jira] [Closed] (LUCENE-5531) Allow ComplexPhraseQuery to accept fields

2014-03-16 Thread Erick Erickson (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson closed LUCENE-5531.
--

Resolution: Duplicate

Ahmet and I created these at the same time

> Allow ComplexPhraseQuery to accept fields
> -
>
> Key: LUCENE-5531
> URL: https://issues.apache.org/jira/browse/LUCENE-5531
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/queryparser
>Affects Versions: 4.8, 5.0
>Reporter: Erick Erickson
>Assignee: Erick Erickson
> Fix For: 4.8, 5.0
>
> Attachments: LUCENE-5531.patch
>
>
> Breaking out a patch created by Tomas Fernandez Lobbe from LUCENE-1486 so 
> we can track this better.




[jira] [Updated] (LUCENE-5530) ComplexPhraseQueryParser throws ParseException for fielded queries

2014-03-16 Thread Ahmet Arslan (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-5530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Arslan updated LUCENE-5530:
-

Attachment: LUCENE-5530.patch

Remove role:"de*" type queries from the test case. A single term inside quotes 
is somewhat meaningless.

>  ComplexPhraseQueryParser throws ParseException for fielded queries
> ---
>
> Key: LUCENE-5530
> URL: https://issues.apache.org/jira/browse/LUCENE-5530
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/queryparser
>Affects Versions: 4.7
>Reporter: Ahmet Arslan
>  Labels: complexPhrase
> Fix For: 4.8
>
> Attachments: LUCENE-5530.patch, LUCENE-5530.patch
>
>
> Queries using QueryParser's non-default field, e.g.
> author:"j* smith", are not supported by ComplexPhraseQueryParser. For example, 
> the following code snippet 
> {code}
> ComplexPhraseQueryParser qp = new ComplexPhraseQueryParser(
>     TEST_VERSION_CURRENT, "defaultField", new MockAnalyzer(new Random()));
> qp.parse("author:\"fred* smith\"");
> {code}
> yields 
> {noformat}
> Caused by: org.apache.lucene.queryparser.classic.ParseException: Cannot have 
> clause for field "defaultField" nested in phrase  for field "author"
>   at 
> org.apache.lucene.queryparser.complexPhrase.ComplexPhraseQueryParser.checkPhraseClauseIsForSameField(ComplexPhraseQueryParser.java:147)
>   at 
> org.apache.lucene.queryparser.complexPhrase.ComplexPhraseQueryParser.newTermQuery(ComplexPhraseQueryParser.java:135)
>   ... 49 more
> {noformat}




[jira] [Updated] (LUCENE-5531) Allow ComplexPhraseQuery to accept fields

2014-03-16 Thread Erick Erickson (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated LUCENE-5531:
---

Attachment: LUCENE-5531.patch

Tomas' patch from LUCENE-1486 so we can look at it separately and track it.

> Allow ComplexPhraseQuery to accept fields
> -
>
> Key: LUCENE-5531
> URL: https://issues.apache.org/jira/browse/LUCENE-5531
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/queryparser
>Affects Versions: 4.8, 5.0
>Reporter: Erick Erickson
>Assignee: Erick Erickson
> Fix For: 4.8, 5.0
>
> Attachments: LUCENE-5531.patch
>
>
> Breaking out a patch created by Tomas Fernandez Lobbe from LUCENE-1486 so 
> we can track this better.




[jira] [Created] (LUCENE-5531) Allow ComplexPhraseQuery to accept fields

2014-03-16 Thread Erick Erickson (JIRA)

Erick Erickson created LUCENE-5531:
--

 Summary: Allow ComplexPhraseQuery to accept fields
 Key: LUCENE-5531
 URL: https://issues.apache.org/jira/browse/LUCENE-5531
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/queryparser
Affects Versions: 4.8, 5.0
Reporter: Erick Erickson
Assignee: Erick Erickson
 Fix For: 4.8, 5.0


Breaking out a patch created by Tomas Fernandez Lobbe from LUCENE-1486 so we 
can track this better.




[jira] [Commented] (SOLR-5488) Fix up test failures for Analytics Component

2014-03-16 Thread Erick Erickson (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937255#comment-13937255
 ] 

Erick Erickson commented on SOLR-5488:
--

re: SOLR-5685 and all of a sudden FieldFacetTest started breaking.

[~sbower] and [~houstonputman] I need a quick reply to this to make progress:

I got past the first problem by changing the fieldFacets.txt, i.e. 

o.min.ff=int_id <- o.min.ff=string_sd
o.min.ff=long_ld <- o.min.ff=date_dtd

o.max.ff=int_id <- o.max.ff=string_sd
o.max.ff=long_ld <- o.max.ff=date_dtd

I just noticed that it looked odd, and when I changed it I got past the first 
problem, but this is making changes without real understanding. There are other 
places with the older pattern like this one that don't seem to break the test 
in BeforeClass so it makes me nervous:

o.count.s.str=count(string_sd)
o.count.s.date=count(date_dtd)
o.count.ff=int_id
o.count.ff=long_ld

So I'm not sure whether the changes I made are just irrelevant or perhaps mask 
something completely different.

**
Second problem:

Everything else is failing, things like:

Expected :[25.0, 26.0, 28.5, 27.0, 22.5, 23.5, 24.5, 25.5, 26.5, 27.5, 24.0]
Actual  :[25.0, 26.0, 27.0, 22.5, 23.5, 24.5, 25.5, 26.5, 27.5, 28.5, 24.0]

Same numbers, just not in the same order. This one from:
at 
org.apache.solr.analytics.facet.FieldFacetTest.medianFacetAscTest(FieldFacetTest.java:561)

So is ordering important here, or should the values be sorted before passing to
assertEquals? As you can tell, I have no real deep-level understanding of what 
this code is _supposed_ to do, which makes it difficult to have confidence if I 
sorted the expected and actual before passing to assertEquals or used  ;).

I can make the changes above if I have some clue that I'm not doing something 
foolish. That is, change all the patterns in the fieldFacets.txt and change the 
collections passed in to assertEquals to be sorted or some such.
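
A minimal sketch of the order-insensitive comparison being considered, using the
values from the medianFacetAscTest failure quoted above. The class and the helper
name sameValuesIgnoringOrder are hypothetical and not part of the analytics
tests. Whether ignoring order is actually valid for a test of an ascending facet
sort is the open question; this only shows the mechanics.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SortedCompareSketch {
    // Hypothetical helper: true when both lists hold the same values, ignoring order.
    // Works on copies so the caller's lists are left untouched.
    public static boolean sameValuesIgnoringOrder(List<Double> expected, List<Double> actual) {
        List<Double> e = new ArrayList<>(expected);
        List<Double> a = new ArrayList<>(actual);
        Collections.sort(e);
        Collections.sort(a);
        return e.equals(a);
    }

    public static void main(String[] args) {
        List<Double> expected = Arrays.asList(
            25.0, 26.0, 28.5, 27.0, 22.5, 23.5, 24.5, 25.5, 26.5, 27.5, 24.0);
        List<Double> actual = Arrays.asList(
            25.0, 26.0, 27.0, 22.5, 23.5, 24.5, 25.5, 26.5, 27.5, 28.5, 24.0);
        System.out.println(expected.equals(actual));                    // false: order differs
        System.out.println(sameValuesIgnoringOrder(expected, actual));  // true: same values
    }
}
```

Sorting both sides makes the assertion pass for any permutation, so it would also
hide a genuine ordering bug if the facet order is part of what the test checks.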

Thanks!


> Fix up test failures for Analytics Component
> 
>
> Key: SOLR-5488
> URL: https://issues.apache.org/jira/browse/SOLR-5488
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.7, 5.0
>Reporter: Erick Erickson
>Assignee: Erick Erickson
> Attachments: SOLR-5488.patch, SOLR-5488.patch, SOLR-5488.patch, 
> SOLR-5488.patch, SOLR-5488.patch, SOLR-5488.patch, SOLR-5488.patch, eoe.errors
>
>
> The analytics component has a few test failures, perhaps 
> environment-dependent. This is just to collect the test fixes in one place 
> for convenience when we merge back into 4.x




Re: Reducing the number of warnings in the codebase

2014-03-16 Thread Shawn Heisey

On 3/16/2014 5:35 AM, Michael McCandless wrote:
> On Sun, Mar 16, 2014 at 6:09 AM, Furkan KAMACI  wrote:
> 
>> I've run FindBugs for Lucene/Solr project. If you use Intellij IDEA you can
>> group the warnings according to their importance. I've opened issues and
>> attached patches for top level warnings/errors (and some others) that
>> FindBugs found.
>>
>> On the other hand I have another suggestion for Lucene/Solr project. When I
>> develop or lead projects I use Sonar. It's so good and it runs really nice
>> open source projects to analyze your code. FindBugs, PMD, Jacoco are just
>> some of them. It also calculates the method complexities, LoC and etc. You
>> can see a live example from here:
>> https://sonar.springsource.org/dashboard/index/4824
> 
> +1, Sonar looks really nice!
> 
>> I can be volunteer to integrate Sonar into Lucene/Solr project.
> 
> Thank you Furkan.

I haven't yet looked, but the *idea* of Sonar sounds quite awesome.  +1
for me on a thank you, Furkan.

The "ides of march" spam from Jira on version 4.7 had one benefit
relating to this discussion: It put LUCENE-5130 on my radar.

Would it be too much administrative @#!* to create an umbrella issue?
I'd suggest LUCENE-5130 for this purpose, except that I'm not 100%
positive that failing the build is the right answer.  I fully understand
the motivation ... it would certainly force us to face the issue!

A bunch of smaller issues could be created to tackle subsections of the
code, or perhaps to tackle a particular type of warning.  This really
doesn't change how invasive the patches would be, but if they come in
smaller chunks, it might be easier to work around them.

When it comes to warnings about things like missing serialVersionUID,
should we generate a random number for each class, or use a default value?
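
For reference, the two options in that question look like this in code. This is
only a sketch: the class names are hypothetical and the second value is an
arbitrary made-up number, not the output of any generator.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class SerialVersionSketch {
    // Option 1: a fixed default value shared by convention.
    public static class WithDefault implements Serializable {
        private static final long serialVersionUID = 1L;
        int value;
    }

    // Option 2: an arbitrary per-class value (this number is made up for illustration).
    public static class WithArbitrary implements Serializable {
        private static final long serialVersionUID = 4973128401156233L;
        int value;
    }

    // Reads back the serialVersionUID that Java serialization will use for a class.
    public static long declaredUid(Class<?> c) {
        return ObjectStreamClass.lookup(c).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(declaredUid(WithDefault.class));
        System.out.println(declaredUid(WithArbitrary.class));
    }
}
```

Either form silences the warning. What matters for serialization compatibility
is that the declared value stays the same across compatible versions of a class,
not which number was picked initially.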

Thanks,
Shawn


[jira] [Resolved] (LUCENE-3758) Allow the ComplexPhraseQueryParser to search order or un-order proximity queries.

2014-03-16 Thread Erick Erickson (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-3758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved LUCENE-3758.


   Resolution: Fixed
Fix Version/s: 5.0

Thanks Ahmet!

> Allow the ComplexPhraseQueryParser to search order or un-order proximity 
> queries.
> -
>
> Key: LUCENE-3758
> URL: https://issues.apache.org/jira/browse/LUCENE-3758
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/queryparser
>Affects Versions: 4.0-ALPHA
>Reporter: Tomás Fernández Löbbe
>Assignee: Erick Erickson
>Priority: Minor
> Fix For: 4.8, 5.0
>
> Attachments: LUCENE-3758.patch, LUCENE-3758.patch, LUCENE-3758.patch
>
>
> The ComplexPhraseQueryParser uses SpanNearQuery, but always hardcodes the 
> "inOrder" value to "true". This could be configurable.




[jira] [Commented] (LUCENE-3578) TestSort testParallelMultiSort reproducible seed failure

2014-03-16 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937251#comment-13937251
 ] 

ASF subversion and git services commented on LUCENE-3578:
-

Commit 1578134 from [~erickoerickson] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1578134 ]

Fix for LUCENE-3578, the ability to specify order for complex phrase queries

> TestSort testParallelMultiSort reproducible seed failure
> 
>
> Key: LUCENE-3578
> URL: https://issues.apache.org/jira/browse/LUCENE-3578
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: selckin
>Assignee: Michael McCandless
> Fix For: 4.0-ALPHA
>
>
> trunk r1202157
> {code}
> [junit] Testsuite: org.apache.lucene.search.TestSort
> [junit] Tests run: 1, Failures: 1, Errors: 0, Time elapsed: 0.978 sec
> [junit] 
> [junit] - Standard Error -
> [junit] NOTE: reproduce with: ant test -Dtestcase=TestSort 
> -Dtestmethod=testParallelMultiSort 
> -Dtests.seed=-2996f3e0f5d118c2:32c8e62dd9611f63:7a90f44586ae8263 
> -Dargs="-Dfile.encoding=UTF-8"
> [junit] WARNING: test method: 'testParallelMultiSort' left thread 
> running: Thread[pool-1-thread-1,5,main]
> [junit] WARNING: test method: 'testParallelMultiSort' left thread 
> running: Thread[pool-1-thread-2,5,main]
> [junit] WARNING: test method: 'testParallelMultiSort' left thread 
> running: Thread[pool-1-thread-3,5,main]
> [junit] NOTE: test params are: codec=Lucene40: 
> {short=Lucene40(minBlockSize=98 maxBlockSize=214), 
> contents=PostingsFormat(name=MockSep), byte=PostingsFormat(name=SimpleText), 
> int=Pulsing40(freqCutoff=4 minBlockSize=58 maxBlockSize=186), 
> string=PostingsFormat(name=NestedPulsing), i18n=Lucene40(minBlockSize=98 
> maxBlockSize=214), long=PostingsFormat(name=Memory), 
> double=Pulsing40(freqCutoff=4 minBlockSize=58 maxBlockSize=186), 
> parser=MockVariableIntBlock(baseBlockSize=88), float=Lucene40(minBlockSize=98 
> maxBlockSize=214), custom=PostingsFormat(name=MockRandom)}, 
> sim=RandomSimilarityProvider(queryNorm=false,coord=false): 
> {short=BM25(k1=1.2,b=0.75), tracer=DFR I(ne)B2, byte=DFR I(ne)B3(800.0), 
> contents=IB LL-LZ(0.3), int=DFR I(n)BZ(0.3), string=IB LL-D3(800.0), i18n=DFR 
> GB2, double=DFR I(ne)B2, long=DFR GB1, parser=DFR GL2, 
> float=BM25(k1=1.2,b=0.75), custom=DFR I(ne)Z(0.3)}, locale=ga_IE, 
> timezone=America/Louisville
> [junit] NOTE: all tests run in this JVM:
> [junit] [TestSort]
> [junit] NOTE: Linux 3.0.6-gentoo amd64/Sun Microsystems Inc. 1.6.0_29 
> (64-bit)/cpus=8,threads=4,free=78022136,total=125632512
> [junit] -  ---
> [junit] Testcase: 
> testParallelMultiSort(org.apache.lucene.search.TestSort): FAILED
> [junit] expected:<[ZJ]I> but was:<[JZ]I>
> [junit] junit.framework.AssertionFailedError: expected:<[ZJ]I> but 
> was:<[JZ]I>
> [junit] at 
> org.apache.lucene.search.TestSort.assertMatches(TestSort.java:1245)
> [junit] at 
> org.apache.lucene.search.TestSort.assertMatches(TestSort.java:1216)
> [junit] at 
> org.apache.lucene.search.TestSort.runMultiSorts(TestSort.java:1202)
> [junit] at 
> org.apache.lucene.search.TestSort.testParallelMultiSort(TestSort.java:855)
> [junit] at 
> org.apache.lucene.util.LuceneTestCase$3$1.evaluate(LuceneTestCase.java:523)
> [junit] at 
> org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:149)
> [junit] at 
> org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:51)
> [junit] 
> [junit] 
> [junit] Test org.apache.lucene.search.TestSort FAILED
> {code}




[jira] [Commented] (LUCENE-3758) Allow the ComplexPhraseQueryParser to search order or un-order proximity queries.

2014-03-16 Thread Erick Erickson (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937252#comment-13937252
 ] 

Erick Erickson commented on LUCENE-3758:


Fixed:
trunk: r - 1578113
4x: r - 1578134

> Allow the ComplexPhraseQueryParser to search order or un-order proximity 
> queries.
> -
>
> Key: LUCENE-3758
> URL: https://issues.apache.org/jira/browse/LUCENE-3758
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/queryparser
>Affects Versions: 4.0-ALPHA
>Reporter: Tomás Fernández Löbbe
>Assignee: Erick Erickson
>Priority: Minor
> Fix For: 4.8
>
> Attachments: LUCENE-3758.patch, LUCENE-3758.patch, LUCENE-3758.patch
>
>
> The ComplexPhraseQueryParser uses SpanNearQuery, but always hardcodes the 
> "inOrder" value to "true". This could be configurable.




[jira] [Commented] (SOLR-5674) The rows improvement for QueryComponet

2014-03-16 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937246#comment-13937246
 ] 

Shalin Shekhar Mangar commented on SOLR-5674:
-

I haven't gone through the patch yet but is SOLR-5463 useful for you?

> The rows improvement for QueryComponet
> --
>
> Key: SOLR-5674
> URL: https://issues.apache.org/jira/browse/SOLR-5674
> Project: Solr
>  Issue Type: Bug
>  Components: contrib - Clustering
>Affects Versions: 4.3.1, 4.5.1, 4.6
> Environment: JVM7 
>Reporter: Raintung Li
>  Labels: QueryComponet, rows
> Attachments: SOLR-5674.txt
>
>
> For Solr rows issues:
> 1. Solr doesn't provide an API to get the full result set, so customers 
> usually set rows to Integer.MAX_VALUE to try to get all results, which causes 
> another issue: the OOM issue in 
> SOLR-5661 (https://issues.apache.org/jira/browse/SOLR-5661).
> How about opening the API for rows=-1, meaning "return full results"? 
> Sometimes the result count will be very big and cause a heap OOM, but usually 
> we can advise the customer to make sure the result set is really small before 
> calling this API. We don't want to make a second call to get the full 
> results, i.e. one call to get the total count, then a second call with rows 
> set to that total.
> 2. A small improvement: the results returned by every shard node are already 
> ordered, so the first shard's list can be added to the PriorityQueue without 
> comparing again, only filtering on the same unique id.
> 3. Create the PriorityQueue after checking all shard return sizes, which 
> avoids unnecessary memory cost, especially for very large rows values.
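
To illustrate the general idea behind points 2 and 3, here is a sketch of a k-way
merge over already-sorted shard lists. It is plain Java and an assumption about
one way to exploit pre-sorted input, not the attached patch and not Solr's
QueryComponent code; Hit, Cursor and mergeTopIds are hypothetical names. The
queue holds one cursor per shard, so its size depends on the number of shards
rather than on rows, and repeated unique ids are filtered as they surface.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Set;

public class ShardMergeSketch {
    // Hypothetical result entry: a unique id plus a score.
    public static final class Hit {
        final String id;
        final double score;
        public Hit(String id, double score) { this.id = id; this.score = score; }
    }

    // Cursor over one shard's list, which is assumed sorted by score descending.
    static final class Cursor {
        final List<Hit> hits;
        int pos;
        Cursor(List<Hit> hits) { this.hits = hits; }
        Hit current() { return hits.get(pos); }
    }

    // Merges the shard lists and returns up to `rows` unique ids, best score first.
    public static List<String> mergeTopIds(List<List<Hit>> shards, int rows) {
        // Queue capacity follows the shard count, not rows (capacity must be >= 1).
        PriorityQueue<Cursor> queue = new PriorityQueue<>(
            Math.max(1, shards.size()),
            (a, b) -> Double.compare(b.current().score, a.current().score));
        for (List<Hit> shard : shards) {
            if (!shard.isEmpty()) queue.add(new Cursor(shard));
        }
        Set<String> seen = new HashSet<>();
        List<String> out = new ArrayList<>();
        while (!queue.isEmpty() && out.size() < rows) {
            Cursor c = queue.poll();                 // shard whose head has the best score
            Hit h = c.current();
            if (seen.add(h.id)) out.add(h.id);       // skip ids already emitted
            c.pos++;
            if (c.pos < c.hits.size()) queue.add(c); // re-queue with its next hit
        }
        return out;
    }
}
```

With a very large rows value this keeps the queue small, although the output
list itself still grows with the number of documents returned.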




[jira] [Commented] (LUCENE-1486) Wildcards, ORs etc inside Phrase queries

2014-03-16 Thread Ahmet Arslan (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937245#comment-13937245
 ] 

Ahmet Arslan commented on LUCENE-1486:
--

One last thing that might come out of this jira is [~terje_eggestad]'s 
[comment|https://issues.apache.org/jira/browse/LUCENE-1486?focusedCommentId=12900278&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12900278]
 and his fix. However, I couldn't reproduce the problem he reported with new 
MockAnalyzer(random()). This problem could be analyzer-specific.

> Wildcards, ORs etc inside Phrase queries
> 
>
> Key: LUCENE-1486
> URL: https://issues.apache.org/jira/browse/LUCENE-1486
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/queryparser
>Affects Versions: 2.4
>Reporter: Mark Harwood
>Assignee: Erick Erickson
>Priority: Minor
> Fix For: 4.8
>
> Attachments: ComplexPhraseQueryParser.java, LUCENE-1486.patch, 
> LUCENE-1486.patch, LUCENE-1486.patch, LUCENE-1486.patch, LUCENE-1486.patch, 
> LUCENE-1486.patch, LUCENE-1486.patch, Lucene-1486 non default field.patch, 
> TestComplexPhraseQuery.java, junit_complex_phrase_qp_07_21_2009.patch, 
> junit_complex_phrase_qp_07_22_2009.patch
>
>
> An extension to the default QueryParser that overrides the parsing of 
> PhraseQueries to allow more complex syntax e.g. wildcards in phrase queries.
> The implementation feels a little hacky - this is arguably better handled in 
> QueryParser itself. This works as a proof of concept  for much of the query 
> parser syntax. Examples from the Junit test include:
>   checkMatches("\"j*   smyth~\"", "1,2"); // wildcards and fuzzies are OK in phrases
>   checkMatches("\"(jo* -john)  smith\"", "2"); // boolean logic works
>   checkMatches("\"jo*  smith\"~2", "1,2,3"); // position logic works
>
>   checkBadQuery("\"jo*  id:1 smith\""); // mixing fields in a phrase is bad
>   checkBadQuery("\"jo* \"smith\" \""); // phrases inside phrases is bad
>   checkBadQuery("\"jo* [sma TO smZ]\" \""); // range queries inside phrases not supported
> Code plus Junit test to follow...



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-5376) Add a demo search server

2014-03-16 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937244#comment-13937244
 ] 

ASF subversion and git services commented on LUCENE-5376:
-

Commit 1578133 from [~mikemccand] in branch 'dev/branches/lucene5376_2'
[ https://svn.apache.org/r1578133 ]

LUCENE-5376: carry over last branch

> Add a demo search server
> 
>
> Key: LUCENE-5376
> URL: https://issues.apache.org/jira/browse/LUCENE-5376
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Attachments: lucene-demo-server.tgz
>
>
> I think it'd be useful to have a "demo" search server for Lucene.
> Rather than being fully featured, like Solr, it would be minimal, just 
> wrapping the existing Lucene modules to show how you can make use of these 
> features in a server setting.
> The purpose is to demonstrate how one can build a minimal search server on 
> top of APIs like SearchManager, SearcherLifetimeManager, etc.
> This is also useful for finding rough edges / issues in Lucene's APIs that 
> make building a server unnecessarily hard.
> I don't think it should have back compatibility promises (except Lucene's 
> index back compatibility), so it's free to improve as Lucene's APIs change.
> As a starting point, I'll post what I built for the "eating your own dog 
> food" search app for Lucene's & Solr's jira issues 
> http://jirasearch.mikemccandless.com (blog: 
> http://blog.mikemccandless.com/2013/05/eating-dog-food-with-lucene.html ). It 
> uses Netty to expose basic indexing & searching APIs via JSON, but it's very 
> rough (lots nocommits).




[jira] [Commented] (LUCENE-5376) Add a demo search server

2014-03-16 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937242#comment-13937242
 ] 

ASF subversion and git services commented on LUCENE-5376:
-

Commit 1578132 from [~mikemccand] in branch 'dev/branches/lucene5376_2'
[ https://svn.apache.org/r1578132 ]

LUCENE-5376: make new branch

> Add a demo search server
> 
>
> Key: LUCENE-5376
> URL: https://issues.apache.org/jira/browse/LUCENE-5376
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Attachments: lucene-demo-server.tgz
>
>
> I think it'd be useful to have a "demo" search server for Lucene.
> Rather than being fully featured, like Solr, it would be minimal, just 
> wrapping the existing Lucene modules to show how you can make use of these 
> features in a server setting.
> The purpose is to demonstrate how one can build a minimal search server on 
> top of APIs like SearchManager, SearcherLifetimeManager, etc.
> This is also useful for finding rough edges / issues in Lucene's APIs that 
> make building a server unnecessarily hard.
> I don't think it should have back compatibility promises (except Lucene's 
> index back compatibility), so it's free to improve as Lucene's APIs change.
> As a starting point, I'll post what I built for the "eating your own dog 
> food" search app for Lucene's & Solr's jira issues 
> http://jirasearch.mikemccandless.com (blog: 
> http://blog.mikemccandless.com/2013/05/eating-dog-food-with-lucene.html ). It 
> uses Netty to expose basic indexing & searching APIs via JSON, but it's very 
> rough (lots nocommits).




[jira] [Assigned] (SOLR-4777) Handle SliceState in the Admin UI

2014-03-16 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-4777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar reassigned SOLR-4777:
---

Assignee: Shalin Shekhar Mangar

> Handle SliceState in the Admin UI
> -
>
> Key: SOLR-4777
> URL: https://issues.apache.org/jira/browse/SOLR-4777
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud, web gui
>Affects Versions: 4.3
>Reporter: Anshum Gupta
>Assignee: Shalin Shekhar Mangar
> Fix For: 4.8
>
> Attachments: SOLR-4777.patch, SOLR-4777.patch
>
>
> The Solr admin UI as of now does not take Slice state into account.
> We need to have that differentiated.
> There are four states:
> # The default is "active"
> # "construction" (used during shard splitting for new sub shards),
> # 'recovery' (state is changed from construction to recovery once split is 
> complete and we are waiting for sub-shard replicas to recover from their 
> respective leaders), and
> # "inactive" - the parent shard is set to this state after split is complete
> A slice/shard which is "inactive" will not accept traffic (i.e. it will 
> re-route traffic to sub shards) even though the nodes inside this shard show 
> up as green.
> We should show the "inactive" shards in a different color to highlight this 
> behavior.
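
The states listed in the description can be modelled roughly as follows. This is only a sketch in plain Java: the enum `SliceStateSketch`, the method `fromString` and the colour names are assumptions made for illustration, and none of them are taken from Solr's Slice class or from the admin UI code.

```java
import java.util.Locale;

// Hypothetical model of the four states listed above. The enum, its method
// names and the colours are assumptions, not Solr's Slice API or admin UI code.
enum SliceStateSketch {
    ACTIVE("green"),
    CONSTRUCTION("yellow"),
    RECOVERY("orange"),
    INACTIVE("gray");

    private final String color;

    SliceStateSketch(String color) {
        this.color = color;
    }

    // Colour a UI could use to tell the states apart.
    String color() {
        return color;
    }

    // Parses a state name; a missing state falls back to the default "active".
    static SliceStateSketch fromString(String state) {
        if (state == null || state.isEmpty()) {
            return ACTIVE;
        }
        return valueOf(state.toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        for (String s : new String[] {null, "construction", "recovery", "inactive"}) {
            System.out.println(s + " -> " + fromString(s).color());
        }
    }
}
```

The point of the sketch is only that an "inactive" slice gets its own colour even when its nodes are healthy; which colours the UI should really use is left open by the issue.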




[jira] [Assigned] (SOLR-5498) Allow DIH to report its state to ZooKeeper

2014-03-16 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-5498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar reassigned SOLR-5498:
---

Assignee: Shalin Shekhar Mangar

> Allow DIH to report its state to ZooKeeper
> --
>
> Key: SOLR-5498
> URL: https://issues.apache.org/jira/browse/SOLR-5498
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - DataImportHandler
>Affects Versions: 4.5
>Reporter: Rafał Kuć
>Assignee: Shalin Shekhar Mangar
>Priority: Minor
> Fix For: 4.8
>
> Attachments: SOLR-5498.patch, SOLR-5498_version.patch
>
>
> I thought it would be good for DIH to be fully controllable by Solr in 
> SolrCloud, so that when one instance fails another could be started 
> automatically, and so on. This issue is the first small step there: it makes 
> SolrCloud report DIH state to ZooKeeper once it is started and remove its 
> state once it is stopped or the indexing job has failed. In non-cloud mode 
> that functionality is not used. 




[jira] [Commented] (LUCENE-1486) Wildcards, ORs etc inside Phrase queries

2014-03-16 Thread Ahmet Arslan (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937235#comment-13937235
 ] 

Ahmet Arslan commented on LUCENE-1486:
--

One more thing: [~tomasflobbe] reported a problem in his 
[comment|https://issues.apache.org/jira/browse/LUCENE-1486?focusedCommentId=13202409&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13202409]
 and provided a solution too. 
It's about queries like "fred*", where there is only one term inside the quotes.  

> Wildcards, ORs etc inside Phrase queries
> 
>
> Key: LUCENE-1486
> URL: https://issues.apache.org/jira/browse/LUCENE-1486
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/queryparser
>Affects Versions: 2.4
>Reporter: Mark Harwood
>Assignee: Erick Erickson
>Priority: Minor
> Fix For: 4.8
>
> Attachments: ComplexPhraseQueryParser.java, LUCENE-1486.patch, 
> LUCENE-1486.patch, LUCENE-1486.patch, LUCENE-1486.patch, LUCENE-1486.patch, 
> LUCENE-1486.patch, LUCENE-1486.patch, Lucene-1486 non default field.patch, 
> TestComplexPhraseQuery.java, junit_complex_phrase_qp_07_21_2009.patch, 
> junit_complex_phrase_qp_07_22_2009.patch
>
>
> An extension to the default QueryParser that overrides the parsing of 
> PhraseQueries to allow more complex syntax e.g. wildcards in phrase queries.
> The implementation feels a little hacky - this is arguably better handled in 
> QueryParser itself. This works as a proof of concept  for much of the query 
> parser syntax. Examples from the Junit test include:
>   checkMatches("\"j*   smyth~\"", "1,2"); //wildcards and fuzzies 
> are OK in phrases
>   checkMatches("\"(jo* -john)  smith\"", "2"); // boolean logic 
> works
>   checkMatches("\"jo*  smith\"~2", "1,2,3"); // position logic 
> works.
>   
>   checkBadQuery("\"jo*  id:1 smith\""); //mixing fields in a 
> phrase is bad
>   checkBadQuery("\"jo* \"smith\" \""); //phrases inside phrases 
> is bad
>   checkBadQuery("\"jo* [sma TO smZ]\" \""); //range queries 
> inside phrases not supported
> Code plus Junit test to follow...




[jira] [Commented] (LUCENE-1486) Wildcards, ORs etc inside Phrase queries

2014-03-16 Thread Ahmet Arslan (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937228#comment-13937228
 ] 

Ahmet Arslan commented on LUCENE-1486:
--

bq. What about the stopwords bit? yet another JIRA?
There is no patch/solution for that in ComplexPhraseQueryParser. Tim says 
about the topic: 

bq. The root of this problem is that SpanNearQuery has no good way to handle 
stopwords in a way analogous to PhraseQuery.

I suggested that [~nikhil500] use a modified StopwordFilter (I sent the filter 
to him off-list) that does not remove the given stop words but instead 
replaces each of them with an impossible token: 
"the" => "ImpossibleToken"
"a" => "ImpossibleToken"
"for" => "ImpossibleToken"

I think we don't need a jira for this functionality, but we can document it as 
a limitation together with this workaround.
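
The idea behind the workaround can be sketched without any Lucene dependency. The snippet below is a minimal illustration in plain Java and is not the filter that was sent off-list: the class name `StopwordMasker`, the method `mask` and the lower-casing step are assumptions. It keeps every token position and swaps stop words for the placeholder, so the remaining terms keep their original positions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Lucene: it only illustrates the idea of
// replacing stop words instead of removing them.
final class StopwordMasker {
    // Placeholder from the comment above; assumed never to occur in real text.
    static final String IMPOSSIBLE_TOKEN = "ImpossibleToken";

    // Returns a new list of the same length where each stop word is replaced
    // by the placeholder, so later token positions do not shift.
    static List<String> mask(List<String> tokens, Set<String> stopWords) {
        List<String> out = new ArrayList<>(tokens.size());
        for (String token : tokens) {
            boolean isStop = stopWords.contains(token.toLowerCase(Locale.ROOT));
            out.add(isStop ? IMPOSSIBLE_TOKEN : token);
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> stopWords = Set.of("the", "a", "for");
        System.out.println(mask(List.of("search", "for", "the", "answer"), stopWords));
    }
}
```

A real implementation would presumably be a Lucene TokenFilter applied at both index and query time; that wiring is not shown here.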

> Wildcards, ORs etc inside Phrase queries
> 
>
> Key: LUCENE-1486
> URL: https://issues.apache.org/jira/browse/LUCENE-1486
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/queryparser
>Affects Versions: 2.4
>Reporter: Mark Harwood
>Assignee: Erick Erickson
>Priority: Minor
> Fix For: 4.8
>
> Attachments: ComplexPhraseQueryParser.java, LUCENE-1486.patch, 
> LUCENE-1486.patch, LUCENE-1486.patch, LUCENE-1486.patch, LUCENE-1486.patch, 
> LUCENE-1486.patch, LUCENE-1486.patch, Lucene-1486 non default field.patch, 
> TestComplexPhraseQuery.java, junit_complex_phrase_qp_07_21_2009.patch, 
> junit_complex_phrase_qp_07_22_2009.patch
>
>
> An extension to the default QueryParser that overrides the parsing of 
> PhraseQueries to allow more complex syntax e.g. wildcards in phrase queries.
> The implementation feels a little hacky - this is arguably better handled in 
> QueryParser itself. This works as a proof of concept  for much of the query 
> parser syntax. Examples from the Junit test include:
>   checkMatches("\"j*   smyth~\"", "1,2"); //wildcards and fuzzies 
> are OK in phrases
>   checkMatches("\"(jo* -john)  smith\"", "2"); // boolean logic 
> works
>   checkMatches("\"jo*  smith\"~2", "1,2,3"); // position logic 
> works.
>   
>   checkBadQuery("\"jo*  id:1 smith\""); //mixing fields in a 
> phrase is bad
>   checkBadQuery("\"jo* \"smith\" \""); //phrases inside phrases 
> is bad
>   checkBadQuery("\"jo* [sma TO smZ]\" \""); //range queries 
> inside phrases not supported
> Code plus Junit test to follow...




[jira] [Commented] (SOLR-4834) Surround QParser should enable query text analysis

2014-03-16 Thread Ahmet Arslan (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-4834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937223#comment-13937223
 ] 

Ahmet Arslan commented on SOLR-4834:


Hey [~isaachebsh], did you check LUCENE-5205? [~paul.elsc...@xs4all.nl] says:
bq. I think this has a lot more possibilities than the surround parser. So much 
more that this might actually replace the surround parser.

> Surround QParser should enable query text analysis
> --
>
> Key: SOLR-4834
> URL: https://issues.apache.org/jira/browse/SOLR-4834
> Project: Solr
>  Issue Type: Improvement
>  Components: query parsers
>Affects Versions: 4.3
>Reporter: Isaac Hebsh
>  Labels: analysis, qparserplugin, surround
> Fix For: 4.8
>
>
> When using surround query parser, the query terms are not being analyzed. The 
> basic example is lower case, of course. This is probably an intended 
> behaviour, not a bug.
> I suggest one more query parameter, which determines whether to do analysis 
> or not. something like this:
> {code}
> _query_:"{!surround df=myfield analyze=true}SpinPoint 7n GB18030"
> {code}




[jira] [Updated] (SOLR-1604) Wildcards, ORs etc inside Phrase Queries

2014-03-16 Thread Erick Erickson (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-1604:
-

Attachment: SOLR-1604.patch

Added entry to CHANGES.txt for Solr

> Wildcards, ORs etc inside Phrase Queries
> 
>
> Key: SOLR-1604
> URL: https://issues.apache.org/jira/browse/SOLR-1604
> Project: Solr
>  Issue Type: Improvement
>  Components: query parsers, search
>Affects Versions: 1.4
>Reporter: Ahmet Arslan
>Assignee: Erick Erickson
>Priority: Minor
> Attachments: ASF.LICENSE.NOT.GRANTED--ComplexPhrase.zip, 
> ComplexPhrase-4.2.1.zip, ComplexPhrase-4.7.zip, ComplexPhrase.zip, 
> ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, 
> ComplexPhrase.zip, ComplexPhraseQueryParser.java, ComplexPhrase_solr_3.4.zip, 
> SOLR-1604-alternative.patch, SOLR-1604.patch, SOLR-1604.patch, 
> SOLR-1604.patch, SOLR-1604.patch, SOLR-1604.patch, SOLR-1604.patch
>
>
> Solr Plugin for ComplexPhraseQueryParser (LUCENE-1486) which supports 
> wildcards, ORs, ranges, fuzzies inside phrase queries.




[jira] [Commented] (LUCENE-3578) TestSort testParallelMultiSort reproducible seed failure

2014-03-16 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937214#comment-13937214
 ] 

ASF subversion and git services commented on LUCENE-3578:
-

Commit 1578113 from [~erickoerickson] in branch 'dev/trunk'
[ https://svn.apache.org/r1578113 ]

Fix for LUCENE-3578, the ability to specify order for complex phrase queries

> TestSort testParallelMultiSort reproducible seed failure
> 
>
> Key: LUCENE-3578
> URL: https://issues.apache.org/jira/browse/LUCENE-3578
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: selckin
>Assignee: Michael McCandless
> Fix For: 4.0-ALPHA
>
>
> trunk r1202157
> {code}
> [junit] Testsuite: org.apache.lucene.search.TestSort
> [junit] Tests run: 1, Failures: 1, Errors: 0, Time elapsed: 0.978 sec
> [junit] 
> [junit] - Standard Error -
> [junit] NOTE: reproduce with: ant test -Dtestcase=TestSort 
> -Dtestmethod=testParallelMultiSort 
> -Dtests.seed=-2996f3e0f5d118c2:32c8e62dd9611f63:7a90f44586ae8263 
> -Dargs="-Dfile.encoding=UTF-8"
> [junit] WARNING: test method: 'testParallelMultiSort' left thread 
> running: Thread[pool-1-thread-1,5,main]
> [junit] WARNING: test method: 'testParallelMultiSort' left thread 
> running: Thread[pool-1-thread-2,5,main]
> [junit] WARNING: test method: 'testParallelMultiSort' left thread 
> running: Thread[pool-1-thread-3,5,main]
> [junit] NOTE: test params are: codec=Lucene40: 
> {short=Lucene40(minBlockSize=98 maxBlockSize=214), 
> contents=PostingsFormat(name=MockSep), byte=PostingsFormat(name=SimpleText), 
> int=Pulsing40(freqCutoff=4 minBlockSize=58 maxBlockSize=186), 
> string=PostingsFormat(name=NestedPulsing), i18n=Lucene40(minBlockSize=98 
> maxBlockSize=214), long=PostingsFormat(name=Memory), 
> double=Pulsing40(freqCutoff=4 minBlockSize=58 maxBlockSize=186), 
> parser=MockVariableIntBlock(baseBlockSize=88), float=Lucene40(minBlockSize=98 
> maxBlockSize=214), custom=PostingsFormat(name=MockRandom)}, 
> sim=RandomSimilarityProvider(queryNorm=false,coord=false): 
> {short=BM25(k1=1.2,b=0.75), tracer=DFR I(ne)B2, byte=DFR I(ne)B3(800.0), 
> contents=IB LL-LZ(0.3), int=DFR I(n)BZ(0.3), string=IB LL-D3(800.0), i18n=DFR 
> GB2, double=DFR I(ne)B2, long=DFR GB1, parser=DFR GL2, 
> float=BM25(k1=1.2,b=0.75), custom=DFR I(ne)Z(0.3)}, locale=ga_IE, 
> timezone=America/Louisville
> [junit] NOTE: all tests run in this JVM:
> [junit] [TestSort]
> [junit] NOTE: Linux 3.0.6-gentoo amd64/Sun Microsystems Inc. 1.6.0_29 
> (64-bit)/cpus=8,threads=4,free=78022136,total=125632512
> [junit] -  ---
> [junit] Testcase: 
> testParallelMultiSort(org.apache.lucene.search.TestSort): FAILED
> [junit] expected:<[ZJ]I> but was:<[JZ]I>
> [junit] junit.framework.AssertionFailedError: expected:<[ZJ]I> but 
> was:<[JZ]I>
> [junit] at 
> org.apache.lucene.search.TestSort.assertMatches(TestSort.java:1245)
> [junit] at 
> org.apache.lucene.search.TestSort.assertMatches(TestSort.java:1216)
> [junit] at 
> org.apache.lucene.search.TestSort.runMultiSorts(TestSort.java:1202)
> [junit] at 
> org.apache.lucene.search.TestSort.testParallelMultiSort(TestSort.java:855)
> [junit] at 
> org.apache.lucene.util.LuceneTestCase$3$1.evaluate(LuceneTestCase.java:523)
> [junit] at 
> org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:149)
> [junit] at 
> org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:51)
> [junit] 
> [junit] 
> [junit] Test org.apache.lucene.search.TestSort FAILED
> {code}




[jira] [Updated] (LUCENE-5530) ComplexPhraseQueryParser throws ParseException for fielded queries

2014-03-16 Thread Ahmet Arslan (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-5530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Arslan updated LUCENE-5530:
-

Attachment: LUCENE-5530.patch

Add fielded query support by changing the visibility of "field" in the 
QueryParserBase class from "package-private" to "protected".

>  ComplexPhraseQueryParser throws ParseException for fielded queries
> ---
>
> Key: LUCENE-5530
> URL: https://issues.apache.org/jira/browse/LUCENE-5530
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/queryparser
>Affects Versions: 4.7
>Reporter: Ahmet Arslan
>  Labels: complexPhrase
> Fix For: 4.8
>
> Attachments: LUCENE-5530.patch
>
>
> Queries using a non-default field of QueryParser, e.g. 
> author:"j* smith", are not supported by ComplexPhraseQueryParser. For example, 
> the following code snippet 
> {code}
> ComplexPhraseQueryParser qp = new ComplexPhraseQueryParser(
>     TEST_VERSION_CURRENT, "defaultField", new MockAnalyzer(new Random()));
> qp.parse("author:\"fred* smith\"");
> {code}
> yields 
> {noformat}
> Caused by: org.apache.lucene.queryparser.classic.ParseException: Cannot have clause for field "defaultField" nested in phrase  for field "author"
>   at org.apache.lucene.queryparser.complexPhrase.ComplexPhraseQueryParser.checkPhraseClauseIsForSameField(ComplexPhraseQueryParser.java:147)
>   at org.apache.lucene.queryparser.complexPhrase.ComplexPhraseQueryParser.newTermQuery(ComplexPhraseQueryParser.java:135)
>   ... 49 more
> {noformat}




[jira] [Updated] (LUCENE-3758) Allow the ComplexPhraseQueryParser to search order or un-order proximity queries.

2014-03-16 Thread Erick Erickson (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-3758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated LUCENE-3758:
---

Attachment: LUCENE-3758.patch

Ahmet's patch plus entry in CHANGES.txt

> Allow the ComplexPhraseQueryParser to search order or un-order proximity 
> queries.
> -
>
> Key: LUCENE-3758
> URL: https://issues.apache.org/jira/browse/LUCENE-3758
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/queryparser
>Affects Versions: 4.0-ALPHA
>Reporter: Tomás Fernández Löbbe
>Assignee: Erick Erickson
>Priority: Minor
> Fix For: 4.8
>
> Attachments: LUCENE-3758.patch, LUCENE-3758.patch, LUCENE-3758.patch
>
>
> The ComplexPhraseQueryParser uses SpanNearQuery, but always sets the "inOrder" 
> value to a hardcoded "true". This could be configurable.




[jira] [Commented] (LUCENE-1486) Wildcards, ORs etc inside Phrase queries

2014-03-16 Thread Erick Erickson (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937207#comment-13937207
 ] 

Erick Erickson commented on LUCENE-1486:


What about the stopwords bit? yet another JIRA?



> Wildcards, ORs etc inside Phrase queries
> 
>
> Key: LUCENE-1486
> URL: https://issues.apache.org/jira/browse/LUCENE-1486
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/queryparser
>Affects Versions: 2.4
>Reporter: Mark Harwood
>Assignee: Erick Erickson
>Priority: Minor
> Fix For: 4.8
>
> Attachments: ComplexPhraseQueryParser.java, LUCENE-1486.patch, 
> LUCENE-1486.patch, LUCENE-1486.patch, LUCENE-1486.patch, LUCENE-1486.patch, 
> LUCENE-1486.patch, LUCENE-1486.patch, Lucene-1486 non default field.patch, 
> TestComplexPhraseQuery.java, junit_complex_phrase_qp_07_21_2009.patch, 
> junit_complex_phrase_qp_07_22_2009.patch
>
>
> An extension to the default QueryParser that overrides the parsing of 
> PhraseQueries to allow more complex syntax e.g. wildcards in phrase queries.
> The implementation feels a little hacky - this is arguably better handled in 
> QueryParser itself. This works as a proof of concept  for much of the query 
> parser syntax. Examples from the Junit test include:
>   checkMatches("\"j*   smyth~\"", "1,2"); //wildcards and fuzzies 
> are OK in phrases
>   checkMatches("\"(jo* -john)  smith\"", "2"); // boolean logic 
> works
>   checkMatches("\"jo*  smith\"~2", "1,2,3"); // position logic 
> works.
>   
>   checkBadQuery("\"jo*  id:1 smith\""); //mixing fields in a 
> phrase is bad
>   checkBadQuery("\"jo* \"smith\" \""); //phrases inside phrases 
> is bad
>   checkBadQuery("\"jo* [sma TO smZ]\" \""); //range queries 
> inside phrases not supported
> Code plus Junit test to follow...




[jira] [Commented] (LUCENE-1486) Wildcards, ORs etc inside Phrase queries

2014-03-16 Thread Ahmet Arslan (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937204#comment-13937204
 ] 

Ahmet Arslan commented on LUCENE-1486:
--

bq. Should we raise Nikhil Chhaochharia's comment in a new JIRA to test at 
least?
I created LUCENE-5530 for fielded query support. 

> Wildcards, ORs etc inside Phrase queries
> 
>
> Key: LUCENE-1486
> URL: https://issues.apache.org/jira/browse/LUCENE-1486
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/queryparser
>Affects Versions: 2.4
>Reporter: Mark Harwood
>Assignee: Erick Erickson
>Priority: Minor
> Fix For: 4.8
>
> Attachments: ComplexPhraseQueryParser.java, LUCENE-1486.patch, 
> LUCENE-1486.patch, LUCENE-1486.patch, LUCENE-1486.patch, LUCENE-1486.patch, 
> LUCENE-1486.patch, LUCENE-1486.patch, Lucene-1486 non default field.patch, 
> TestComplexPhraseQuery.java, junit_complex_phrase_qp_07_21_2009.patch, 
> junit_complex_phrase_qp_07_22_2009.patch
>
>
> An extension to the default QueryParser that overrides the parsing of 
> PhraseQueries to allow more complex syntax e.g. wildcards in phrase queries.
> The implementation feels a little hacky - this is arguably better handled in 
> QueryParser itself. This works as a proof of concept  for much of the query 
> parser syntax. Examples from the Junit test include:
>   checkMatches("\"j*   smyth~\"", "1,2"); //wildcards and fuzzies 
> are OK in phrases
>   checkMatches("\"(jo* -john)  smith\"", "2"); // boolean logic 
> works
>   checkMatches("\"jo*  smith\"~2", "1,2,3"); // position logic 
> works.
>   
>   checkBadQuery("\"jo*  id:1 smith\""); //mixing fields in a 
> phrase is bad
>   checkBadQuery("\"jo* \"smith\" \""); //phrases inside phrases 
> is bad
>   checkBadQuery("\"jo* [sma TO smZ]\" \""); //range queries 
> inside phrases not supported
> Code plus Junit test to follow...




[jira] [Created] (LUCENE-5530) ComplexPhraseQueryParser throws ParseException for fielded queries

2014-03-16 Thread Ahmet Arslan (JIRA)

Ahmet Arslan created LUCENE-5530:


 Summary:  ComplexPhraseQueryParser throws ParseException for 
fielded queries
 Key: LUCENE-5530
 URL: https://issues.apache.org/jira/browse/LUCENE-5530
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/queryparser
Affects Versions: 4.7
Reporter: Ahmet Arslan
 Fix For: 4.8


Queries using a non-default field of QueryParser, e.g. 
author:"j* smith", are not supported by ComplexPhraseQueryParser. For example, 
the following code snippet 

{code}
ComplexPhraseQueryParser qp = new ComplexPhraseQueryParser(
    TEST_VERSION_CURRENT, "defaultField", new MockAnalyzer(new Random()));
qp.parse("author:\"fred* smith\"");
{code}

yields 

{noformat}
Caused by: org.apache.lucene.queryparser.classic.ParseException: Cannot have clause for field "defaultField" nested in phrase  for field "author"
  at org.apache.lucene.queryparser.complexPhrase.ComplexPhraseQueryParser.checkPhraseClauseIsForSameField(ComplexPhraseQueryParser.java:147)
  at org.apache.lucene.queryparser.complexPhrase.ComplexPhraseQueryParser.newTermQuery(ComplexPhraseQueryParser.java:135)
  ... 49 more
{noformat}




[jira] [Commented] (LUCENE-3758) Allow the ComplexPhraseQueryParser to search order or un-order proximity queries.

2014-03-16 Thread Erick Erickson (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937196#comment-13937196
 ] 

Erick Erickson commented on LUCENE-3758:


Just to make sure I understand Dimitry's comment about the # operator: I don't 
see anything in this patch on a quick look that references a new operator, so 
that's a separate issue, correct? I see in the related SOLR-1604 patch the 
ability to specify inOrder="true|false" as a local parameter, so this 
functionality is available at that level.

Frankly, I'd rather not introduce a new operator at this stage. Let's get the 
underlying functionality in place and treat any new operators as a separate 
issue, if we add one at all.

Any responses to the comment by [~rcmuir]? My quick response is that I've seen 
use-cases like this:
"Find all the variants of 'john anderson', including 'jonathan anderson' and 
'jon ivan gregory anderson', but not 'eric anderson and jonathan jones'". 
Contrived a bit, but you get the idea. Specifying slop alone doesn't allow this 
case, but slop with a specified order does.

I'm going to commit this, along with SOLR-1604, today unless there 
are objections. The patch doesn't change current behavior, so it seems pretty 
safe.
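
The difference between slop alone and slop with a fixed order can be shown with a small sketch. It is written in plain Java and does not use Lucene's SpanNearQuery; the class `ProximitySketch`, the method `near` and the prefix matching that stands in for a "jo*" wildcard are all assumptions, and the gap counting only approximates how a two-term span query treats slop.

```java
import java.util.List;

// Hypothetical illustration in plain Java; it does not use Lucene's
// SpanNearQuery and only approximates its two-term behaviour.
final class ProximitySketch {
    // True if a token starting with firstPrefix and the token `second` occur
    // with at most `slop` other tokens between them. When inOrder is true,
    // the prefix match must come before `second`.
    static boolean near(List<String> tokens, String firstPrefix, String second,
                        int slop, boolean inOrder) {
        for (int i = 0; i < tokens.size(); i++) {
            if (!tokens.get(i).startsWith(firstPrefix)) {
                continue;
            }
            for (int j = 0; j < tokens.size(); j++) {
                if (i == j || !tokens.get(j).equals(second)) {
                    continue;
                }
                if (inOrder && j < i) {
                    continue; // wrong order, ignore this pair
                }
                int gap = Math.abs(j - i) - 1; // tokens between the two matches
                if (gap <= slop) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> wanted = List.of("jon", "ivan", "gregory", "anderson");
        List<String> unwanted = List.of("eric", "anderson", "and", "jonathan", "jones");
        System.out.println(near(wanted, "jo", "anderson", 2, true));
        System.out.println(near(unwanted, "jo", "anderson", 2, true));
        System.out.println(near(unwanted, "jo", "anderson", 2, false));
    }
}
```

With a slop of 2, the ordered check accepts "jon ivan gregory anderson" and rejects "eric anderson and jonathan jones", while the unordered check also accepts the second one, which is the case described above.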

> Allow the ComplexPhraseQueryParser to search order or un-order proximity 
> queries.
> -
>
> Key: LUCENE-3758
> URL: https://issues.apache.org/jira/browse/LUCENE-3758
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/queryparser
>Affects Versions: 4.0-ALPHA
>Reporter: Tomás Fernández Löbbe
>Assignee: Erick Erickson
>Priority: Minor
> Fix For: 4.8
>
> Attachments: LUCENE-3758.patch, LUCENE-3758.patch
>
>
> The ComplexPhraseQueryParser use SpanNearQuery, but always set the "inOrder" 
> value hardcoded to "true". This could be configurable.




Re: SolrCloud and https

2014-03-16 Thread Steve Davids

Just to give everyone an update - I upgraded our SolrCloud cluster from
4.3.1 (manually patched for SSL) -> 4.7 and have run into a couple of
issues. I have created Jira tickets for them, and some fixes have already
been committed.

   1. The update shard handler wasn't using the system properties to pick
   up the javax.net.ssl.* configuration (SOLR-5866).
   2. Overseer collector doesn't use the right scheme in a small use case.
   I only came across this while perusing the code, but it will impact some
   select admin calls (SOLR-5867).
   3. Wiring up a custom HttpClientConfigurer has proven to be a bit more
   challenging, as the HttpClientUtil needs to set the new configurer before
   Solr begins to be constructed, so I was left with options such as setting
   up a webapp-listener in the webdefaults.xml to get a hook in before the
   Solr servlet gets loaded. If there is a better way to achieve this please
   let us know; I thought there *must* be a better way but it eluded me. For
   simplicity's sake I ended up just patching my custom HttpClientConfigurer
   into the war itself.

After further contemplation I thought Solr should be lenient about which
certificates are acceptable for communication within the cluster itself,
instead of using the default HttpClient configuration, which is stricter and
meant for communication with external sources. With that in mind I created a
ticket to set the default host name verifier to allow all hostnames
(SOLR-5868), and we may consider allowing self-signed certs as well. With those
changes the need to wire up a custom HttpClientConfigurer is greatly reduced.
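
For readers unfamiliar with the term, an "allow all hostnames" verifier looks roughly like this when written against the JDK's own javax.net.ssl.HostnameVerifier interface. This is only a sketch: the class and field names are assumptions, it is not the SOLR-5868 patch, and Solr's HttpClient-based code may configure the equivalent setting through a different mechanism.

```java
import javax.net.ssl.HostnameVerifier;

// Illustrative only: a verifier that accepts every host name. The class and
// field names are assumptions, not taken from Solr or the SOLR-5868 patch.
final class LenientHostnames {
    // Skips host name checking entirely; certificate chain validation is a
    // separate step and is not affected by this verifier.
    static final HostnameVerifier ALLOW_ALL = (hostname, session) -> true;

    public static void main(String[] args) {
        // A null session is fine here because the verifier ignores it.
        System.out.println(ALLOW_ALL.verify("solr-node-1.internal", null));
    }
}
```

Accepting every host name removes one of the checks that protects against impersonation, so a setting like this is only reasonable for traffic inside a trusted cluster, which is the case argued for above.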

To get 4.7 up and running in its current form, you can use the
following custom HttpClientConfigurer (which also fixes the problem from SOLR-5866):
https://gist.github.com/sdavids13/9577027

-Steve

On Thu, Mar 13, 2014 at 10:31 PM, Steve Davids  wrote:

> Glad to hear it works for you! It would be nice if we could upload the
> json file via the zk bootstrapping, it sure would make it a bit simpler.
>
> -Steve
>
> On Mar 13, 2014, at 10:19 PM, Erick Erickson 
> wrote:
>
> > Darn Windows. It turns out that this works (thanks Steve!)
> >
> > ./zkcli.sh -zkhost localhost:9983 -cmd put /clusterprops.json
> > '{"urlScheme":"https"}'
> >
> > but only if you escape the double quotes and remove the ticks, as:
> >
> > ./zkcli.sh -zkhost localhost:9983 -cmd put /clusterprops.json
> > {\"urlScheme\":\"https\"}
> >
> > Otherwise clusterprops.json contains the ticks as well.
> >
> > Got it working though
> >
> > On Thu, Mar 13, 2014 at 9:43 AM, Erick Erickson 
> wrote:
> >> I was thinking about that but haven't had a chance to catch my breath.
> >>
> >> Thanks for letting me know where the link is...
> >>
> >> Erick
> >>
> >> On Thu, Mar 13, 2014 at 9:08 AM, Cassandra Targett
> >>  wrote:
> >>> This needs to also make its way into the Solr Ref Guide - stuff
> >>> documented on the wiki doesn't automatically get into the Solr
> >>> Reference Guide without human intervention.
> >>>
> >>> There is an issue already to document this in the guide, so if you do
> >>> add something to the Solr Wiki, please add a link to the page to
> >>> https://issues.apache.org/jira/browse/SOLR-5757 so it can be officially
> >>> documented.
> >>>
> >>> Thanks,
> >>> Cassandra
> >>>
> >>>
> >>> On Wed, Mar 12, 2014 at 7:19 PM, Erick Erickson <
> erickerick...@gmail.com>
> >>> wrote:
> 
>  Steve:
> 
>  It would be a great service if you were willing to document this on
>  the Wiki. If you don't already have contributor rights, just create a
>  logon on the Wiki, send us your logon ID and we'll add you to the
>  approved editors list.
> 
>  A bit of background: We used to let anyone edit the Wiki, but then
>  started getting hit with a billion spam pages so had to lock it down.
>  As long as we're convinced it's a real person asking for edit rights,
>  they're freely granted!
> 
>  Best,
>  Erick
> 
> 
>  On Wed, Mar 12, 2014 at 8:15 PM, Steve Davids 
> wrote:
> > I will be upgrading my SolrCloud cluster at work in a couple of days
> > (hand patched former builds) and will let everyone know if there are any
> > other gotchas. I know that, depending on the case, you may need to bundle
> > your own HttpClientConfigurer to use the AllowAllHostnameVerifier (if
> > using a single cert for all instances) or to add the
> > TrustSelfSignedStrategy if using two-way SSL w/ self-signed certs.
> >
> > -Steve
> >
> > On Mar 12, 2014, at 8:05 PM, Erick Erickson  >
> > wrote:
> >
> > Steve:
> >
> > Thanks, I confess confusion about all things HTTPS. I'll turn this
> > over to the people who _do_ know about it in the morning, this is a
> > great help in that it tells us whe

Re: JIRA SPAM HELP IN GMAIL

2014-03-16 Thread Yonik Seeley

On Sun, Mar 16, 2014 at 10:50 AM, Furkan KAMACI  wrote:
> I use Gmail too and I've divided dev e-mails with filters and assigned them
> labels. I have dev (includes everything), dev-jira, and dev-filtered (does
> not include jira e-mails). You can also filter for VOTE, CONF, JENKINS,
> etc. So it is easy to focus on what you want.

Yeah, I already make heavy use of filters.
This was really about removing a ton of emails that had no information
content though (I normally do want to see JIRA updates otherwise).

-Yonik
http://heliosearch.org - solve Solr GC pauses with off-heap filters
and fieldcache

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Updated] (SOLR-5868) HttpClient should be configured to use ALLOW_ALL_HOSTNAME hostname verifier to simplify SSL setup

2014-03-16 Thread Steve Davids (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-5868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Davids updated SOLR-5868:
---

Attachment: SOLR-5868.patch

Patch attached to work in the current form of the trunk (non HttpClientBuilder 
version).

> HttpClient should be configured to use ALLOW_ALL_HOSTNAME hostname verifier 
> to simplify SSL setup
> -
>
> Key: SOLR-5868
> URL: https://issues.apache.org/jira/browse/SOLR-5868
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 4.7
>Reporter: Steve Davids
> Fix For: 4.8
>
> Attachments: SOLR-5868.patch
>
>
> The default HttpClient hostname verifier is the 
> BROWSER_COMPATIBLE_HOSTNAME_VERIFIER, which verifies that the hostname 
> being connected to matches the hostname presented within the certificate. 
> This is meant to protect clients that are making external requests out across 
> the internet, but requests within the Solr cluster should be trusted, so 
> verification can be relaxed to simplify the SSL/certificate setup process.
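
To illustrate the difference between a strict and an allow-all verifier, here
is a minimal sketch using the JDK's javax.net.ssl.HostnameVerifier interface.
Note the assumptions: this is not HttpClient's own verifier type and not the
SOLR-5868 patch, and the names HostnameVerifiers and exactly() are
hypothetical.

```java
import javax.net.ssl.HostnameVerifier;

final class HostnameVerifiers {
    /** Accepts any hostname, similar in spirit to an ALLOW_ALL verifier. */
    static final HostnameVerifier ALLOW_ALL = (hostname, session) -> true;

    /** Hypothetical stricter variant: accepts only one expected host name. */
    static HostnameVerifier exactly(String expected) {
        return (hostname, session) -> expected.equalsIgnoreCase(hostname);
    }
}
```

An allow-all verifier still leaves certificate trust checks in place; it only
stops comparing the certificate's name with the host being contacted, which is
why it is reasonable only for traffic inside a trusted cluster.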



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (LUCENE-5529) Spatial: Small optimization searching on indexed non-point shapes

2014-03-16 Thread David Smiley (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-5529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated LUCENE-5529:
-

Attachment: LUCENE-5529_Skip_redundant_non-point_scanned_cells.patch

In my testing this resulted in 1-3% increase on circles; it'll likely be 
greater for polygon query shapes where it's more expensive to do an 
intersection test.

The patch includes unrelated TODOs on spatial classes for things I want to get 
to in the near future. It also includes a small change to query equality 
(equals & hashcode) such that a tuning parameter isn't included because it 
doesn't change the semantics of the query.
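
As a rough illustration of the equality change, the sketch below leaves a
tuning field out of equals() and hashCode(). The class ShapeQuerySketch and
its fields are invented for this example and are not the classes touched by
the patch.

```java
import java.util.Objects;

final class ShapeQuerySketch {
    private final String field;   // indexed field being searched
    private final String shape;   // query shape, e.g. as a WKT string
    private final int scanLevel;  // hypothetical tuning knob: affects speed only

    ShapeQuerySketch(String field, String shape, int scanLevel) {
        this.field = Objects.requireNonNull(field);
        this.shape = Objects.requireNonNull(shape);
        this.scanLevel = scanLevel;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ShapeQuerySketch)) return false;
        ShapeQuerySketch other = (ShapeQuerySketch) o;
        // scanLevel is deliberately left out: it does not change which documents match.
        return field.equals(other.field) && shape.equals(other.shape);
    }

    @Override
    public int hashCode() {
        // Must use the same fields as equals() to keep the contract.
        return Objects.hash(field, shape);
    }
}
```

With this shape, two queries that differ only in the tuning value compare as
equal, so they would map to the same entry in any cache keyed by the query.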

I'll commit this Monday.

> Spatial: Small optimization searching on indexed non-point shapes
> -
>
> Key: LUCENE-5529
> URL: https://issues.apache.org/jira/browse/LUCENE-5529
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/spatial
>Reporter: David Smiley
>Assignee: David Smiley
>Priority: Minor
> Fix For: 4.8
>
> Attachments: LUCENE-5529_Skip_redundant_non-point_scanned_cells.patch
>
>
> When searching for indexed non-point shapes (such as polygons), there are 
> redundant cells which can be skipped at the bottom "detail level" of the 
> search.  This won't be a problem once LUCENE-4942 is fixed since there then 
> won't be any but it's easy to fix now.
> This affects all predicates RecursivePrefixTreeStrategy uses except Contains.




Re: JIRA SPAM HELP IN GMAIL

2014-03-16 Thread Furkan KAMACI

Hi Yonik;

I use Gmail too and I've divided dev e-mails with filters and assigned them
labels. I have dev (includes everything), dev-jira, and dev-filtered (does
not include jira e-mails). You can also filter for VOTE, CONF, JENKINS,
etc. So it is easy to focus on what you want.

Thanks;
Furkan KAMACI


2014-03-16 15:51 GMT+02:00 Yonik Seeley :

> Forgive the caps... I figured it might show up better amongst the
> flood of JIRA messages.
>
> If you're a gmail user and want to clean up your inbox from the latest
> flood:
> - go to settings, then the "general" tab, and select "conversation view
> off"
> - do a search for "Fix Version/s: (was: 4.7)"(use the quotes!)
> - select "all", then press "select all messages that match this search"
> - archive or delete the messages
> - return to your inbox and the messages should be gone (the search
> view will continue displaying the messages even after they have been
> archived)
> - go back to settings and restore "conversation view on" if that's
> what you started with
>
>
> Any gmail gurus out there know how to select messages (as opposed to
> entire conversations) when in conversation mode?  (basically, any way
> to skip step #1?)
>
>
> -Yonik
> http://heliosearch.org - solve Solr GC pauses with off-heap filters
> and fieldcache
>

[jira] [Commented] (SOLR-5838) Relative SolrHome Path Bug At AbstractFullDistribZkTestBase

2014-03-16 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937181#comment-13937181
 ] 

ASF subversion and git services commented on SOLR-5838:
---

Commit 1578090 from sha...@apache.org in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1578090 ]

SOLR-5838: Relative SolrHome Path Bug At AbstractFullDistribZkTestBase

> Relative SolrHome Path Bug At AbstractFullDistribZkTestBase 
> 
>
> Key: SOLR-5838
> URL: https://issues.apache.org/jira/browse/SOLR-5838
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.6.1, 4.7
>Reporter: Furkan KAMACI
>Priority: Minor
> Fix For: 4.8, 5.0
>
> Attachments: SOLR-5838.patch
>
>
> getRelativeSolrHomePath method at AbstractFullDistribZkTestBase has a control 
> like that:
> {code}
> if (base.startsWith("."));
> base.replaceFirst("\\.", new File(".").getName());
> {code}
> if statement does nothing and result of replaceFirst is ignored.
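
A minimal sketch of the corrected logic, assuming the intent was to replace a
leading "." only when one is present. The helper name replaceLeadingDot and
its replacement parameter are illustrative and are not taken from the
committed patch.

```java
import java.util.regex.Matcher;

final class RelativePathFix {
    /**
     * Replaces a leading "." in base with the given replacement.
     * Hypothetical helper for illustration; not the SOLR-5838 patch itself.
     */
    static String replaceLeadingDot(String base, String replacement) {
        // No stray ';' after the condition, so the body really is conditional.
        if (base.startsWith(".")) {
            // Strings are immutable: replaceFirst returns a new String,
            // which has to be assigned back to have any effect.
            base = base.replaceFirst("\\.", Matcher.quoteReplacement(replacement));
        }
        return base;
    }

    public static void main(String[] args) {
        System.out.println(replaceLeadingDot("./solr/home", "work")); // work/solr/home
        System.out.println(replaceLeadingDot("solr/home", "work"));   // solr/home
    }
}
```

Matcher.quoteReplacement guards against "$" or "\" in the replacement being
read as regex group references.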




[jira] [Created] (LUCENE-5529) Spatial: Small optimization searching on indexed non-point shapes

2014-03-16 Thread David Smiley (JIRA)

David Smiley created LUCENE-5529:


 Summary: Spatial: Small optimization searching on indexed 
non-point shapes
 Key: LUCENE-5529
 URL: https://issues.apache.org/jira/browse/LUCENE-5529
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/spatial
Reporter: David Smiley
Assignee: David Smiley
Priority: Minor
 Fix For: 4.8


When searching for indexed non-point shapes (such as polygons), there are 
redundant cells which can be skipped at the bottom "detail level" of the 
search.  This won't be a problem once LUCENE-4942 is fixed since there then 
won't be any but it's easy to fix now.

This affects all predicates RecursivePrefixTreeStrategy uses except Contains.




[jira] [Resolved] (SOLR-5838) Relative SolrHome Path Bug At AbstractFullDistribZkTestBase

2014-03-16 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-5838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-5838.
-

Resolution: Fixed
  Assignee: Shalin Shekhar Mangar

Thanks Furkan!

> Relative SolrHome Path Bug At AbstractFullDistribZkTestBase 
> 
>
> Key: SOLR-5838
> URL: https://issues.apache.org/jira/browse/SOLR-5838
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.6.1, 4.7
>Reporter: Furkan KAMACI
>Assignee: Shalin Shekhar Mangar
>Priority: Minor
> Fix For: 4.8, 5.0
>
> Attachments: SOLR-5838.patch
>
>
> getRelativeSolrHomePath method at AbstractFullDistribZkTestBase has a control 
> like that:
> {code}
> if (base.startsWith("."));
> base.replaceFirst("\\.", new File(".").getName());
> {code}
> if statement does nothing and result of replaceFirst is ignored.




[jira] [Commented] (SOLR-5838) Relative SolrHome Path Bug At AbstractFullDistribZkTestBase

2014-03-16 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937180#comment-13937180
 ] 

ASF subversion and git services commented on SOLR-5838:
---

Commit 1578089 from sha...@apache.org in branch 'dev/trunk'
[ https://svn.apache.org/r1578089 ]

SOLR-5838: Relative SolrHome Path Bug At AbstractFullDistribZkTestBase


Re: Use of fix version

2014-03-16 Thread David Smiley (@MITRE.org)

Jack Krupansky-2 wrote
> And maybe there needs to be special formatting to highlight the importance 
> of "Uncheck the box that says "send an email for these changes"."",
> although 
> the omission of that step did highlight the main issue I mentioned.

My error was not overlooking that step; it was letting JIRA bump the
fix-for versions when I chose "Release" from the menu next to the version
on the versions screen.  So I added explicit instructions about that choice
to the wiki.

Sorry again.

To your larger point about fix-for versions... we can't make users use them
the way we might want, so there's little benefit in trying to have the field
reflect some particular meaning (i.e. that the issue really is likely to be
done by the fix-for version).

~ David




-
 Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Use-of-fix-version-tp4124560p4124561.html
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.


Use of fix version

2014-03-16 Thread Jack Krupansky

"all issues with Unresolved Resolution and fixVersion of the release you 
just made, and do a bulk change to the fixVersion to be both the trunk 
version and the next version on the branch you just released from.  Uncheck 
the box that says "send an email for these changes"."


It seems to me that a lot of issues are indicated to be fixed in the next 
dot release when they are in fact very unlikely to be fixed in the next dot 
release - or necessarily in trunk for that matter. Could someone explain the 
rationale? I mean, shouldn't "Fix Version: 4.8" mean that there is a very 
high likelihood of resolution in 4.8, rather than mere wishful thinking or 
bulk bureaucratic assignment? I mean, wouldn't it make more sense to use a 
more "agile" methodology, with most issues being in an unassigned "backlog", 
and being very selective what is targeted for the current "sprint"/next dot 
release? Ditto for trunk - shouldn't it be more selective so that we can see 
how close we are to finishing the high priority issues needed for a trunk 
release?


I mean, how useful is fix version in its current form?

And maybe there needs to be special formatting to highlight the importance 
of "Uncheck the box that says "send an email for these changes"."", although 
the omission of that step did highlight the main issue I mentioned.


-- Jack Krupansky

-Original Message- 
From: Apache Wiki

Sent: Sunday, March 16, 2014 9:06 AM
To: Apache Wiki
Subject: [Lucene-java Wiki] Update of "ReleaseTodo" by DavidSmiley

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-java Wiki" 
for change notification.


The "ReleaseTodo" page has been changed by DavidSmiley:
https://wiki.apache.org/lucene-java/ReleaseTodo?action=diff&rev1=165&rev2=166

Comment:
Releasing in JIRA requires more choosing the "Release" option. And clarify 
to not transition issues then.


 = Post-release =

 == Update JIRA ==
-  1. Go to the JIRA "Manage Versions" Administration pages 
(https://issues.apache.org/jira/plugins/servlet/project-config/LUCENE/versions 
and 
https://issues.apache.org/jira/plugins/servlet/project-config/SOLR/versions), 
click on the release date field for the version you just released, put in 
the release date, and then click the "Update" button.
+  1. Go to the JIRA "Manage Versions" Administration pages 
(https://issues.apache.org/jira/plugins/servlet/project-config/LUCENE/versions 
and 
https://issues.apache.org/jira/plugins/servlet/project-config/SOLR/versions). 
Next to the version you'll release, click the gear pop-up menu icon and 
choose "Release".  It will ask you for the release date -- enter it.  It 
will give the option of transitioning issues marked fix-for the released 
version to the next version, but do '''not''' do this as it will send an 
email for each issue -- we'll address that separately.
-  1. Go to JIRA search in both Solr and Lucene and find all issues that 
were fixed in the release you just made, whose Status is Resolved, and do a 
bulk change to close all of these issues. Uncheck the box that says "send an 
email for these changes".
+  1. Go to JIRA search in both Solr and Lucene and find all issues that 
were fixed in the release you just made, whose Status is Resolved, and do a 
bulk change to close all of these issues (this is a workflow transition 
task). Uncheck the box that says "send an email for these changes".
  1. Do another JIRA search in both Solr and Lucene to find all issues with 
Unresolved Resolution and fixVersion of the release you just made, and do a 
bulk change to the fixVersion to be both the trunk version and the next 
version on the branch you just released from.  Uncheck the box that says 
"send an email for these changes".


 == Don't mirror old releases == 




[jira] [Commented] (SOLR-5550) shards.info is not returned in case of short circuited distributed query

2014-03-16 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937177#comment-13937177
 ] 

Shalin Shekhar Mangar commented on SOLR-5550:
-

Oh, just fyi, I made one change before committing the patch. I reduced 
shardCount to 4 from 8. No need to spin up extra replicas here.

> shards.info is not returned in case of short circuited distributed query
> 
>
> Key: SOLR-5550
> URL: https://issues.apache.org/jira/browse/SOLR-5550
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 4.6
>Reporter: Shalin Shekhar Mangar
>Assignee: Shalin Shekhar Mangar
>Priority: Minor
> Fix For: 4.8, 5.0
>
> Attachments: SOLR-5550.patch, SOLR-5550.patch
>
>
> Distributed queries which are short circuited and executed locally do not 
> return a shards.info section even when requested.
> Steps to reproduce:
> # cd solr; ant example; cp -r example example2
> # cd example; java -Dbootstrap_confdir=./solr/collection1/conf 
> -Dcollection.configName=conf1 -DzkRun -DnumShards=2 -jar start.jar
> # cd example2; java -Djetty.port=7574 -DzkHost=localhost:9983 -jar start.jar
> # curl 
> http://localhost:8983/solr/admin/collections?action=CREATE&collection=test1&name=test1&numShards=2&collection.configName=conf1&maxShardsPerNode=3
> # Add two docs:
> {code}
> 
>   
> a!1
> xyz
> 2.00
>   
>   
> b!1
> abc
> 5.00
>   
> 
> {code}
> # curl 
> http://localhost:8983/admin/cores?name=test1_shard2_replica2&collection=test1&shard=shard2
> # curl 
> http://localhost:8983/solr/test1_shard2_replica1/select?_route_=b!&fl=*&start=0&q=*:*&shards.info=true&collection=test1&rows=10
> # The above will not return shards.info
> # curl 
> http://localhost:7574/solr/test1/select?_route_=b!&fl=*&start=0&q=*:*&shards.info=true&collection=test1&rows=10
> # The above will return shards.info




[jira] [Resolved] (SOLR-5550) shards.info is not returned in case of short circuited distributed query

2014-03-16 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-5550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-5550.
-

Resolution: Fixed

Thanks Tim!


[jira] [Commented] (SOLR-5550) shards.info is not returned in case of short circuited distributed query

2014-03-16 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937174#comment-13937174
 ] 

ASF subversion and git services commented on SOLR-5550:
---

Commit 1578083 from sha...@apache.org in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1578083 ]

SOLR-5550: shards.info is not returned by a short circuited distributed query


[jira] [Commented] (SOLR-5550) shards.info is not returned in case of short circuited distributed query

2014-03-16 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937172#comment-13937172
 ] 

ASF subversion and git services commented on SOLR-5550:
---

Commit 1578078 from sha...@apache.org in branch 'dev/trunk'
[ https://svn.apache.org/r1578078 ]

SOLR-5550: shards.info is not returned by a short circuited distributed query


[jira] [Commented] (LUCENE-1486) Wildcards, ORs etc inside Phrase queries

2014-03-16 Thread Erick Erickson (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937169#comment-13937169
 ] 

Erick Erickson commented on LUCENE-1486:


OK, this seems like it's completely obsolete, any objections to closing? Should 
we raise Nikhil Chhaochharia's comment in a new JIRA to test at least?

> Wildcards, ORs etc inside Phrase queries
> 
>
> Key: LUCENE-1486
> URL: https://issues.apache.org/jira/browse/LUCENE-1486
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/queryparser
>Affects Versions: 2.4
>Reporter: Mark Harwood
>Assignee: Erick Erickson
>Priority: Minor
> Fix For: 4.8
>
> Attachments: ComplexPhraseQueryParser.java, LUCENE-1486.patch, 
> LUCENE-1486.patch, LUCENE-1486.patch, LUCENE-1486.patch, LUCENE-1486.patch, 
> LUCENE-1486.patch, LUCENE-1486.patch, Lucene-1486 non default field.patch, 
> TestComplexPhraseQuery.java, junit_complex_phrase_qp_07_21_2009.patch, 
> junit_complex_phrase_qp_07_22_2009.patch
>
>
> An extension to the default QueryParser that overrides the parsing of 
> PhraseQueries to allow more complex syntax e.g. wildcards in phrase queries.
> The implementation feels a little hacky - this is arguably better handled in 
> QueryParser itself. This works as a proof of concept  for much of the query 
> parser syntax. Examples from the Junit test include:
>   checkMatches("\"j*   smyth~\"", "1,2"); // wildcards and fuzzies are OK in phrases
>   checkMatches("\"(jo* -john)  smith\"", "2"); // boolean logic works
>   checkMatches("\"jo*  smith\"~2", "1,2,3"); // position logic works
>
>   checkBadQuery("\"jo*  id:1 smith\""); // mixing fields in a phrase is bad
>   checkBadQuery("\"jo* \"smith\" \""); // phrases inside phrases is bad
>   checkBadQuery("\"jo* [sma TO smZ]\" \""); // range queries inside phrases not supported
> Code plus Junit test to follow...




JIRA SPAM HELP IN GMAIL

2014-03-16 Thread Yonik Seeley

Forgive the caps... I figured it might show up better amongst the
flood of JIRA messages.

If you're a gmail user and want to clean up your inbox from the latest flood:
- go to settings, then the "general" tab, and select "conversation view off"
- do a search for "Fix Version/s: (was: 4.7)"(use the quotes!)
- select "all", then press "select all messages that match this search"
- archive or delete the messages
- return to your inbox and the messages should be gone (the search
view will continue displaying the messages even after they have been
archived)
- go back to settings and restore "conversation view on" if that's
what you started with


Any gmail gurus out there know how to select messages (as opposed to
entire conversations) when in conversation mode?  (basically, any way
to skip step #1?)


-Yonik
http://heliosearch.org - solve Solr GC pauses with off-heap filters
and fieldcache


[jira] [Commented] (SOLR-5869) Analyze Lucene/Solr Project Via Sonar

2014-03-16 Thread Furkan KAMACI (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937165#comment-13937165
 ] 

Furkan KAMACI commented on SOLR-5869:
-

I'll implement a patch for this issue.

> Analyze Lucene/Solr Project Via Sonar
> -
>
> Key: SOLR-5869
> URL: https://issues.apache.org/jira/browse/SOLR-5869
> Project: Solr
>  Issue Type: Task
>Affects Versions: 4.6.1, 4.7
>Reporter: Furkan KAMACI
> Fix For: 4.8
>
>
> Sonar is an open platform used to manage code quality. You can check it out 
> here: http://www.sonarqube.org/
> It would be nice if we analyzed the Lucene/Solr project with Sonar. You can 
> find a list of Apache projects that are analyzed via Sonar here: 
> https://analysis.apache.org/all_projects?qualifier=TRK
> We should add our project to the Sonar instance available at Apache.




[jira] [Created] (SOLR-5869) Analyze Lucene/Solr Project Via Sonar

2014-03-16 Thread Furkan KAMACI (JIRA)

Furkan KAMACI created SOLR-5869:
---

 Summary: Analyze Lucene/Solr Project Via Sonar
 Key: SOLR-5869
 URL: https://issues.apache.org/jira/browse/SOLR-5869
 Project: Solr
  Issue Type: Task
Affects Versions: 4.7, 4.6.1
Reporter: Furkan KAMACI
 Fix For: 4.8


Sonar is an open platform used to manage code quality. You can check it out 
here: http://www.sonarqube.org/

It would be nice if we analyzed the Lucene/Solr project with Sonar. You can 
find a list of Apache projects that are analyzed via Sonar here: 
https://analysis.apache.org/all_projects?qualifier=TRK

We should add our project to the Sonar instance available at Apache.




Sorry for JIRA spam

2014-03-16 Thread David Smiley (@MITRE.org)

Sorry for all the email spam last night, folks.

I "Released" Lucene & Solr 4.7 in JIRA last night.  I updated the
instructions here
https://wiki.apache.org/lucene-java/ReleaseTodo#Update_JIRA to explicitly
indicate *not* to have JIRA bump the Fix-version values.

~ David



-
 Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Sorry-for-JIRA-spam-tp4124545.html
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.


Re: Any reason Solr Jira still lists 4.7 as unreleased?

2014-03-16 Thread Robert Muir

next time can you please use the option to suppress emails?

On Sun, Mar 16, 2014 at 9:42 AM, David Smiley (@MITRE.org)
 wrote:
> I fixed this last night & this morning.
>
>
> Alexandre Rafalovitch wrote
>> I was doing some searching on issues and noticed 4.7 is listed in
>> "Unreleased versions". Also, I have a couple of open issues that
>> (possibly due to my mistake) are marked as open but target at 4.7.
>>
>> Was not sure if this is a process that normally lags the actual
>> version release or something to notify. So, I am notifying.
>>
>> Regards,
>>Alex.
>> Personal website: http://www.outerthoughts.com/
>> LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
>> - Time is the quality of nature that keeps events from happening all
>> at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
>> book)
>>
>
>
>
>
>
> -
>  Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Any-reason-Solr-Jira-still-lists-4-7-as-unreleased-tp4123605p4124544.html
> Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>


Re: Any reason Solr Jira still lists 4.7 as unreleased?

2014-03-16 Thread David Smiley (@MITRE.org)

I fixed this last night & this morning.


Alexandre Rafalovitch wrote
> I was doing some searching on issues and noticed 4.7 is listed in
> "Unreleased versions". Also, I have a couple of open issues that
> (possibly due to my mistake) are marked as open but target at 4.7.
> 
> Was not sure if this is a process that normally lags the actual
> version release or something to notify. So, I am notifying.
> 
> Regards,
>Alex.
> Personal website: http://www.outerthoughts.com/
> LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
> - Time is the quality of nature that keeps events from happening all
> at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
> book)
> 





-
 Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Any-reason-Solr-Jira-still-lists-4-7-as-unreleased-tp4123605p4124544.html
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.


Re: Reducing the number of warnings in the codebase

2014-03-16 Thread Furkan KAMACI

Hi;

Here is the list of Apache projects that are analyzed via Sonar:
https://analysis.apache.org/all_projects?qualifier=TRK

Thanks;
Furkan KAMACI


2014-03-16 15:37 GMT+02:00 Furkan KAMACI :

> Hi;
>
> Thanks Michael. I will open a Jira issue for it.
>
> Thanks;
> Furkan KAMACI
>
>
> 2014-03-16 13:35 GMT+02:00 Michael McCandless :
>
> On Sun, Mar 16, 2014 at 6:09 AM, Furkan KAMACI 
>> wrote:
>>
>> > I've run FindBugs for Lucene/Solr project. If you use Intellij IDEA you can
>> > group the warnings according to their importance. I've opened issues and
>> > attached patches for top level warnings/errors (and some others) that
>> > FindBugs found.
>> >
>> > On the other hand I have another suggestion for Lucene/Solr project. When I
>> > develop or lead projects I use Sonar. It's so good and it runs really nice
>> > open source projects to analyze your code. FindBugs, PMD, Jacoco are just
>> > some of them. It also calculates the method complexities, LoC and etc. You
>> > can see a live example from here:
>> > https://sonar.springsource.org/dashboard/index/4824
>>
>> +1, Sonar looks really nice!
>>
>> > I can be volunteer to integrate Sonar into Lucene/Solr project.
>>
>> Thank you Furkan.
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>>
>
