[jira] [Commented] (SOLR-2894) Implement distributed pivot faceting

2014-03-10 Thread Elran Dvir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930026#comment-13930026
 ] 

Elran Dvir commented on SOLR-2894:
--

No.
It doesn't happen when I use facet.limit=-1 instead of the 
f.fieldname.facet.limit syntax.

Thanks.
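For reference, the two spellings being compared are Solr's global `facet.limit` and the per-field override `f.<fieldname>.facet.limit`. A minimal sketch of how the two parameter sets differ (the field name `category` is made up for illustration):

```python
# Sketch of the two ways to request unlimited facet counts in Solr.
# facet.limit and f.<fieldname>.facet.limit are real Solr parameters;
# the field name "category" is illustrative only.
def facet_params(field, per_field=True):
    """Build query params for unlimited facet counts on `field`."""
    params = {"q": "*:*", "facet": "true", "facet.field": field}
    if per_field:
        # per-field override syntax: f.<fieldname>.facet.limit
        params["f.%s.facet.limit" % field] = "-1"
    else:
        # global syntax applying to every facet field
        params["facet.limit"] = "-1"
    return params

p = facet_params("category", per_field=True)
assert p["f.category.facet.limit"] == "-1"
q = facet_params("category", per_field=False)
assert q["facet.limit"] == "-1"
```

Both forms ask for unlimited terms; the report above is that only the per-field form triggers the bug.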

> Implement distributed pivot faceting
> 
>
> Key: SOLR-2894
> URL: https://issues.apache.org/jira/browse/SOLR-2894
> Project: Solr
>  Issue Type: Improvement
>Reporter: Erik Hatcher
> Fix For: 4.7
>
> Attachments: SOLR-2894-reworked.patch, SOLR-2894.patch, 
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
> dateToObject.patch
>
>
> Following up on SOLR-792, pivot faceting currently only supports 
> undistributed mode.  Distributed pivot faceting needs to be implemented.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5518) minor hunspell optimizations

2014-03-10 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-5518:


Attachment: LUCENE-5518.patch

OK, now 3 times faster.

The condition check is moved before applyAffix, and the 
StringBuilder/String/utf8ToString overhead is removed (strips are deduplicated 
into a single giant char[]).

Other things to speed this up are more complicated: I don't think this makes it 
too much worse right now.
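The strip-deduplication idea can be sketched outside of Lucene; this is an illustrative Python model of the technique (shared backing buffer plus per-affix offsets), not the actual patch:

```python
def dedup_strips(strips):
    """Deduplicate strip strings into one backing buffer.

    Returns (buffer, refs) where refs[i] = (offset, length) locates
    strips[i] inside buffer. Equal strings share one copy, so the
    per-affix String objects can be dropped.
    """
    seen = {}     # strip -> offset of its single copy in the buffer
    parts = []
    refs = []
    pos = 0
    for s in strips:
        if s not in seen:
            seen[s] = pos
            parts.append(s)
            pos += len(s)
        refs.append((seen[s], len(s)))
    return "".join(parts), refs

buf, refs = dedup_strips(["ung", "e", "ung", ""])
# every ref round-trips to its original strip, but "ung" is stored once
assert [buf[o:o + n] for o, n in refs] == ["ung", "e", "ung", ""]
assert buf == "unge"
```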

> minor hunspell optimizations
> 
>
> Key: LUCENE-5518
> URL: https://issues.apache.org/jira/browse/LUCENE-5518
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/analysis
>Reporter: Robert Muir
> Attachments: LUCENE-5518.patch, LUCENE-5518.patch
>
>
> After benchmarking indexing speed on SOLR-3245, I ran a profiler and a couple 
> things stood out.
> There are other things I want to improve too, but these almost double the 
> speed for many dictionaries.
> * Hunspell supports two-stage affix stripping, but the vast majority of 
> dictionaries don't have any affixes that support it. So we just add a boolean 
> (Dictionary.twoStageAffix) that is false until we see one.
> * We use java.util.regex.Pattern for condition checks. This is slow; I 
> switched to o.a.l.automaton, and it's much faster and uses slightly less RAM 
> too.






[jira] [Updated] (LUCENE-5515) Improve TopDocs#merge for pagination

2014-03-10 Thread Martijn van Groningen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn van Groningen updated LUCENE-5515:
--

Attachment: LUCENE-5515.patch

Thanks for taking a look at it Mike, I added a new version of the patch.

> Improve TopDocs#merge for pagination
> 
>
> Key: LUCENE-5515
> URL: https://issues.apache.org/jira/browse/LUCENE-5515
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Martijn van Groningen
>Assignee: Martijn van Groningen
>Priority: Minor
> Fix For: 4.8
>
> Attachments: LUCENE-5515.patch, LUCENE-5515.patch
>
>
> If TopDocs#merge takes from and size into account, it can be optimized to 
> create a hits ScoreDoc array of length size instead of from+size, as is 
> currently the case.
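The optimization described above can be sketched as follows; this is an illustrative Python model of a paginated shard merge, not the Lucene implementation:

```python
import heapq
from itertools import islice

def merge_top_docs(shard_hits, start, size):
    """Merge per-shard hit lists (each sorted by descending score) and
    keep only the `size` hits of the requested page, so the result is
    `size` long rather than `start + size`."""
    merged = heapq.merge(*shard_hits, key=lambda hit: -hit[0])
    return list(islice(merged, start, start + size))

# two shards, each already sorted by descending (score, doc) pairs
shards = [[(0.9, "a"), (0.4, "c")], [(0.7, "b"), (0.1, "d")]]
assert merge_top_docs(shards, 1, 2) == [(0.7, "b"), (0.4, "c")]
assert merge_top_docs(shards, 0, 1) == [(0.9, "a")]
```

`islice` never materializes the first `start` hits into the result array, which is the point of the patch.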






[jira] [Commented] (SOLR-4968) The collection alias api should have a list cmd.

2014-03-10 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13929918#comment-13929918
 ] 

Shawn Heisey commented on SOLR-4968:


{quote}
Is there currently a workaround to getting this list of aliases?

Where do they get stored? If they only get stored in Zookeeper then how can 
they be backed up in case of a Zookeeper failure where all the config needs to 
be reloaded back into Zookeeper? Would I just have to recreate all of the 
aliases?
{quote}

As far as I know, they are indeed only in ZooKeeper.  You can see them in the 
Admin UI by clicking the Cloud tab, then Tree, then /aliases.json in the tree 
view.  You should maintain documentation on how you built your SolrCloud and 
ZooKeeper configs so you can recreate them if you lose them entirely.

A fully redundant ZooKeeper ensemble with three or more hosts should keep you 
from encountering a situation where you have to entirely reconstruct the ZK 
database, but you do bring up a good point - it is always a good idea to 
have actual backups in case of severe bugs, human error, or malicious intent.

Here's some information at the ZooKeeper level on maintenance and data file 
management:

http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_dataFileManagement

I found these URLs in a Stack Overflow question about backing up zookeeper:

http://stackoverflow.com/questions/6394140/how-do-you-backup-zookeeper
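As a sketch of what reading that node back might look like, here is an illustrative parser for an /aliases.json dump; the payload shape shown is an assumption modeled on the node's contents, not an authoritative schema:

```python
import json

# Illustrative aliases.json payload; the real data lives at /aliases.json
# in ZooKeeper. The exact shape here is an assumption for the sketch.
raw = '{"collection": {"products": "products_v2", "all": "logs,metrics"}}'

def list_aliases(payload):
    """Return alias -> list-of-collections from an aliases.json dump."""
    data = json.loads(payload)
    return {alias: target.split(",")
            for alias, target in data.get("collection", {}).items()}

aliases = list_aliases(raw)
assert aliases["products"] == ["products_v2"]
assert aliases["all"] == ["logs", "metrics"]
```

Dumping this node on a schedule would give exactly the kind of backup discussed above.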


> The collection alias api should have a list cmd.
> 
>
> Key: SOLR-4968
> URL: https://issues.apache.org/jira/browse/SOLR-4968
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Reporter: Mark Miller
>Assignee: Mark Miller
> Fix For: 4.7
>
> Attachments: SOLR-4968.patch
>
>







[jira] [Updated] (LUCENE-5518) minor hunspell optimizations

2014-03-10 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-5518:


Attachment: LUCENE-5518.patch

Simple patch. I also added some stupid-simple optimizations like a null check 
before doing the prefixes/suffixes loops (many dictionaries, e.g., only have 
suffixes).

> minor hunspell optimizations
> 
>
> Key: LUCENE-5518
> URL: https://issues.apache.org/jira/browse/LUCENE-5518
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/analysis
>Reporter: Robert Muir
> Attachments: LUCENE-5518.patch
>
>
> After benchmarking indexing speed on SOLR-3245, I ran a profiler and a couple 
> things stood out.
> There are other things I want to improve too, but these almost double the 
> speed for many dictionaries.
> * Hunspell supports two-stage affix stripping, but the vast majority of 
> dictionaries don't have any affixes that support it. So we just add a boolean 
> (Dictionary.twoStageAffix) that is false until we see one.
> * We use java.util.regex.Pattern for condition checks. This is slow; I 
> switched to o.a.l.automaton, and it's much faster and uses slightly less RAM 
> too.






[jira] [Created] (LUCENE-5518) minor hunspell optimizations

2014-03-10 Thread Robert Muir (JIRA)
Robert Muir created LUCENE-5518:
---

 Summary: minor hunspell optimizations
 Key: LUCENE-5518
 URL: https://issues.apache.org/jira/browse/LUCENE-5518
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/analysis
Reporter: Robert Muir


After benchmarking indexing speed on SOLR-3245, I ran a profiler and a couple 
things stood out.

There are other things I want to improve too, but these almost double the speed 
for many dictionaries.

* Hunspell supports two-stage affix stripping, but the vast majority of 
dictionaries don't have any affixes that support it. So we just add a boolean 
(Dictionary.twoStageAffix) that is false until we see one.
* We use java.util.regex.Pattern for condition checks. This is slow; I switched 
to o.a.l.automaton, and it's much faster and uses slightly less RAM too.
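The move from per-check regex matching to a precompiled form can be illustrated with a toy condition compiler; this is a sketch of the idea (compile a Hunspell-style condition once into per-position character sets, then run it like a tiny automaton), not Lucene's automaton code:

```python
def compile_condition(cond):
    """Compile a Hunspell-style condition (e.g. "[^aeiou]y") into a
    list of (negate, charset) steps, one per position."""
    steps, i = [], 0
    while i < len(cond):
        if cond[i] == "[":
            j = cond.index("]", i)
            body = cond[i + 1:j]
            negate = body.startswith("^")
            steps.append((negate, set(body[1:] if negate else body)))
            i = j + 1
        else:
            steps.append((False, {cond[i]}))
            i += 1
    return steps

def matches_suffix(steps, word):
    """Run the compiled steps against the tail of `word`."""
    if len(word) < len(steps):
        return False
    tail = word[len(word) - len(steps):]
    return all((c not in cs) if neg else (c in cs)
               for (neg, cs), c in zip(steps, tail))

steps = compile_condition("[^aeiou]y")          # compiled once, reused per word
assert matches_suffix(steps, "happy")           # 'p' is not a vowel, ends in 'y'
assert not matches_suffix(steps, "say")         # 'a' is a vowel
```

The win is the same shape as in the patch: the expensive compilation happens once per affix rather than once per candidate word.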







[GitHub] lucene-solr pull request: Removal of Scorer.weight

2014-03-10 Thread shebiki
GitHub user shebiki opened a pull request:

https://github.com/apache/lucene-solr/pull/40

Removal of Scorer.weight

I've been playing with reducing the dependency between BooleanWeight, 
BooleanScorer, and BooleanScorer2. Instead of letting the scorers generate 
their coord factors, now BooleanWeight generates them in one place for all the 
scorers that it instantiates. After seeing how smoothly that went, I decided to 
try and push it forward a little by removing Scorer.weight and the need to pass 
in a weight (or null) to Scorer's constructor.

The tests green bar with `tests.disableHdfs=true` and `tests.slow=false`.

What do you guys think? I'm happy to adjust and port to trunk if you are 
interested.

--Terry
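The refactoring described, having the weight compute coord factors once and inject them rather than letting scorers reach back into their weight, can be sketched as follows (toy Python model; the class and method names are made up, not Lucene's API):

```python
class BooleanScorerSketch:
    """Toy scorer that receives precomputed coord factors instead of
    holding a reference back to the weight that created it."""
    def __init__(self, coords):
        self.coords = coords           # coords[n] = factor for n matching clauses

    def score(self, clause_scores):
        n = len(clause_scores)
        return sum(clause_scores) * self.coords[n]

class BooleanWeightSketch:
    """Weight computes the coord table once and injects it into every
    scorer it instantiates."""
    def __init__(self, num_clauses):
        self.coords = [n / num_clauses for n in range(num_clauses + 1)]

    def scorer(self):
        return BooleanScorerSketch(self.coords)

w = BooleanWeightSketch(2)
s = w.scorer()
assert s.score([1.0, 1.0]) == 2.0   # 2 matches of 2 clauses: coord 1.0
assert s.score([1.0]) == 0.5        # 1 match of 2 clauses: coord 0.5
```

With the table injected, the scorer has no remaining reason to expose or store its weight, which is what makes removing `Scorer.weight` possible.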


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shebiki/lucene-solr weight-scorer

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/40.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #40


commit db57c8032c717c06b1244b7e0a560d8a63f969d8
Author: Terry Smith 
Date:   2014-03-10T17:17:55Z

Makes BooleanWeight pass the coord factors into BooleanScorer and
BooleanScorer2. Both now just require a regular Weight instead of a
BooleanWeight.

commit 00740d6287b2f9b32f2c9c83ac6b61fb31f6aa78
Author: Terry Smith 
Date:   2014-03-11T01:16:29Z

Adds an explict ToParentBlockJoinQuery reference to the
ToParentBlockJoinQuery.BlockJoinScorer so ToParentBlockJoinCollector can
retrieve it directly instead of using Scorer.getWeight().

commit 07a41a91bb9b9c9196e6e8b9f0fd505856bc5d23
Author: Terry Smith 
Date:   2014-03-11T02:05:23Z

Removes last two references to Scorer.getWeight().

commit 5b99aafeec3ada0a1e644385d4c52901dc5153d4
Author: Terry Smith 
Date:   2014-03-11T02:31:59Z

Removes Scorer.weight and getWeight().




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---




[jira] [Resolved] (SOLR-3245) Poor performance of Hunspell with Polish Dictionary

2014-03-10 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved SOLR-3245.
---

   Resolution: Fixed
Fix Version/s: 5.0
   4.8

I've been fixing several bugs in this thing recently for the 4.8 release. I 
don't know what bug was happening here, but I am guessing it mostly involved 
correctness issues (LUCENE-5483) resulting in bad stems, too, which will cause 
crazy search results.

I compared performance of the 4.7 release with the current code in branch_4x 
(to be 4.8). For the corpus I used the first 10k news snippets from the polish 
corpus here: http://www.corpora.heliohost.org/

||Version||Indexing Speed (docs/second)||Number of tokens (sumTotalTermFreq)||RAM usage||
|4.7|71.1|635117|50.9MB|
|4.8|909.3|456499|2MB|

So I think the performance issues are fixed. As you can see, this Polish 
dictionary was definitely impacted by correctness issues, and the 
over-recursion no longer happens.

> Poor performance of Hunspell with Polish Dictionary
> ---
>
> Key: SOLR-3245
> URL: https://issues.apache.org/jira/browse/SOLR-3245
> Project: Solr
>  Issue Type: Bug
>  Components: Schema and Analysis
>Affects Versions: 4.0-ALPHA
> Environment: Centos 6.2, kernel 2.6.32, 2 physical CPU Xeon 5606 (4 
> cores each), 32 GB RAM, 2 SSD disks in RAID 0, java version 1.6.0_26, java 
> settings -server -Xms4096M -Xmx4096M 
>Reporter: Agnieszka
>  Labels: performance
> Fix For: 4.8, 5.0
>
> Attachments: pl_PL.zip
>
>
> In Solr 4.0, the Hunspell stemmer with a Polish dictionary has poor 
> performance, whereas the performance of the hunspell from 
> http://code.google.com/p/lucene-hunspell/ in Solr 3.4 is very good. 
> Tests show:
> Solr 3.4, full import of 489017 documents:
> StempelPolishStemFilterFactory - 2908 seconds, 168 docs/sec 
> HunspellStemFilterFactory - 3922 seconds, 125 docs/sec
> Solr 4.0, full import of 489017 documents:
> StempelPolishStemFilterFactory - 3016 seconds, 162 docs/sec 
> HunspellStemFilterFactory - 44580 seconds (more than 12 hours), 11 docs/sec
> My schema is quite simple. For Hunspell I have one text field that 14 text 
> fields are copied to:
> {code:xml}
> <!-- field and copyField definitions: the element markup was lost in the
>      mail archive; only multiValued="true" survives -->
> {code}
> The "text_pl_hunspell" configuration:
> {code:xml}
> <!-- reconstructed from attribute fragments in the mail archive; filter
>      classes are inferred from their attributes, tokenizer element lost -->
> <fieldType name="text_pl_hunspell" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>             words="dict/stopwords_pl.txt"
>             enablePositionIncrements="true"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.HunspellStemFilterFactory"
>             dictionary="dict/pl_PL.dic" affix="dict/pl_PL.aff" ignoreCase="true"/>
>   </analyzer>
>   <analyzer type="query">
>     <filter class="solr.SynonymFilterFactory"
>             synonyms="dict/synonyms_pl.txt" ignoreCase="true" expand="true"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>             words="dict/stopwords_pl.txt"
>             enablePositionIncrements="true"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.HunspellStemFilterFactory"
>             dictionary="dict/pl_PL.dic" affix="dict/pl_PL.aff" ignoreCase="true"/>
>     <filter class="solr.KeywordMarkerFilterFactory" protected="dict/protwords_pl.txt"/>
>   </analyzer>
> </fieldType>
> {code}
> I use the Polish dictionary files pl_PL.dic and pl_PL.aff 
> (stopwords_pl.txt, protwords_pl.txt, and synonyms_pl.txt are empty). These 
> are the same files I used in the 3.4 version. 
> For the Polish Stemmer, the difference is only in the definition of the text 
> field:
> {code}
> <!-- reconstructed from attribute fragments in the mail archive; the field
>      markup, fieldType name, and tokenizer were lost; filter classes are
>      inferred from their attributes -->
> <fieldType name="text_pl_stempel" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>             words="dict/stopwords_pl.txt"
>             enablePositionIncrements="true"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.StempelPolishStemFilterFactory"/>
>     <filter class="solr.KeywordMarkerFilterFactory" protected="dict/protwords_pl.txt"/>
>   </analyzer>
>   <analyzer type="query">
>     <filter class="solr.SynonymFilterFactory"
>             synonyms="dict/synonyms_pl.txt" ignoreCase="true" expand="true"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>             words="dict/stopwords_pl.txt"
>             enablePositionIncrements="true"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.StempelPolishStemFilterFactory"/>
>     <filter class="solr.KeywordMarkerFilterFactory" protected="dict/protwords_pl.txt"/>
>   </analyzer>
> </fieldType>
> {code}
> One document has 23 fields:
> - 14 text fields copied to one text field (above) that is only indexed
> - 8 other indexed fields (2 strings, 2 tdates, 3 tint, 1 tfloat). The size of 
> one document is 3-4 kB.






[jira] [Updated] (LUCENE-5205) [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to classic QueryParser

2014-03-10 Thread Tim Allison (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Allison updated LUCENE-5205:


Attachment: LUCENE-5205_improve_stop_word_handling.patch

[~nikhil500] and [~modassar], this patch (based on current lucene5205 branch) 
adds the proposed change.  "calculator for evaluating" is modified to 
"calculator evaluating"~>1 behind the scenes.  I got rid of the option to throw 
a parse exception when encountering a stop word.

I added extra tests in TestOverallSpanQueryParser and TestSpanOnlyParser. 

These mods look good?
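The stop-word rewrite described above ("calculator for evaluating" becoming "calculator evaluating"~>1) can be modeled as follows; a simplified sketch, assuming the slop simply equals the number of stop words removed:

```python
# Illustrative stop-word set; the real parser uses the analyzer's list.
STOPWORDS = {"for", "the", "of", "a"}

def phrase_to_near(tokens):
    """Drop stop words from a phrase query and compensate by emitting an
    ordered near query (~>) whose slop equals the words removed."""
    kept = [t for t in tokens if t not in STOPWORDS]
    slop = len(tokens) - len(kept)
    if slop == 0:
        return '"%s"' % " ".join(kept)
    return '"%s"~>%d' % (" ".join(kept), slop)

assert phrase_to_near(["calculator", "for", "evaluating"]) == '"calculator evaluating"~>1'
assert phrase_to_near(["jakarta", "apache"]) == '"jakarta apache"'
```

This matches the example behavior in the comment while silently tolerating stop words instead of throwing a parse exception.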

> [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to 
> classic QueryParser
> ---
>
> Key: LUCENE-5205
> URL: https://issues.apache.org/jira/browse/LUCENE-5205
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/queryparser
>Reporter: Tim Allison
>  Labels: patch
> Fix For: 4.7
>
> Attachments: LUCENE-5205-cleanup-tests.patch, 
> LUCENE-5205-date-pkg-prvt.patch, LUCENE-5205.patch.gz, LUCENE-5205.patch.gz, 
> LUCENE-5205_dateTestReInitPkgPrvt.patch, 
> LUCENE-5205_improve_stop_word_handling.patch, 
> LUCENE-5205_smallTestMods.patch, LUCENE_5205.patch, 
> SpanQueryParser_v1.patch.gz, patch.txt
>
>
> This parser extends QueryParserBase and includes functionality from:
> * Classic QueryParser: most of its syntax
> * SurroundQueryParser: recursive parsing for "near" and "not" clauses.
> * ComplexPhraseQueryParser: can handle "near" queries that include multiterms 
> (wildcard, fuzzy, regex, prefix),
> * AnalyzingQueryParser: has an option to analyze multiterms.
> At a high level, there's a first pass BooleanQuery/field parser and then a 
> span query parser handles all terminal nodes and phrases.
> Same as classic syntax:
> * term: test 
> * fuzzy: roam~0.8, roam~2
> * wildcard: te?t, test*, t*st
> * regex: /\[mb\]oat/
> * phrase: "jakarta apache"
> * phrase with slop: "jakarta apache"~3
> * default "or" clause: jakarta apache
> * grouping "or" clause: (jakarta apache)
> * boolean and +/-: (lucene OR apache) NOT jakarta; +lucene +apache -jakarta
> * multiple fields: title:lucene author:hatcher
>  
> Main additions in SpanQueryParser syntax vs. classic syntax:
> * Can require "in order" for phrases with slop with the \~> operator: 
> "jakarta apache"\~>3
> * Can specify "not near": "fever bieber"!\~3,10 ::
> find "fever" but not if "bieber" appears within 3 words before or 10 
> words after it.
> * Fully recursive phrasal queries with \[ and \]; as in: \[\[jakarta 
> apache\]~3 lucene\]\~>4 :: 
> find "jakarta" within 3 words of "apache", and that hit has to be within 
> four words before "lucene"
> * Can also use \[\] for single level phrasal queries instead of " as in: 
> \[jakarta apache\]
> * Can use "or grouping" clauses in phrasal queries: "apache (lucene solr)"\~3 
> :: find "apache" and then either "lucene" or "solr" within three words.
> * Can use multiterms in phrasal queries: "jakarta\~1 ap*che"\~2
> * Did I mention full recursion: \[\[jakarta\~1 ap*che\]\~2 (solr~ 
> /l\[ou\]\+\[cs\]\[en\]\+/)]\~10 :: Find something like "jakarta" within two 
> words of "ap*che" and that hit has to be within ten words of something like 
> "solr" or that "lucene" regex.
> * Can require at least x number of hits at the boolean level: apache AND 
> (lucene solr tika)~2
> * Can use negative only query: -jakarta :: Find all docs that don't contain 
> "jakarta"
> * Can use an edit distance > 2 for fuzzy query via SlowFuzzyQuery (beware of 
> potential performance issues!).
> Trivial additions:
> * Can specify prefix length in fuzzy queries: jakarta~1,2 (edit distance =1, 
> prefix =2)
> * Can specify Optimal String Alignment (OSA) vs. Levenshtein for distance 
> <=2: jakarta~1 (OSA) vs. jakarta~>1 (Levenshtein)
> This parser can be very useful for concordance tasks (see also LUCENE-5317 
> and LUCENE-5318) and for analytical search.  
> Until LUCENE-2878 is closed, this might have a use for fans of SpanQuery.
> Most of the documentation is in the javadoc for SpanQueryParser.
> Any and all feedback is welcome.  Thank you.






[jira] [Commented] (LUCENE-5487) Can we separate "top scorer" from "sub scorer"?

2014-03-10 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13929808#comment-13929808
 ] 

Robert Muir commented on LUCENE-5487:
-

+1

> Can we separate "top scorer" from "sub scorer"?
> ---
>
> Key: LUCENE-5487
> URL: https://issues.apache.org/jira/browse/LUCENE-5487
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 4.8, 5.0
>
> Attachments: LUCENE-5487.patch, LUCENE-5487.patch, LUCENE-5487.patch
>
>
> This is just an exploratory patch ... still many nocommits, but I
> think it may be promising.
> I find the two booleans we pass to Weight.scorer confusing, because
> they really only apply to whoever will call score(Collector) (just
> IndexSearcher and BooleanScorer).
> The params are pointless for the vast majority of scorers, because
> very, very few query scorers really need to change how top-scoring is
> done, and those scorers can *only* score top-level (they throw UOE
> from nextDoc/advance).  It seems like these two types of scorers
> should be separately typed.






[jira] [Updated] (SOLR-5290) Warming up using search logs.

2014-03-10 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-5290:
--

Assignee: (was: Mark Miller)

Not sure when I can look at this, so un-assigning for now.

> Warming up using search logs.
> -
>
> Key: SOLR-5290
> URL: https://issues.apache.org/jira/browse/SOLR-5290
> Project: Solr
>  Issue Type: Wish
>  Components: search
>Affects Versions: 4.4
>Reporter: Minoru Osuka
>Priority: Minor
> Attachments: SOLR-5290.patch
>
>
> It is possible to warm up the cache automatically in the newSearcher event, 
> but it is impossible to do so in the firstSearcher event because there is no 
> old searcher.
> We describe queries in solrconfig.xml if we need to warm the cache in the 
> firstSearcher event, like this:
> {code:xml}
> <listener event="firstSearcher" class="solr.QuerySenderListener">
>   <arr name="queries">
>     <lst>
>       <str name="q">static firstSearcher warming in solrconfig.xml</str>
>     </lst>
>   </arr>
> </listener>
> {code}
> This setting is very static. I want to issue queries dynamically in the 
> firstSearcher event when Solr restarts, so I turned my attention to past 
> search logs. If past search logs exist, it should be possible to warm the 
> cache automatically in the firstSearcher event, just like the autowarming of 
> the cache in the newSearcher event.
> I had created QueryLogSenderListener which extended QuerySenderListener.
> Sample definition in solrconfig.xml:
>  - directory : Specify the Solr log directory. (Required)
>  - regex : Describe the regular expression for parsing the log. (Required)
>  - encoding : Specify the Solr log encoding. (Default: UTF-8)
>  - count : Specify the number of log entries to process. (Default: 100)
>  - paths : Specify the request handler names to process.
>  - exclude_params : Specify the request parameters to exclude.
> {code:xml}
> <!-- reconstructed from fragments in the mail archive; element names are
>      inferred from the option list above, and the regex value was lost -->
> <listener event="firstSearcher" class="solr.QueryLogSenderListener">
>   <arr name="queries">
>     <lst>
>       <str name="q">static firstSearcher warming in solrconfig.xml</str>
>     </lst>
>   </arr>
>   <str name="directory">logs</str>
>   <str name="encoding">UTF-8</str>
>   <str name="regex"><!-- value lost in the archive --></str>
>   <arr name="paths">
>     <str>/select</str>
>   </arr>
>   <int name="count">100</int>
>   <arr name="exclude_params">
>     <str>indent</str>
>     <str>_</str>
>   </arr>
> </listener>
> {code}
> I'd like to propose this feature.






[jira] [Commented] (SOLR-4968) The collection alias api should have a list cmd.

2014-03-10 Thread Bryce Griner (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13929508#comment-13929508
 ] 

Bryce Griner commented on SOLR-4968:


Is there currently a workaround to getting this list of aliases?

Where do they get stored? If they only get stored in Zookeeper then how can 
they be backed up in case of a Zookeeper failure where all the config needs to 
be reloaded back into Zookeeper? Would I just have to recreate all of the 
aliases?

> The collection alias api should have a list cmd.
> 
>
> Key: SOLR-4968
> URL: https://issues.apache.org/jira/browse/SOLR-4968
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Reporter: Mark Miller
>Assignee: Mark Miller
> Fix For: 4.7
>
> Attachments: SOLR-4968.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [jira] [Commented] (SOLR-5762) SOLR-5658 broke backward compatibility of Javabin format

2014-03-10 Thread Erick Erickson
Hmmm, interesting. Without thinking about it much,
I like the idea of having the ability to know when
something unexpected happens. I wonder how
many developer-hours have been wasted tracking
down rq=blah rather than fq=blah...

Now if only there were a check that would catch it
when I put in df rather than qf for edismax...

Erick
Erick

On Mon, Mar 10, 2014 at 2:10 PM, Shawn Heisey  wrote:
> On 3/10/2014 7:20 AM, Erick Erickson wrote:
>> Hmmm, scanning just Noble's comment it's even worse since we have custom
>> components that may define their own params that other components know
>> nothing about (and can't).
>>
>> But I'm glancing at this out of context so may be off in the weeds.
>
> Noble's comment is in response to something I said that came out of left
> field.  It probably shouldn't have been mentioned there, especially on a
> closed issue.
>
> Specifically, I have been assuming for a while now (because of some past
> precedents) that Solr will be moving towards hard failures with any
> unknown parameter.
>
> It sounds like that's not going to actually happen, at least not anytime
> soon.  In order to even be possible in the context of custom components,
> the check would have to happen after all components have consumed their
> options, and the custom code would need to *remove* options from the
> NamedList rather than just read them.
>
> -
>
> It does bring up an improvement idea though - a requestHandler
> configuration option for the handling of unknown parameters.  I
> personally would like to know when our code is using unknown parameters
> so that the code can be updated.
>
> There would be three possible settings.  Setting #1 is just as it is now
> - unknown parameters are completely ignored.  Setting #2 would log a
> warning and process the request.  Setting #3 would fail the request with
> some 4xx HTTP error.
>
> I'm sure there would be a lot of debate about this new option,
> especially on the subject of when/if we need to change the default from
> setting #1.
>
> Thanks,
> Shawn
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
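Shawn's three proposed settings can be sketched as follows (toy Python model; the names and the removal-based detection are made up for illustration, not Solr's API):

```python
import logging

# The three proposed per-requestHandler policies for unknown parameters.
IGNORE, WARN, FAIL = range(3)

class UnknownParamError(Exception):
    """Stand-in for the 4xx HTTP error in setting #3."""

def check_params(params, known, mode=IGNORE):
    """Apply one of the three policies to any parameters left unconsumed.

    In the scheme described above, `known` would be whatever components
    actually removed from the NamedList, so custom components are covered."""
    unknown = set(params) - set(known)
    if not unknown:
        return
    if mode == WARN:
        logging.warning("unknown parameters: %s", sorted(unknown))
    elif mode == FAIL:
        raise UnknownParamError(sorted(unknown))

known = {"q", "fq", "rows"}
check_params({"q": "*:*", "rq": "x"}, known, IGNORE)   # setting #1: silently accepted
try:
    check_params({"q": "*:*", "rq": "x"}, known, FAIL)  # setting #3: request fails
    assert False
except UnknownParamError as e:
    assert e.args[0] == ["rq"]
```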




[JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.7.0_51) - Build # 9747 - Failure!

2014-03-10 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/9747/
Java: 32bit/jdk1.7.0_51 -client -XX:+UseConcMarkSweepGC

1 tests failed.
REGRESSION:  
org.apache.solr.client.solrj.impl.CloudSolrServerTest.testDistribSearch

Error Message:
java.util.concurrent.TimeoutException: Could not connect to ZooKeeper 
127.0.0.1:41854 within 45000 ms

Stack Trace:
org.apache.solr.common.SolrException: java.util.concurrent.TimeoutException: 
Could not connect to ZooKeeper 127.0.0.1:41854 within 45000 ms
at 
__randomizedtesting.SeedInfo.seed([21E80B1FE6974556:A00E850791C8256A]:0)
at 
org.apache.solr.common.cloud.SolrZkClient.(SolrZkClient.java:150)
at 
org.apache.solr.common.cloud.SolrZkClient.(SolrZkClient.java:101)
at 
org.apache.solr.common.cloud.SolrZkClient.(SolrZkClient.java:91)
at 
org.apache.solr.cloud.AbstractZkTestCase.buildZooKeeper(AbstractZkTestCase.java:89)
at 
org.apache.solr.cloud.AbstractZkTestCase.buildZooKeeper(AbstractZkTestCase.java:83)
at 
org.apache.solr.cloud.AbstractDistribZkTestBase.setUp(AbstractDistribZkTestBase.java:70)
at 
org.apache.solr.cloud.AbstractFullDistribZkTestBase.setUp(AbstractFullDistribZkTestBase.java:200)
at 
org.apache.solr.client.solrj.impl.CloudSolrServerTest.setUp(CloudSolrServerTest.java:78)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:771)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.T

Re: [VOTE] Move to Java 7 in Lucene/Solr 4.8, use Java 8 in trunk (once officially released)

2014-03-10 Thread Anshum Gupta
>
> [.] Move Lucene/Solr 4.8 (means branch_4x) to Java 7 and backport all Java
> 7-related issues (FileChannel improvements, diamond operator,...).
>

+1 totally!


> [.] Move Lucene/Solr trunk to Java 8 and allow closures in source code.
> This would make some APIs much nicer. Our infrastructure mostly supports
> this, only ECJ Javadoc linting is not yet possible, but forbidden-apis
> supports Java 8 with all its crazy new stuff.
>

-1 (right now)

-- 

Anshum Gupta
http://www.anshumgupta.net


[jira] [Commented] (SOLR-5827) Add boosting functionality to MoreLikeThisHandler

2014-03-10 Thread Upayavira (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13926345#comment-13926345
 ] 

Upayavira commented on SOLR-5827:
-

We could add a new method that takes a SolrQueryRequest and leave the old one 
behind. If we added a null check on req, the existing method would still work, 
but it wouldn't be able to use function queries when parsing queries (as is the 
case now). In that case, it could just throw an exception saying that the 
request must be passed in for that use case.
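
The overload-with-null-check pattern described above can be illustrated with a 
small self-contained sketch. All names here (MltQueryParser, 
SolrQueryRequestStub, parse) are hypothetical stand-ins, not the actual 
MoreLikeThisHandler API:

```java
// Hypothetical sketch of the back-compat overload described above; the class
// and method names are illustrative, not the real MoreLikeThisHandler API.
class SolrQueryRequestStub {}

class MltQueryParser {
    // Existing signature, kept for back-compat: behaves exactly as before.
    static String parse(String q) {
        return parse(q, null);
    }

    // New signature: having the request enables function-query parsing.
    static String parse(String q, SolrQueryRequestStub req) {
        boolean needsFunctionQuery = q.contains("{!func}");
        if (req == null && needsFunctionQuery) {
            // Without a request we cannot parse function queries.
            throw new IllegalStateException(
                "a request must be passed in to parse function queries");
        }
        return (req == null ? "plain:" : "func-capable:") + q;
    }
}
```

Old callers keep working unchanged; only callers that actually need function 
queries have to migrate to the new signature.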


> Add boosting functionality to MoreLikeThisHandler
> -
>
> Key: SOLR-5827
> URL: https://issues.apache.org/jira/browse/SOLR-5827
> Project: Solr
>  Issue Type: Improvement
>  Components: MoreLikeThis
>Reporter: Upayavira
>Assignee: Tommaso Teofili
> Fix For: 4.8
>
> Attachments: SOLR-5827.patch, SOLR-5827.patch
>
>
> The MoreLikeThisHandler facilitates the creation of a very simple yet 
> powerful recommendation engine. 
> It is possible to constrain the result set using filter queries. However, it 
> isn't possible to influence the scoring using function queries. Adding 
> function query boosting would allow for including such things as recency in 
> the relevancy calculations.
> Unfortunately, the boost= parameter is already in use, meaning we cannot 
> replicate the edismax boost/bf for additive/multiplicative boostings.
> My patch only touches the MoreLikeThisHandler, so the only really contentious 
> thing is to decide the parameters to configure it.
> I have a prototype working, and will upload a patch shortly. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



RE: svn commit: r1576096 - in /lucene/dev/branches/lucene5487/lucene: core/src/java/org/apache/lucene/search/ core/src/test/org/apache/lucene/search/ facet/src/java/org/apache/lucene/facet/ facet/src/

2014-03-10 Thread Uwe Schindler
Hi Mike,

Would it not be better to have only one FakeScorer implementation in some 
pkg-private class? This is too much code duplication for me!

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


> -Original Message-
> From: mikemcc...@apache.org [mailto:mikemcc...@apache.org]
> Sent: Monday, March 10, 2014 10:42 PM
> To: comm...@lucene.apache.org
> Subject: svn commit: r1576096 - in
> /lucene/dev/branches/lucene5487/lucene:
> core/src/java/org/apache/lucene/search/
> core/src/test/org/apache/lucene/search/
> facet/src/java/org/apache/lucene/facet/
> facet/src/java/org/apache/lucene/facet/taxonomy/ grouping/src/java...
> 
> Author: mikemccand
> Date: Mon Mar 10 21:41:44 2014
> New Revision: 1576096
> 
> URL: http://svn.apache.org/r1576096
> Log:
> LUCENE-5487: throw OUE from FakeScorer.getWeight
> 
> Modified:
> 
> lucene/dev/branches/lucene5487/lucene/core/src/java/org/apache/lucene/
> search/BooleanScorer.java
> 
> lucene/dev/branches/lucene5487/lucene/core/src/java/org/apache/lucene/
> search/IndexSearcher.java
> 
> lucene/dev/branches/lucene5487/lucene/core/src/test/org/apache/lucene/
> search/TestBooleanScorer.java
> 
> lucene/dev/branches/lucene5487/lucene/facet/src/java/org/apache/lucene
> /facet/DrillSidewaysScorer.java
> 
> lucene/dev/branches/lucene5487/lucene/facet/src/java/org/apache/lucene
> /facet/taxonomy/TaxonomyFacetSumValueSource.java
> 
> lucene/dev/branches/lucene5487/lucene/grouping/src/java/org/apache/luc
> ene/search/grouping/BlockGroupingCollector.java
> 
> lucene/dev/branches/lucene5487/lucene/join/src/java/org/apache/lucene/
> search/join/TermsIncludingScoreQuery.java
> 
> lucene/dev/branches/lucene5487/lucene/join/src/java/org/apache/lucene/
> search/join/ToParentBlockJoinCollector.java
> 
> Modified:
> lucene/dev/branches/lucene5487/lucene/core/src/java/org/apache/lucene/
> search/BooleanScorer.java
> URL:
> http://svn.apache.org/viewvc/lucene/dev/branches/lucene5487/lucene/cor
> e/src/java/org/apache/lucene/search/BooleanScorer.java?rev=1576096&r1=
> 1576095&r2=1576096&view=diff
> ==
> 
> ---
> lucene/dev/branches/lucene5487/lucene/core/src/java/org/apache/lucene/
> search/BooleanScorer.java (original)
> +++
> lucene/dev/branches/lucene5487/lucene/core/src/java/org/apache/lucen
> +++ e/search/BooleanScorer.java Mon Mar 10 21:41:44 2014
> @@ -153,6 +153,11 @@ final class BooleanScorer extends BulkSc
>  public long cost() {
>throw new UnsupportedOperationException();
>  }
> +
> +@Override
> +public Weight getWeight() {
> +  throw new UnsupportedOperationException();
> +}
>}
> 
>static final class Bucket {
> 
> Modified:
> lucene/dev/branches/lucene5487/lucene/core/src/java/org/apache/lucene/
> search/IndexSearcher.java
> URL:
> http://svn.apache.org/viewvc/lucene/dev/branches/lucene5487/lucene/cor
> e/src/java/org/apache/lucene/search/IndexSearcher.java?rev=1576096&r1=
> 1576095&r2=1576096&view=diff
> ==
> 
> ---
> lucene/dev/branches/lucene5487/lucene/core/src/java/org/apache/lucene/
> search/IndexSearcher.java (original)
> +++
> lucene/dev/branches/lucene5487/lucene/core/src/java/org/apache/lucen
> +++ e/search/IndexSearcher.java Mon Mar 10 21:41:44 2014
> @@ -805,6 +805,11 @@ public class IndexSearcher {
>public long cost() {
>  return 1;
>}
> +
> +  @Override
> +  public Weight getWeight() {
> +throw new UnsupportedOperationException();
> +  }
>  }
> 
>  private final FakeScorer fakeScorer = new FakeScorer();
> 
> Modified:
> lucene/dev/branches/lucene5487/lucene/core/src/test/org/apache/lucene/
> search/TestBooleanScorer.java
> URL:
> http://svn.apache.org/viewvc/lucene/dev/branches/lucene5487/lucene/cor
> e/src/test/org/apache/lucene/search/TestBooleanScorer.java?rev=1576096
> &r1=1576095&r2=1576096&view=diff
> ==
> 
> ---
> lucene/dev/branches/lucene5487/lucene/core/src/test/org/apache/lucene/
> search/TestBooleanScorer.java (original)
> +++
> lucene/dev/branches/lucene5487/lucene/core/src/test/org/apache/lucen
> +++ e/search/TestBooleanScorer.java Mon Mar 10 21:41:44 2014
> @@ -240,6 +240,11 @@ public class TestBooleanScorer extends L
>  public long cost() {
>throw new UnsupportedOperationException();
>  }
> +
> +@Override
> +public Weight getWeight() {
> +  throw new UnsupportedOperationException();
> +}
>}
> 
>/** Throws UOE if Weight.scorer is called */
> 
> Modified:
> lucene/dev/branches/lucene5487/lucene/facet/src/java/org/apache/lucene
> /facet/DrillSidewaysScorer.java
> URL:
> http://svn.apache.org/viewvc/lucene/dev/branches/lucene5487/lucene/fac
> et/src/java/org/apache/lucene/facet/DrillSidewaysScorer.java?rev=1576096
> &r1=1576095&r2=15760

[jira] [Commented] (LUCENE-5495) Boolean Filter does not handle FilterClauses with only bits() implemented

2014-03-10 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13926281#comment-13926281
 ] 

Uwe Schindler commented on LUCENE-5495:
---

bq. In FilteredQuery, depending on the FilterStrategy, iterator() is 
checked/called after bits(). I think if bits() is optional and iterator() is 
not, then checking bits() first actually does make sense; otherwise, since the 
iterator() impl is mandatory, the bits() impl would be ignored. Maybe I am 
missing something...

- RandomAccessFilterStrategy first calls iterator() and cancels collection if 
the iterator is null. Then it asks for bits() and uses them, if available; 
otherwise it falls back to the good old leap-frog approach. 
RandomAccessFilterStrategy only chooses random access if the 
useRandomAccess() method returns true. In general it only does this if the 
iterator is not too sparse (it checks the first filter doc). This can be 
configured by the user with other heuristics (e.g. the cost() function).
- LeapFrogFilterStrategy always uses iterator() (its scorer is also the 
fallback for all other cases, where bits() returns null).
- QueryFirstFilterStrategy first calls bits(), but if those are null it falls 
back to the iterator().

The default is RandomAccessFilterStrategy.

bq. This patch can be thought of as fixing a shortcoming in BooleanFilter. Are 
we discouraging use of BooleanFilter?

I just complained about the algo. bits() must be purely optional. If it 
returns null, you *must* also check the iterator(). If the iterator() also 
returns null, no documents match.

But your patch should in no case try to "emulate" the iterator via 
BitsDocIdSetIterator! iterator() is mandatory and is used as the fallback if 
bits() returns null, but definitely not the other way round. iterator() has to 
be implemented, otherwise it's not a valid filter.
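
That contract — bits() purely optional, iterator() the mandatory fallback, a 
null iterator() meaning no matches — can be sketched with self-contained 
stand-in interfaces (BitsStub, DocIdSetStub), not the real Lucene DocIdSet API:

```java
import java.util.Arrays;
import java.util.PrimitiveIterator;

// Stand-ins for Lucene's Bits/DocIdSet; this models the contract only.
interface BitsStub { boolean get(int doc); }

interface DocIdSetStub {
    BitsStub bits();                      // optional random access, may be null
    PrimitiveIterator.OfInt iterator();   // mandatory fallback; null => no docs
}

class FilterContract {
    // Consumer-side logic: prefer bits(), fall back to iterator(),
    // and treat a null iterator() as "no documents match".
    static boolean matches(DocIdSetStub set, int doc) {
        BitsStub bits = set.bits();
        if (bits != null) {
            return bits.get(doc);          // fast random access
        }
        PrimitiveIterator.OfInt it = set.iterator();
        if (it == null) {
            return false;                  // no documents match at all
        }
        while (it.hasNext()) {
            if (it.nextInt() == doc) return true;
        }
        return false;
    }

    // An iterator-only set over the given doc ids (bits() returns null).
    static DocIdSetStub iteratorOnly(int... docs) {
        return new DocIdSetStub() {
            public BitsStub bits() { return null; }
            public PrimitiveIterator.OfInt iterator() {
                return Arrays.stream(docs).iterator();
            }
        };
    }
}
```

The key point is the order of the null checks: bits() being null is normal and 
routes to the iterator; only a null iterator() short-circuits the whole filter.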

The UOE thrown by the facet filters is intentional, because those should never 
ever be used as a filter in queries. Because of the way FilteredQuery or 
ConstantScoreQuery works, the user will get the UOE. I know this is a hack, 
but [~mikemccand] did this intentionally (earlier versions used the same 
emulating scorer you added in your patch, but that made users use those 
filters and slowed down their queries).

In my opinion, if we want to improve this, it should use a strategy like 
FilteredQuery, too. We could maybe improve depending on cases like:
- all filters are AND'ed together and some support random access -> use the 
same approach as FilteredQuery and pass down the bits of those filters' 
output as the acceptDocs input for the next filter
- some filters support random access, a few of them only iterators. In that 
case the filter with the iterator could drive the query and use the bits() of 
the other filters to filter out some documents. The result can be handled as a 
FixedBitSet, but improved so that bits() is used partially (if available).
- many more strategies...

But on the other hand, as Mike says: Filters should be Queries without a score 
and be handled by BooleanQuery; BooleanFilter will go away then (or be subsumed 
inside some BooleanQuery optimizations chosen automatically depending on cost 
factors, ...). We have a GSoC issue open. This is one of the big priorities: 
fixing the broken and unintuitive APIs around Query+Filter.

> Boolean Filter does not handle FilterClauses with only bits() implemented
> -
>
> Key: LUCENE-5495
> URL: https://issues.apache.org/jira/browse/LUCENE-5495
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 4.6.1
>Reporter: John Wang
> Attachments: LUCENE-5495.patch, LUCENE-5495.patch
>
>
> Some Filter implementations produce DocIdSets without the iterator() 
> implementation, such as o.a.l.facet.range.Range.getFilter().
> Currently, such filters cannot be added to a BooleanFilter because 
> BooleanFilter expects all FilterClauses with Filters that have iterator() 
> implemented.
> This patch improves the behavior by taking Filters with bits() implemented 
> and treating them separately.
> This behavior would be faster in the case for Filters with a forward index as 
> the underlying data structure, where there would be no need to scan the index 
> to build an iterator.
> See attached unit test, which fails without this patch.






[jira] [Commented] (LUCENE-5487) Can we separate "top scorer" from "sub scorer"?

2014-03-10 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13926271#comment-13926271
 ] 

ASF subversion and git services commented on LUCENE-5487:
-

Commit 1576096 from [~mikemccand] in branch 'dev/branches/lucene5487'
[ https://svn.apache.org/r1576096 ]

LUCENE-5487: throw OUE from FakeScorer.getWeight

> Can we separate "top scorer" from "sub scorer"?
> ---
>
> Key: LUCENE-5487
> URL: https://issues.apache.org/jira/browse/LUCENE-5487
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 4.8, 5.0
>
> Attachments: LUCENE-5487.patch, LUCENE-5487.patch, LUCENE-5487.patch
>
>
> This is just an exploratory patch ... still many nocommits, but I
> think it may be promising.
> I find the two booleans we pass to Weight.scorer confusing, because
> they really only apply to whoever will call score(Collector) (just
> IndexSearcher and BooleanScorer).
> The params are pointless for the vast majority of scorers, because
> very, very few query scorers really need to change how top-scoring is
> done, and those scorers can *only* score top-level (they throw UOE
> from nextDoc/advance).  It seems like these two types of scorers
> should be separately typed.






[jira] [Commented] (LUCENE-5495) Boolean Filter does not handle FilterClauses with only bits() implemented

2014-03-10 Thread John Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13926222#comment-13926222
 ] 

John Wang commented on LUCENE-5495:
---

Thanks Uwe for the feedback.

In FilteredQuery, depending on the FilterStrategy, iterator() is checked/called 
after bits(). I think if bits() is optional and iterator() is not, then 
checking bits() first actually does make sense; otherwise, since the iterator() 
impl is mandatory, the bits() impl would be ignored. Maybe I am missing 
something...

This patch can be thought of as fixing a shortcoming in BooleanFilter.

Are we discouraging use of BooleanFilter?

Thanks for the suggestion with FilteredQuery, that makes sense!


> Boolean Filter does not handle FilterClauses with only bits() implemented
> -
>
> Key: LUCENE-5495
> URL: https://issues.apache.org/jira/browse/LUCENE-5495
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 4.6.1
>Reporter: John Wang
> Attachments: LUCENE-5495.patch, LUCENE-5495.patch
>
>
> Some Filter implementations produce DocIdSets without the iterator() 
> implementation, such as o.a.l.facet.range.Range.getFilter().
> Currently, such filters cannot be added to a BooleanFilter because 
> BooleanFilter expects all FilterClauses with Filters that have iterator() 
> implemented.
> This patch improves the behavior by taking Filters with bits() implemented 
> and treating them separately.
> This behavior would be faster in the case for Filters with a forward index as 
> the underlying data structure, where there would be no need to scan the index 
> to build an iterator.
> See attached unit test, which fails without this patch.






[jira] [Updated] (SOLR-5768) Add a distrib.singlePass parameter to make EXECUTE_QUERY phase fetch all fields and skip GET_FIELDS

2014-03-10 Thread Gregg Donovan (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gregg Donovan updated SOLR-5768:


Attachment: (was: SOLR-5768.diff)

> Add a distrib.singlePass parameter to make EXECUTE_QUERY phase fetch all 
> fields and skip GET_FIELDS
> ---
>
> Key: SOLR-5768
> URL: https://issues.apache.org/jira/browse/SOLR-5768
> Project: Solr
>  Issue Type: Improvement
>Reporter: Shalin Shekhar Mangar
>Priority: Minor
> Fix For: 4.8, 5.0
>
> Attachments: SOLR-5768.diff
>
>
> Suggested by Yonik on solr-user:
> http://www.mail-archive.com/solr-user@lucene.apache.org/msg95045.html
> {quote}
> Although it seems like it should be relatively simple to make it work
> with other fields as well, by passing down the complete "fl" requested
> if some optional parameter is set (distrib.singlePass?)
> {quote}






[jira] [Commented] (SOLR-5768) Add a distrib.singlePass parameter to make EXECUTE_QUERY phase fetch all fields and skip GET_FIELDS

2014-03-10 Thread Gregg Donovan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13926170#comment-13926170
 ] 

Gregg Donovan commented on SOLR-5768:
-

Thanks, Shalin! You're right -- it's not quite so easy. Here's an updated patch 
with a test that returns multiple fields in a single pass. I'm not sure if it's 
better to work with the fields as represented by rb.rsp.getReturnFields() or by 
rb.req.getParams().get(CommonParams.FL).

> Add a distrib.singlePass parameter to make EXECUTE_QUERY phase fetch all 
> fields and skip GET_FIELDS
> ---
>
> Key: SOLR-5768
> URL: https://issues.apache.org/jira/browse/SOLR-5768
> Project: Solr
>  Issue Type: Improvement
>Reporter: Shalin Shekhar Mangar
>Priority: Minor
> Fix For: 4.8, 5.0
>
> Attachments: SOLR-5768.diff
>
>
> Suggested by Yonik on solr-user:
> http://www.mail-archive.com/solr-user@lucene.apache.org/msg95045.html
> {quote}
> Although it seems like it should be relatively simple to make it work
> with other fields as well, by passing down the complete "fl" requested
> if some optional parameter is set (distrib.singlePass?)
> {quote}






[jira] [Updated] (SOLR-5768) Add a distrib.singlePass parameter to make EXECUTE_QUERY phase fetch all fields and skip GET_FIELDS

2014-03-10 Thread Gregg Donovan (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gregg Donovan updated SOLR-5768:


Attachment: SOLR-5768.diff

Updated patch.

> Add a distrib.singlePass parameter to make EXECUTE_QUERY phase fetch all 
> fields and skip GET_FIELDS
> ---
>
> Key: SOLR-5768
> URL: https://issues.apache.org/jira/browse/SOLR-5768
> Project: Solr
>  Issue Type: Improvement
>Reporter: Shalin Shekhar Mangar
>Priority: Minor
> Fix For: 4.8, 5.0
>
> Attachments: SOLR-5768.diff
>
>
> Suggested by Yonik on solr-user:
> http://www.mail-archive.com/solr-user@lucene.apache.org/msg95045.html
> {quote}
> Although it seems like it should be relatively simple to make it work
> with other fields as well, by passing down the complete "fl" requested
> if some optional parameter is set (distrib.singlePass?)
> {quote}






[jira] [Commented] (LUCENE-5495) Boolean Filter does not handle FilterClauses with only bits() implemented

2014-03-10 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13926140#comment-13926140
 ] 

Uwe Schindler commented on LUCENE-5495:
---

By the way: one trick that may improve speed if you want chained AND filters: 
there is no need to use BooleanFilter here, just chain FilteredQuery. This 
will automatically (depending on the filter strategy) do exactly what you want:
- use {{new FilteredQuery(new FilteredQuery(query, filter1), filter2)}}
- set a filter strategy according to your needs. If bits() has *fast* 
random access, use the default one. In that case, the resulting bits() instance 
of {{filter1}} will be passed as {{acceptDocs}} to {{filter2.getDocIdSet()}} 
and the result of this will be passed as {{acceptDocs}} to the query's 
{{scorer()}}.

But all this depends on your caching needs and the performance of your random 
access impls.
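
The acceptDocs hand-off can be modeled with java.util.BitSet standing in for 
each filter's bits() output; this sketches the data flow only, not the actual 
FilteredQuery API:

```java
import java.util.BitSet;

// Models the data flow of new FilteredQuery(new FilteredQuery(query, f1), f2):
// filter1's bits() output becomes the acceptDocs input of filter2, and
// filter2's output is the acceptDocs the query's scorer finally sees.
class AcceptDocsChain {
    // Each stage keeps only docs accepted so far that also match this filter,
    // so docs rejected early are never tested again downstream.
    static BitSet applyFilter(BitSet acceptDocs, BitSet filterBits) {
        BitSet out = (BitSet) acceptDocs.clone();
        out.and(filterBits);
        return out;
    }

    // Initial acceptDocs: every doc in [0, maxDoc) is live.
    static BitSet allDocs(int maxDoc) {
        BitSet b = new BitSet(maxDoc);
        b.set(0, maxDoc);
        return b;
    }
}
```

Whether this chaining actually wins depends, as noted above, on how fast each 
filter's random access really is.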

> Boolean Filter does not handle FilterClauses with only bits() implemented
> -
>
> Key: LUCENE-5495
> URL: https://issues.apache.org/jira/browse/LUCENE-5495
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 4.6.1
>Reporter: John Wang
> Attachments: LUCENE-5495.patch, LUCENE-5495.patch
>
>
> Some Filter implementations produce DocIdSets without the iterator() 
> implementation, such as o.a.l.facet.range.Range.getFilter().
> Currently, such filters cannot be added to a BooleanFilter because 
> BooleanFilter expects all FilterClauses with Filters that have iterator() 
> implemented.
> This patch improves the behavior by taking Filters with bits() implemented 
> and treating them separately.
> This behavior would be faster in the case for Filters with a forward index as 
> the underlying data structure, where there would be no need to scan the index 
> to build an iterator.
> See attached unit test, which fails without this patch.






[jira] [Comment Edited] (LUCENE-5495) Boolean Filter does not handle FilterClauses with only bits() implemented

2014-03-10 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13926140#comment-13926140
 ] 

Uwe Schindler edited comment on LUCENE-5495 at 3/10/14 8:07 PM:


By the way: one trick that may improve speed if you want chained AND filters: 
there is no need to use BooleanFilter here, just chain FilteredQuery. This 
will automatically (depending on the filter strategy) do exactly what you want:
- use {{new FilteredQuery(new FilteredQuery(query, filter2), filter1)}}
- set a filter strategy according to your needs. If bits() has *fast* 
random access, use the default one. In that case, the resulting bits() instance 
of {{filter1}} will be passed as {{acceptDocs}} to {{filter2.getDocIdSet()}} 
and the result of this will be passed as {{acceptDocs}} to the query's 
{{scorer()}}.

But all this depends on your caching needs and the performance of your random 
access impls.


was (Author: thetaphi):
By the way: One trick to maybe improve speed if you want chanined AND-Filters. 
There is no need to use BooleanFilter here, just chain FilteredQuery. This will 
automatically (depending on filterstrategy) do exactly what you want:
- use {{new FilteredQuery(new FilteredQuery(query, filter1), filter2)}}
- set a filter strategy according to your needs. If bits() has *faast* 
random access, use the default one. In that case, the resulting bits() instance 
of {{filter1}} will passed as {{acceptDocs}} to {{filter2.getDocIdSet()}} and 
the result of this will be passed as {{acceptDocs}} to query's {{scorer()}}.

But all this depends on your caching needs and performance of your random 
access impls.

> Boolean Filter does not handle FilterClauses with only bits() implemented
> -
>
> Key: LUCENE-5495
> URL: https://issues.apache.org/jira/browse/LUCENE-5495
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 4.6.1
>Reporter: John Wang
> Attachments: LUCENE-5495.patch, LUCENE-5495.patch
>
>
> Some Filter implementations produce DocIdSets without the iterator() 
> implementation, such as o.a.l.facet.range.Range.getFilter().
> Currently, such filters cannot be added to a BooleanFilter because 
> BooleanFilter expects all FilterClauses with Filters that have iterator() 
> implemented.
> This patch improves the behavior by taking Filters with bits() implemented 
> and treating them separately.
> This behavior would be faster in the case for Filters with a forward index as 
> the underlying data structure, where there would be no need to scan the index 
> to build an iterator.
> See attached unit test, which fails without this patch.






[jira] [Updated] (LUCENE-5501) Out-of-order collection testing

2014-03-10 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-5501:
-

Attachment: LUCENE-5501-2.patch

Thinking about it again, I realized it would be possible to make this better by 
wrapping the collector. This way out-of-order scoring can be tested against a 
larger variety of scorers, such as ConstantScoreQuery's scorer (which is 
currently untested because it overrides {{score(Collector)}}).

> Out-of-order collection testing
> ---
>
> Key: LUCENE-5501
> URL: https://issues.apache.org/jira/browse/LUCENE-5501
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Assignee: Adrien Grand
> Fix For: 4.8
>
> Attachments: LUCENE-5501-2.patch, LUCENE-5501.patch
>
>
> Collectors have the ability to declare whether or not they support 
> out-of-order collection, but since most scorers score in order this is not 
> well tested.






[jira] [Commented] (LUCENE-5487) Can we separate "top scorer" from "sub scorer"?

2014-03-10 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13926121#comment-13926121
 ] 

ASF subversion and git services commented on LUCENE-5487:
-

Commit 1576066 from [~mikemccand] in branch 'dev/branches/lucene5487'
[ https://svn.apache.org/r1576066 ]

LUCENE-5487: add feedback from Rob

> Can we separate "top scorer" from "sub scorer"?
> ---
>
> Key: LUCENE-5487
> URL: https://issues.apache.org/jira/browse/LUCENE-5487
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 4.8, 5.0
>
> Attachments: LUCENE-5487.patch, LUCENE-5487.patch, LUCENE-5487.patch
>
>
> This is just an exploratory patch ... still many nocommits, but I
> think it may be promising.
> I find the two booleans we pass to Weight.scorer confusing, because
> they really only apply to whoever will call score(Collector) (just
> IndexSearcher and BooleanScorer).
> The params are pointless for the vast majority of scorers, because
> very, very few query scorers really need to change how top-scoring is
> done, and those scorers can *only* score top-level (they throw UOE
> from nextDoc/advance).  It seems like these two types of scorers
> should be separately typed.






[jira] [Comment Edited] (LUCENE-5495) Boolean Filter does not handle FilterClauses with only bits() implemented

2014-03-10 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13926111#comment-13926111
 ] 

Uwe Schindler edited comment on LUCENE-5495 at 3/10/14 7:49 PM:


John,
I was about to improve the documentation. The problem is as you explain:
- bits() is documented as optional
- iterator() does not mention optional, so it is a requirement

And the code in FilteredQuery behaves like this:
- Get iterator(); if the iterator returns null, no documents match -> exit
- The code then tries to get bits(); if this returns something != null and some 
other conditions apply (depending on the FilterStrategy), it switches to 
passing the bits() down low (i.e. as liveDocs to the query)
- otherwise it uses the iterator to leap-frog (or similar) with the query.

ConstantScoreQuery always uses the iterator (because it is a query that needs 
a scorer, which is a subclass of DocIdSetIterator). bits() is never checked.

BooleanFilter works exactly like that. It additionally also transforms 
iterator-only filters into bits() filters, because it uses a FixedBitSet to 
cache.

In addition, BooleanFilter uses FixedBitSet.or(iterator), which shortcuts the 
case where the iterator is derived from another FixedBitSet.
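
The caching step described here (OR-ing each clause into one bit set, which 
also turns iterator-only filters into random-access bits) can be sketched with 
java.util.BitSet in place of FixedBitSet; all names are illustrative:

```java
import java.util.BitSet;
import java.util.PrimitiveIterator;
import java.util.stream.IntStream;

// Sketch of BooleanFilter-style accumulation: OR each clause's doc ids into
// one cached bit set (the analogue of FixedBitSet.or(iterator)).
class ClauseCache {
    static BitSet orInto(BitSet acc, PrimitiveIterator.OfInt clauseDocs) {
        while (clauseDocs.hasNext()) {
            acc.set(clauseDocs.nextInt());
        }
        // The accumulator now supports random access (bits()-like lookups)
        // even for clauses that only provided an iterator.
        return acc;
    }

    // Helper producing an iterator over the given doc ids.
    static PrimitiveIterator.OfInt docs(int... ids) {
        return IntStream.of(ids).iterator();
    }
}
```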



was (Author: thetaphi):
John,
I was about to improve the documentation. The problem is as you explain:
- bits() is mentioning optional
- iterator() does not mention optional, so it is a requirement

And code in FilteredQuery behaves like this:
- Get iterator(), if iterator returns null, no documents match -> exit
- The code then tries to get bits(), if this returns something != 0 and some 
other conditions apply (depending on FilterStrategy) it switches to pass the 
bits() down low (means as liveDocs to the query)
- otherwise it uses the iterator to leap-frog (or similar) with the query.

ConstantScoreQuery always uses the iterator (because it is a query, that needs 
a scorer, which is a subclass of DocIdSetIterator). bits() are never checked.

BooleanFilter works exactly like that. It addtionally also transforms 
iterator-only filters to bits() filters, because it uses a FixedBitSet to cache.

In addition BooleanFilter uses FixedBitSet.or(Iterator), which shortcuts the 
case, where the iterator is derieved from another FixedBitset.


> Boolean Filter does not handle FilterClauses with only bits() implemented
> -
>
> Key: LUCENE-5495
> URL: https://issues.apache.org/jira/browse/LUCENE-5495
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 4.6.1
>Reporter: John Wang
> Attachments: LUCENE-5495.patch, LUCENE-5495.patch
>
>
> Some Filter implementations produce DocIdSets without the iterator() 
> implementation, such as o.a.l.facet.range.Range.getFilter().
> Currently, such filters cannot be added to a BooleanFilter because 
> BooleanFilter expects all FilterClauses with Filters that have iterator() 
> implemented.
> This patch improves the behavior by taking Filters with bits() implemented 
> and treating them separately.
> This behavior would be faster in the case for Filters with a forward index as 
> the underlying data structure, where there would be no need to scan the index 
> to build an iterator.
> See attached unit test, which fails without this patch.






[jira] [Commented] (LUCENE-5487) Can we separate "top scorer" from "sub scorer"?

2014-03-10 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13926114#comment-13926114
 ] 

Michael McCandless commented on LUCENE-5487:


Thanks Rob!

bq. In this case no problem, but just as an FYI, if you have trunk/ and 
branch/, you always want to run the differ outside of both (this way the patch 
prefixes are the same: this one can't be applied by any patch tool).

I actually did that at first but for some reason I thought the resulting patch 
file was wrong!  Next time I'll do it like that.

bq. docs for Weight.scoresDocsOutOfOrder() should refer to bulkScorer() instead 
of scorer()

Oh yeah, I'll fix.

bq. a TODO should be added for BooleanWeight.scoresDocsOutOfOrder

I fixed this and added a simple test, on the branch.

bq. should FakeScorer really not take the real Weight anymore? I don't know how 
useful it is, but it's weird that it's null, since the Collector actually sees 
this thing via setScorer: if it's not going to be supported then I think it 
should override getWeight to explicitly throw UOE?

It seems weird returning a real Weight when everything else is fake, but I 
guess we can just leave it as it was (in BooleanQuery)?  All the other 
FakeScorers seem to do the null Weight thing, but I agree if we do that we 
should just override getWeight to throw UOE.

bq. what is the purpose of a LeapFrogBulkScorer? It seems to just use two 
in-order scorers, i dont understand its purpose. Is this supposed to be a code 
specialization? 

It was there before, but I think it's just code specialization ... I'll just 
nuke it and let Weight.bulkScorer do the default impl.

> Can we separate "top scorer" from "sub scorer"?
> ---
>
> Key: LUCENE-5487
> URL: https://issues.apache.org/jira/browse/LUCENE-5487
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 4.8, 5.0
>
> Attachments: LUCENE-5487.patch, LUCENE-5487.patch, LUCENE-5487.patch
>
>
> This is just an exploratory patch ... still many nocommits, but I
> think it may be promising.
> I find the two booleans we pass to Weight.scorer confusing, because
> they really only apply to whoever will call score(Collector) (just
> IndexSearcher and BooleanScorer).
> The params are pointless for the vast majority of scorers, because
> very, very few query scorers really need to change how top-scoring is
> done, and those scorers can *only* score top-level (they throw UOE
> from nextDoc/advance).  It seems like these two types of scorers
> should be separately typed.
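The split being discussed can be modeled with a tiny self-contained sketch (names loosely follow the patch's BulkScorer/Weight.bulkScorer, but this is illustrative, not Lucene's code): a per-doc sub scorer exposes nextDoc/score, a top-level bulk scorer only knows how to drive a Collector, and a default wrapper turns any sub scorer into a bulk one.

```java
import java.util.ArrayList;
import java.util.List;

public class BulkScorerSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Per-doc "sub" scorer: supports iteration and per-doc scoring. */
    interface SubScorer {
        int nextDoc();
        float score();
    }

    /** Collection callback, standing in for Lucene's Collector. */
    interface Collector {
        void collect(int doc, float score);
    }

    /** "Top" scorer: only knows how to drive collection over all hits.
     *  A bulk-only scorer never has to expose nextDoc/advance at all. */
    interface BulkScorer {
        void score(Collector collector);
    }

    /** Default path: wrap any sub scorer into a bulk scorer, mirroring
     *  the generic Weight.bulkScorer fallback described in the thread. */
    static BulkScorer wrap(SubScorer s) {
        return collector -> {
            for (int doc = s.nextDoc(); doc != NO_MORE_DOCS; doc = s.nextDoc()) {
                collector.collect(doc, s.score());
            }
        };
    }
}
```

A scorer that can only score top-level (like BooleanScorer) would implement BulkScorer directly, without ever providing a SubScorer.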






[jira] [Commented] (LUCENE-5495) Boolean Filter does not handle FilterClauses with only bits() implemented

2014-03-10 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13926111#comment-13926111
 ] 

Uwe Schindler commented on LUCENE-5495:
---

John,
I was about to improve the documentation. The problem is as you explain:
- bits() is documented as optional
- iterator() is not documented as optional, so it is a requirement

And code in FilteredQuery behaves like this:
- Get iterator(), if iterator returns null, no documents match -> exit
- The code then tries to get bits(); if this returns something != null and some 
other conditions apply (depending on FilterStrategy), it switches to passing the 
bits() down low (i.e. as liveDocs to the query)
- otherwise it uses the iterator to leap-frog (or similar) with the query.

ConstantScoreQuery always uses the iterator (because it is a query that needs 
a scorer, which is a subclass of DocIdSetIterator); bits() is never checked.

BooleanFilter works exactly like that. It additionally transforms 
iterator-only filters to bits() filters, because it uses a FixedBitSet to cache.

In addition, BooleanFilter uses FixedBitSet.or(Iterator), which shortcuts the 
case where the iterator is derived from another FixedBitSet.
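The iterator path described here amounts to leap-frogging two sorted doc-id streams: repeatedly advance whichever side is behind until both land on the same document. A minimal self-contained sketch (toy stand-ins, not Lucene's actual DocIdSetIterator):

```java
import java.util.ArrayList;
import java.util.List;

public class LeapFrogSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Toy stand-in for a DocIdSetIterator over a sorted doc-id array. */
    static class Iter {
        final int[] docs;
        int idx = -1;
        Iter(int... docs) { this.docs = docs; }
        int docID() { return idx < 0 ? -1 : idx >= docs.length ? NO_MORE_DOCS : docs[idx]; }
        int nextDoc() { idx++; return docID(); }
        int advance(int target) {  // linear scan; a real iterator may use skip lists
            do { idx++; } while (idx < docs.length && docs[idx] < target);
            return docID();
        }
    }

    /** Leap-frog intersection: advance whichever iterator is behind. */
    static List<Integer> intersect(Iter a, Iter b) {
        List<Integer> hits = new ArrayList<>();
        int da = a.nextDoc(), db = b.nextDoc();
        while (da != NO_MORE_DOCS && db != NO_MORE_DOCS) {
            if (da == db) { hits.add(da); da = a.nextDoc(); db = b.nextDoc(); }
            else if (da < db) da = a.advance(db);
            else db = b.advance(da);
        }
        return hits;
    }
}
```

A bits()-only filter cannot take part in this walk; its bits are instead consulted per candidate document, which is exactly the trade-off discussed above.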


> Boolean Filter does not handle FilterClauses with only bits() implemented
> -
>
> Key: LUCENE-5495
> URL: https://issues.apache.org/jira/browse/LUCENE-5495
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 4.6.1
>Reporter: John Wang
> Attachments: LUCENE-5495.patch, LUCENE-5495.patch
>
>
> Some Filter implementations produce DocIdSets without the iterator() 
> implementation, such as o.a.l.facet.range.Range.getFilter().
> Currently, such filters cannot be added to a BooleanFilter because 
> BooleanFilter expects all FilterClauses with Filters that have iterator() 
> implemented.
> This patch improves the behavior by taking Filters with bits() implemented 
> and treating them separately.
> This behavior would be faster in the case for Filters with a forward index as 
> the underlying data structure, where there would be no need to scan the index 
> to build an iterator.
> See attached unit test, which fails without this patch.






[jira] [Reopened] (LUCENE-5501) Out-of-order collection testing

2014-03-10 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand reopened LUCENE-5501:
--


> Out-of-order collection testing
> ---
>
> Key: LUCENE-5501
> URL: https://issues.apache.org/jira/browse/LUCENE-5501
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Assignee: Adrien Grand
> Fix For: 4.8
>
> Attachments: LUCENE-5501.patch
>
>
> Collectors have the ability to declare whether or not they support 
> out-of-order collection, but since most scorers score in order this is not 
> well tested.






[jira] [Commented] (LUCENE-5502) equals method of TermsFilter might equate two different filters

2014-03-10 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13926088#comment-13926088
 ] 

Adrien Grand commented on LUCENE-5502:
--

Thanks for the review, Simon. I'll commit tomorrow if there is no objection 
until then.

> equals method of TermsFilter might equate two different filters
> ---
>
> Key: LUCENE-5502
> URL: https://issues.apache.org/jira/browse/LUCENE-5502
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/query/scoring
>Affects Versions: 4.7
>Reporter: Igor Motov
> Attachments: LUCENE-5502.patch, LUCENE-5502.patch, LUCENE-5502.patch
>
>
> If two terms filters have 1) the same number of terms, 2) use the same field 
> in all these terms and 3) term values happened to have the same hash codes, 
> these two filter are considered to be equal as long as the first term is the 
> same in both filters.






[jira] [Commented] (LUCENE-5502) equals method of TermsFilter might equate two different filters

2014-03-10 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13926085#comment-13926085
 ] 

Simon Willnauer commented on LUCENE-5502:
-

LGTM as well! thanks for fixing it!

> equals method of TermsFilter might equate two different filters
> ---
>
> Key: LUCENE-5502
> URL: https://issues.apache.org/jira/browse/LUCENE-5502
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/query/scoring
>Affects Versions: 4.7
>Reporter: Igor Motov
> Attachments: LUCENE-5502.patch, LUCENE-5502.patch, LUCENE-5502.patch
>
>
> If two terms filters have 1) the same number of terms, 2) use the same field 
> in all these terms and 3) term values happened to have the same hash codes, 
> these two filter are considered to be equal as long as the first term is the 
> same in both filters.






[jira] [Comment Edited] (LUCENE-5422) Postings lists deduplication

2014-03-10 Thread Dmitry Kan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13926081#comment-13926081
 ] 

Dmitry Kan edited comment on LUCENE-5422 at 3/10/14 7:27 PM:
-

I agree with [~mikemccand] in that the issue should be better scoped. The case 
with compressing stemmed / non-stemmed terms posting lists is quite tricky and 
requires more thought.

One clear case for this issue is storing reversed term along with its original 
non-reversed version. Both should point to the same posting list (subject to 
some after-stemming-hash-check).

What do you guys think?


was (Author: dmitry_key):
I agree with [~mikemccand] in that the issue should be better scoped. The case 
with compressing stemmed / non-stemmed terms posting lists is quite tricky and 
requires more thought.

One clear case for this issue is storing reversed term along with it is 
original non-reversed version. Both should point to the same posting list 
(subject to some after-stemming-hash-check).

What do you guys think?

> Postings lists deduplication
> 
>
> Key: LUCENE-5422
> URL: https://issues.apache.org/jira/browse/LUCENE-5422
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/codecs, core/index
>Reporter: Dmitry Kan
>  Labels: gsoc2014
>
> The context:
> http://markmail.org/thread/tywtrjjcfdbzww6f
> Robert Muir and I have discussed what Robert eventually named "postings
> lists deduplication" at Berlin Buzzwords 2013 conference.
> The idea is to allow multiple terms to point to the same postings list to
> save space. This can be achieved by new index codec implementation, but this 
> jira is open to other ideas as well.
> The application / impact of this is positive for synonyms, exact / inexact
> terms, leading wildcard support via storing reversed term etc.
> For example, at the moment, when supporting exact (unstemmed) and inexact 
> (stemmed)
> searches, we store both unstemmed and stemmed variant of a word form and
> that leads to index bloating. That is why we had to remove the leading
> wildcard support via reversing a token on index and query time because of
> the same index size considerations.
> Comment from Mike McCandless:
> Neat idea!
> Would this idea allow a single term to point to (the union of) N other
> posting lists?  It seems like that's necessary e.g. to handle the
> exact/inexact case.
> And then, to produce the Docs/AndPositionsEnum you'd need to do the
> merge sort across those N posting lists?
> Such a thing might also be do-able as runtime only wrapper around the
> postings API (FieldsProducer), if you could at runtime do the reverse
> expansion (e.g. stem -> all of its surface forms).
> Comment from Robert Muir:
> I think the exact/inexact is trickier (detecting it would be the hard
> part), and you are right, another solution might work better.
> but for the reverse wildcard and synonyms situation, it seems we could even
> detect it on write if we created some hash of the previous terms postings.
> if the hash matches for the current term, we know it might be a "duplicate"
> and would have to actually do the costly check they are the same.
> maybe there are better ways to do it, but it might be a fun postingformat
> experiment to try.
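Robert's write-time idea (hash each term's postings; on a hash match, run the costly equality check before sharing storage) can be sketched with plain Java collections. This is purely illustrative: a real postings format would hash the serialized postings incrementally rather than hold int[] arrays in memory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PostingsDedupSketch {
    // content hash -> candidate postings lists with that hash
    private final Map<Integer, List<int[]>> byHash = new HashMap<>();
    // term -> the (possibly shared) postings list it points to
    private final Map<String, int[]> termToPostings = new HashMap<>();

    void add(String term, int[] postings) {
        int h = Arrays.hashCode(postings);
        List<int[]> candidates = byHash.computeIfAbsent(h, k -> new ArrayList<>());
        for (int[] existing : candidates) {
            if (Arrays.equals(existing, postings)) {  // costly confirm on hash match
                termToPostings.put(term, existing);    // share the existing list
                return;
            }
        }
        candidates.add(postings);                      // first list with this content
        termToPostings.put(term, postings);
    }

    /** Do two terms point at the same physical postings list? */
    boolean shared(String t1, String t2) {
        return termToPostings.get(t1) == termToPostings.get(t2);  // identity check
    }
}
```

For the reversed-term use case, the original and reversed surface forms would be added with identical postings and end up sharing one list.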






[jira] [Commented] (LUCENE-5422) Postings lists deduplication

2014-03-10 Thread Dmitry Kan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13926081#comment-13926081
 ] 

Dmitry Kan commented on LUCENE-5422:


I agree with [~mikemccand] in that the issue should be better scoped. The case 
with compressing stemmed / non-stemmed terms posting lists is quite tricky and 
requires more thought.

One clear case for this issue is storing a reversed term along with its 
original non-reversed version. Both should point to the same posting list 
(subject to some after-stemming-hash-check).

What do you guys think?

> Postings lists deduplication
> 
>
> Key: LUCENE-5422
> URL: https://issues.apache.org/jira/browse/LUCENE-5422
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/codecs, core/index
>Reporter: Dmitry Kan
>  Labels: gsoc2014
>
> The context:
> http://markmail.org/thread/tywtrjjcfdbzww6f
> Robert Muir and I have discussed what Robert eventually named "postings
> lists deduplication" at Berlin Buzzwords 2013 conference.
> The idea is to allow multiple terms to point to the same postings list to
> save space. This can be achieved by new index codec implementation, but this 
> jira is open to other ideas as well.
> The application / impact of this is positive for synonyms, exact / inexact
> terms, leading wildcard support via storing reversed term etc.
> For example, at the moment, when supporting exact (unstemmed) and inexact 
> (stemmed)
> searches, we store both unstemmed and stemmed variant of a word form and
> that leads to index bloating. That is why we had to remove the leading
> wildcard support via reversing a token on index and query time because of
> the same index size considerations.
> Comment from Mike McCandless:
> Neat idea!
> Would this idea allow a single term to point to (the union of) N other
> posting lists?  It seems like that's necessary e.g. to handle the
> exact/inexact case.
> And then, to produce the Docs/AndPositionsEnum you'd need to do the
> merge sort across those N posting lists?
> Such a thing might also be do-able as runtime only wrapper around the
> postings API (FieldsProducer), if you could at runtime do the reverse
> expansion (e.g. stem -> all of its surface forms).
> Comment from Robert Muir:
> I think the exact/inexact is trickier (detecting it would be the hard
> part), and you are right, another solution might work better.
> but for the reverse wildcard and synonyms situation, it seems we could even
> detect it on write if we created some hash of the previous terms postings.
> if the hash matches for the current term, we know it might be a "duplicate"
> and would have to actually do the costly check they are the same.
> maybe there are better ways to do it, but it might be a fun postingformat
> experiment to try.






[jira] [Commented] (LUCENE-5205) [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to classic QueryParser

2014-03-10 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13926063#comment-13926063
 ] 

Tim Allison commented on LUCENE-5205:
-

{quote}
PhraseQuery does not guarantee that a false hit will be a stop word - if the 
data contains 'calculator xyz evaluating' and we search for "calculator for 
evaluating", then it will match.
{quote}
Ugh.  You are right.  Not sure how I got that wrong.  Thank you.

In the use case with a StopFilter, if we were to go with the proposal above to 
convert to a SpanNearQuery "calculator evaluating"~>1, there could be a false 
hit on "calculator evaluating" as adjacent terms.  But that shouldn't be too 
problematic?

I can't think of any side effects by going with your proposal if you are ok 
with increased index size and query response time.  Out of curiosity, how much 
bigger is your index if you don't use a StopFilter or the SynonymFilter?  
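The false-hit concern can be checked with a toy position model (just the slop arithmetic, not Lucene's SpanNearQuery): with "for" removed by a StopFilter, an in-order near query with slop 1 accepts both the gapped original text and plain adjacency.

```java
public class NearMatchSketch {
    /** Toy in-order "near" test: b must follow a, with at most `slop`
     *  unfilled positions between them. */
    static boolean inOrderNearMatch(int posA, int posB, int slop) {
        return posB > posA && (posB - posA - 1) <= slop;
    }
}
```

So "calculator [for] evaluating" (positions 0 and 2, gap left by the stop word) and a document containing the adjacent pair "calculator evaluating" (positions 0 and 1) both satisfy slop 1.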

> [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to 
> classic QueryParser
> ---
>
> Key: LUCENE-5205
> URL: https://issues.apache.org/jira/browse/LUCENE-5205
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/queryparser
>Reporter: Tim Allison
>  Labels: patch
> Fix For: 4.7
>
> Attachments: LUCENE-5205-cleanup-tests.patch, 
> LUCENE-5205-date-pkg-prvt.patch, LUCENE-5205.patch.gz, LUCENE-5205.patch.gz, 
> LUCENE-5205_dateTestReInitPkgPrvt.patch, LUCENE-5205_smallTestMods.patch, 
> LUCENE_5205.patch, SpanQueryParser_v1.patch.gz, patch.txt
>
>
> This parser extends QueryParserBase and includes functionality from:
> * Classic QueryParser: most of its syntax
> * SurroundQueryParser: recursive parsing for "near" and "not" clauses.
> * ComplexPhraseQueryParser: can handle "near" queries that include multiterms 
> (wildcard, fuzzy, regex, prefix),
> * AnalyzingQueryParser: has an option to analyze multiterms.
> At a high level, there's a first pass BooleanQuery/field parser and then a 
> span query parser handles all terminal nodes and phrases.
> Same as classic syntax:
> * term: test 
> * fuzzy: roam~0.8, roam~2
> * wildcard: te?t, test*, t*st
> * regex: /\[mb\]oat/
> * phrase: "jakarta apache"
> * phrase with slop: "jakarta apache"~3
> * default "or" clause: jakarta apache
> * grouping "or" clause: (jakarta apache)
> * boolean and +/-: (lucene OR apache) NOT jakarta; +lucene +apache -jakarta
> * multiple fields: title:lucene author:hatcher
>  
> Main additions in SpanQueryParser syntax vs. classic syntax:
> * Can require "in order" for phrases with slop with the \~> operator: 
> "jakarta apache"\~>3
> * Can specify "not near": "fever bieber"!\~3,10 ::
> find "fever" but not if "bieber" appears within 3 words before or 10 
> words after it.
> * Fully recursive phrasal queries with \[ and \]; as in: \[\[jakarta 
> apache\]~3 lucene\]\~>4 :: 
> find "jakarta" within 3 words of "apache", and that hit has to be within 
> four words before "lucene"
> * Can also use \[\] for single level phrasal queries instead of " as in: 
> \[jakarta apache\]
> * Can use "or grouping" clauses in phrasal queries: "apache (lucene solr)"\~3 
> :: find "apache" and then either "lucene" or "solr" within three words.
> * Can use multiterms in phrasal queries: "jakarta\~1 ap*che"\~2
> * Did I mention full recursion: \[\[jakarta\~1 ap*che\]\~2 (solr~ 
> /l\[ou\]\+\[cs\]\[en\]\+/)]\~10 :: Find something like "jakarta" within two 
> words of "ap*che" and that hit has to be within ten words of something like 
> "solr" or that "lucene" regex.
> * Can require at least x number of hits at boolean level: "apache AND (lucene 
> solr tika)~2
> * Can use negative only query: -jakarta :: Find all docs that don't contain 
> "jakarta"
> * Can use an edit distance > 2 for fuzzy query via SlowFuzzyQuery (beware of 
> potential performance issues!).
> Trivial additions:
> * Can specify prefix length in fuzzy queries: jakarta~1,2 (edit distance =1, 
> prefix =2)
> * Can specify Optimal String Alignment (OSA) vs Levenshtein for distance 
> <=2: (jakarta~1 (OSA) vs jakarta~>1 (Levenshtein))
> This parser can be very useful for concordance tasks (see also LUCENE-5317 
> and LUCENE-5318) and for analytical search.  
> Until LUCENE-2878 is closed, this might have a use for fans of SpanQuery.
> Most of the documentation is in the javadoc for SpanQueryParser.
> Any and all feedback is welcome.  Thank you.






[jira] [Updated] (LUCENE-5517) stricter parsing for hunspell parseFlag()

2014-03-10 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-5517:


Attachment: LUCENE-5517.patch

> stricter parsing for hunspell parseFlag()
> -
>
> Key: LUCENE-5517
> URL: https://issues.apache.org/jira/browse/LUCENE-5517
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/analysis
>Reporter: Robert Muir
> Attachments: LUCENE-5517.patch
>
>
> I was trying to debug why a hunspell dictionary (an updated version fixes the 
> bug!) used so much RAM, and the reason is that the dictionary was buggy and 
> didn't have FLAG NUM (so each digit was treated as its own flag, leading to chaos).
> In many situations in the hunspell file (e.g. an affix rule), the flag must 
> be a single one. But today we don't detect this; we just take the first one.
> We should throw an exception here: in most cases hunspell itself does this 
> for the impacted dictionaries. In these cases the dictionary is buggy, and in 
> some cases you do in fact get an error from the hunspell command line. We should 
> throw an exception instead of emitting chaos...






[jira] [Created] (LUCENE-5517) stricter parsing for hunspell parseFlag()

2014-03-10 Thread Robert Muir (JIRA)
Robert Muir created LUCENE-5517:
---

 Summary: stricter parsing for hunspell parseFlag()
 Key: LUCENE-5517
 URL: https://issues.apache.org/jira/browse/LUCENE-5517
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/analysis
Reporter: Robert Muir
 Attachments: LUCENE-5517.patch

I was trying to debug why a hunspell dictionary (an updated version fixes the 
bug!) used so much RAM, and the reason is that the dictionary was buggy and didn't 
have FLAG NUM (so each digit was treated as its own flag, leading to chaos).

In many situations in the hunspell file (e.g. an affix rule), the flag must be a 
single one. But today we don't detect this; we just take the first one.

We should throw an exception here: in most cases hunspell itself does this for 
the impacted dictionaries. In these cases the dictionary is buggy, and in some 
cases you do in fact get an error from the hunspell command line. We should throw 
an exception instead of emitting chaos...
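A strict single-flag parse along the lines proposed might look like this sketch (hypothetical names, not Lucene's actual hunspell code): any field that does not contain exactly one flag raises an error instead of silently taking the first flag.

```java
public class FlagParserSketch {
    enum FlagType { SINGLE_CHAR, NUM }

    /** Parse a field that must contain exactly one flag.
     *  Throws instead of silently taking the first flag. */
    static char parseSingleFlag(String raw, FlagType type) {
        switch (type) {
            case NUM:
                // FLAG num: the whole field is one decimal number
                int v = Integer.parseInt(raw.trim());  // NumberFormatException on garbage
                if (v < 0 || v > Character.MAX_VALUE) {
                    throw new IllegalArgumentException("flag out of range: " + raw);
                }
                return (char) v;
            default:  // SINGLE_CHAR
                if (raw.length() != 1) {
                    throw new IllegalArgumentException("expected exactly one flag, got: " + raw);
                }
                return raw.charAt(0);
        }
    }
}
```

With a buggy dictionary that lacks FLAG NUM, a multi-digit field like "65" would hit the SINGLE_CHAR branch and fail fast here, rather than being split into the flags '6' and '5'.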







[jira] [Comment Edited] (LUCENE-5460) Allow driving a query by sparse filters

2014-03-10 Thread Mikhail Khludnev (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13926048#comment-13926048
 ] 

Mikhail Khludnev edited comment on LUCENE-5460 at 3/10/14 7:00 PM:
---

LUCENE-5495 
bq. Really, this is all one giant hack/workaround, because Lucene is unable to 
properly/generally handle the "post filter" use case (something Solr has had 
for some time). I think we should fix that; i.e., we need some way for a Filter 
to express that 1) it's random-access (supports Bits), and 2) it's very costly. 

[~mikemccand] let me disturb you with the attached SampleSlowQuery, which: 
* implements post-filtering via SlowQueryScorer.confirm(int)
* can be random-access; however, that's not my preferred case: I'd like to 
post-filter while observing the state of the underlying leap-frogging scorers 
* allows handling the custom ranking case as well. 

Your feedback is much appreciated! Thanks


was (Author: mkhludnev):
bq. LUCENE-5495 Really, this is all one giant hack/workaround, because Lucene is
unable to properly/generally handle the "post filter" use case
(something Solr has had for some time). I think we should fix that;
i.e., we need some way for a Filter to express that 1) it's random-access
(supports Bits), and 2) it's very costly. 

[~mikemccand] let me disturb you with SampleSlowQuery attached, which : 
* implements post-filtering by SlowQueryScorer.confirm(int)
* can be random-access, however, it's not my favor case, I'd like to 
post-filter observing state of underlying leap-frogging scorers 
* allows to handle custom ranking case as well. 

I your feedback is much appreciated! Thanks

> Allow driving a query by sparse filters
> ---
>
> Key: LUCENE-5460
> URL: https://issues.apache.org/jira/browse/LUCENE-5460
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Shai Erera
> Attachments: TestSlowQuery.java
>
>
> Today if a filter is very sparse we execute the query in sort of a leap-frog 
> manner between the query and filter. If the query is very expensive to 
> compute, and/or matching few docs only too, calling scorer.advance(doc) just 
> to discover the doc it landed on isn't accepted by the filter, is a waste of 
> time. Since Filter is always the "final ruler", I wonder if we had something 
> like {{boolean DISI.advanceExact(doc)}} we could use it instead, in some 
> cases.
> There are many combinations in which I think we'd want to use/not-use this 
> API, and they depend on: Filter's complexity, Filter.cost(), Scorer.cost(), 
> query complexity (span-near, many clauses) etc.
> I open an issue so we can discuss. DISI.advanceExact(doc) is just a 
> preliminary proposal, to get an API we could experiment with. The default 
> implementation should be fairly easy and straightforward, and we could 
> override where we can offer a more optimized impl.






[JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.7.0_51) - Build # 9745 - Failure!

2014-03-10 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/9745/
Java: 32bit/jdk1.7.0_51 -client -XX:+UseG1GC

1 tests failed.
REGRESSION:  
org.apache.solr.client.solrj.impl.CloudSolrServerTest.testDistribSearch

Error Message:
java.util.concurrent.TimeoutException: Could not connect to ZooKeeper 
127.0.0.1:37487 within 45000 ms

Stack Trace:
org.apache.solr.common.SolrException: java.util.concurrent.TimeoutException: 
Could not connect to ZooKeeper 127.0.0.1:37487 within 45000 ms
at 
__randomizedtesting.SeedInfo.seed([3BD92A55A40131B:825B1CBD2D1F7327]:0)
at 
org.apache.solr.common.cloud.SolrZkClient.(SolrZkClient.java:150)
at 
org.apache.solr.common.cloud.SolrZkClient.(SolrZkClient.java:101)
at 
org.apache.solr.common.cloud.SolrZkClient.(SolrZkClient.java:91)
at 
org.apache.solr.cloud.AbstractZkTestCase.buildZooKeeper(AbstractZkTestCase.java:89)
at 
org.apache.solr.cloud.AbstractZkTestCase.buildZooKeeper(AbstractZkTestCase.java:83)
at 
org.apache.solr.cloud.AbstractDistribZkTestBase.setUp(AbstractDistribZkTestBase.java:70)
at 
org.apache.solr.cloud.AbstractFullDistribZkTestBase.setUp(AbstractFullDistribZkTestBase.java:200)
at 
org.apache.solr.client.solrj.impl.CloudSolrServerTest.setUp(CloudSolrServerTest.java:78)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:771)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnor

[jira] [Commented] (LUCENE-5460) Allow driving a query by sparse filters

2014-03-10 Thread Mikhail Khludnev (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13926048#comment-13926048
 ] 

Mikhail Khludnev commented on LUCENE-5460:
--

bq. LUCENE-5495 Really, this is all one giant hack/workaround, because Lucene is
unable to properly/generally handle the "post filter" use case
(something Solr has had for some time). I think we should fix that;
i.e., we need some way for a Filter to express that 1) it's random-access
(supports Bits), and 2) it's very costly. 

[~mikemccand] let me disturb you with the attached SampleSlowQuery, which: 
* implements post-filtering via SlowQueryScorer.confirm(int)
* can be random-access; however, that's not my preferred case: I'd like to 
post-filter while observing the state of the underlying leap-frogging scorers 
* allows handling the custom ranking case as well. 

Your feedback is much appreciated! Thanks

> Allow driving a query by sparse filters
> ---
>
> Key: LUCENE-5460
> URL: https://issues.apache.org/jira/browse/LUCENE-5460
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Shai Erera
> Attachments: TestSlowQuery.java
>
>
> Today if a filter is very sparse we execute the query in sort of a leap-frog 
> manner between the query and filter. If the query is very expensive to 
> compute, and/or matching few docs only too, calling scorer.advance(doc) just 
> to discover the doc it landed on isn't accepted by the filter, is a waste of 
> time. Since Filter is always the "final ruler", I wonder if we had something 
> like {{boolean DISI.advanceExact(doc)}} we could use it instead, in some 
> cases.
> There are many combinations in which I think we'd want to use/not-use this 
> API, and they depend on: Filter's complexity, Filter.cost(), Scorer.cost(), 
> query complexity (span-near, many clauses) etc.
> I open an issue so we can discuss. DISI.advanceExact(doc) is just a 
> preliminary proposal, to get an API we could experiment with. The default 
> implementation should be fairly easy and straightforward, and we could 
> override where we can offer a more optimized impl.
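The default implementation suggested in the description really is straightforward: advance to the target and report whether the iterator landed exactly on it. A self-contained sketch (toy iterator, not Lucene's DISI):

```java
public class AdvanceExactSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Toy sorted-doc-id iterator standing in for a DocIdSetIterator. */
    static class Iter {
        final int[] docs;
        int idx = -1;
        Iter(int... docs) { this.docs = docs; }
        int advance(int target) {  // position on the first doc >= target
            do { idx++; } while (idx < docs.length && docs[idx] < target);
            return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
        }
        /** Proposed-API sketch: does the set contain exactly `target`?
         *  Like advance(), must be called with increasing targets.
         *  The obvious default just delegates to advance(). */
        boolean advanceExact(int target) {
            return advance(target) == target;
        }
    }
}
```

A sparse filter could override this with a cheaper membership test (e.g. a direct bit check) so the query scorer is never advanced to a doc the filter will reject anyway.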



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Stalled unit tests

2014-03-10 Thread Dawid Weiss
> Dawid: Boy, those are some large timeouts!

I know... I wasn't the one to bump them; my default was, I think,
about 3 minutes per class...

Dawid

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2894) Implement distributed pivot faceting

2014-03-10 Thread Chris Russell (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925995#comment-13925995
 ] 

Chris Russell commented on SOLR-2894:
-

Elran, interesting, does that happen if you use facet.limit=-1 instead of the 
f.fieldname.facet.limit syntax?  I am wondering if some code is checking the 
global limit but not the per-field limit.

> Implement distributed pivot faceting
> 
>
> Key: SOLR-2894
> URL: https://issues.apache.org/jira/browse/SOLR-2894
> Project: Solr
>  Issue Type: Improvement
>Reporter: Erik Hatcher
> Fix For: 4.7
>
> Attachments: SOLR-2894-reworked.patch, SOLR-2894.patch, 
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
> dateToObject.patch
>
>
> Following up on SOLR-792, pivot faceting currently only supports 
> undistributed mode.  Distributed pivot faceting needs to be implemented.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5487) Can we separate "top scorer" from "sub scorer"?

2014-03-10 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925985#comment-13925985
 ] 

Robert Muir commented on LUCENE-5487:
-

* docs for Weight.scoresDocsOutOfOrder() should refer to bulkScorer() instead 
of scorer()
* a TODO should be added for BooleanWeight.scoresDocsOutOfOrder: it's "out of 
sync" in the sense that it only checks for any required clause, but e.g. if 
someone has minNrShouldMatch > 1, it will lie and say it's out of order, but 
then go and do in-order scoring (but with a slower collector). This is not new 
and not caused by your patch... and really it would be ideal if somehow the 
logic was in one place rather than duplicated.
* should FakeScorer really not take the real Weight anymore? I don't know how 
useful it is, but it's weird that it's null, since the Collector actually sees 
this thing via setScorer: if it's not going to be supported then I think it 
should override getWeight to explicitly throw UOE?
* what is the purpose of a LeapFrogBulkScorer? It seems to just use two 
in-order scorers; I don't understand its purpose. Is this supposed to be a code 
specialization? If so, how much faster is it than just... not doing that and 
using an in-order scorer in this case? (it seems maybe the latter way is 
actually faster, as you get the correct collector, same issue as BooleanWeight 
with minNrShouldMatch as mentioned above)


> Can we separate "top scorer" from "sub scorer"?
> ---
>
> Key: LUCENE-5487
> URL: https://issues.apache.org/jira/browse/LUCENE-5487
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 4.8, 5.0
>
> Attachments: LUCENE-5487.patch, LUCENE-5487.patch, LUCENE-5487.patch
>
>
> This is just an exploratory patch ... still many nocommits, but I
> think it may be promising.
> I find the two booleans we pass to Weight.scorer confusing, because
> they really only apply to whoever will call score(Collector) (just
> IndexSearcher and BooleanScorer).
> The params are pointless for the vast majority of scorers, because
> very, very few query scorers really need to change how top-scoring is
> done, and those scorers can *only* score top-level (throw UOE
> from nextDoc/advance).  It seems like these two types of scorers
> should be separately typed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [VOTE] Move to Java 7 in Lucene/Solr 4.8, use Java 8 in trunk (once officially released)

2014-03-10 Thread Andrzej Bialecki

On 08 Mar 2014, at 17:17, Uwe Schindler  wrote:

> Hi all,
> 
> Java 8 will get released (hopefully, but I trust the release plan!) on March 
> 18, 2014. Because of this, lots of developers will move to Java 8, too. This 
> makes maintaining 3 versions for developing Lucene 4.x not easy anymore 
> (unless you have cool JAVA_HOME "cmd" launcher scripts using StExBar 
> available for your Windows Explorer - or similar stuff in Linux/Mäc).
> 
> We already discussed in another thread about moving to release trunk as 5.0, 
> but people disagreed and preferred to release 4.8 with a minimum of Java 7. 
> This is perfectly fine, as nobody should run Lucene or Solr on an unsupported 
> platform anymore. If they upgrade to 4.8, they should also upgrade their 
> infrastructure - this is a no-brainer. In Lucene trunk we switch to Java 8 as 
> soon as it is released (in 10 days).
> 
> Now the good things: We don't need to support JRockit anymore, no need to 
> support IBM J9 in trunk (unless they release a new version based on Java 8).
> 
> So the vote here is about:
> 
> [.] Move Lucene/Solr 4.8 (means branch_4x) to Java 7 and backport all Java 
> 7-related issues (FileChannel improvements, diamond operator,…).


+1


> [.] Move Lucene/Solr trunk to Java 8 and allow closures in source code. This 
> would make some APIs much nicer. Our infrastructure mostly supports this, 
> only ECJ Javadoc linting is not yet possible, but forbidden-apis supports 
> Java 8 with all its crazy new stuff.


-0 I don’t think Java 8 will reach sufficient penetration by the time of Lucene 
5 release.


> 
> You can vote separately for both items!
> 
> Uwe
> 
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
> 
> 
> 
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
> 

--
Best regards,
Andrzej Bialecki

--=# http://www.lucidworks.com #=--


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [jira] [Commented] (SOLR-5762) SOLR-5658 broke backward compatibility of Javabin format

2014-03-10 Thread Shawn Heisey
On 3/10/2014 7:20 AM, Erick Erickson wrote:
> Hmmm, scanning just Noble's comment it's even worse since we have custom
> components that may define their own params that other components know
> nothing about (and can't).
> 
> But I'm glancing at this out of context so may be off in the weeds.

Noble's comment is in response to something I said that came out of left
field.  It probably shouldn't have been mentioned there, especially on a
closed issue.

Specifically, I have been assuming for a while now (because of some past
precedents) that Solr will be moving towards hard failures with any
unknown parameter.

It sounds like that's not going to actually happen, at least not anytime
soon.  In order to even be possible in the context of custom components,
the check would have to happen after all components have consumed their
options, and the custom code would need to *remove* options from the
NamedList rather than just read them.

-

It does bring up an improvement idea though - a requestHandler
configuration option for the handling of unknown parameters.  I
personally would like to know when our code is using unknown parameters
so that the code can be updated.

There would be three possible settings.  Setting #1 is just as it is now
- unknown parameters are completely ignored.  Setting #2 would log a
warning and process the request.  Setting #3 would fail the request with
some 4xx HTTP error.

I'm sure there would be a lot of debate about this new option,
especially on the subject of when/if we need to change the default from
setting #1.
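
The three settings could be sketched roughly as follows (all names here are hypothetical illustrations, not anything that exists in Solr):

```java
// Hypothetical sketch of the proposed requestHandler option for unknown
// parameters; the Mode names and check() method are invented for illustration.
import java.util.Set;

public class UnknownParamPolicy {
    /** Setting #1: ignore. Setting #2: warn and continue. Setting #3: fail. */
    public enum Mode { IGNORE, WARN, FAIL }

    /**
     * Compares supplied params against those consumed by the components.
     * Returns a warning message (Mode.WARN), null (nothing to report or
     * Mode.IGNORE), or throws to simulate a 4xx response (Mode.FAIL).
     */
    public static String check(Mode mode, Set<String> known, Set<String> supplied) {
        StringBuilder unknown = new StringBuilder();
        for (String p : supplied) {
            if (!known.contains(p)) {
                unknown.append(unknown.length() == 0 ? "" : ",").append(p);
            }
        }
        if (unknown.length() == 0) return null;      // all params were consumed
        String msg = "unknown parameters: " + unknown;
        switch (mode) {
            case IGNORE: return null;                // current Solr behavior
            case WARN:   return msg;                 // would be logged as a warning
            default:     throw new IllegalArgumentException("400: " + msg);
        }
    }
}
```

As noted above, making this accurate with custom components would require each component to remove the options it consumes, so "known" reflects what was actually handled.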

Thanks,
Shawn


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5473) Make one state.json per collection

2014-03-10 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-5473:
-

Attachment: SOLR-5473-74.patch

Catching up with the ever-changing trunk.


> Make one state.json per collection
> --
>
> Key: SOLR-5473
> URL: https://issues.apache.org/jira/browse/SOLR-5473
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Noble Paul
>Assignee: Noble Paul
> Attachments: SOLR-5473-74.patch, SOLR-5473-74.patch, 
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473.patch, 
> SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
> SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
> SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
> SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
> SOLR-5473.patch, ec2-23-20-119-52_solr.log, ec2-50-16-38-73_solr.log
>
>
> As defined in the parent issue, store the states of each collection under 
> /collections/collectionname/state.json node



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5837) Add missing equals implementation for SolrDocument, SolrInputDocument and SolrInputField.

2014-03-10 Thread Varun Thacker (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925941#comment-13925941
 ] 

Varun Thacker commented on SOLR-5837:
-

[~hakeber] Looks like you cleaned up code from both this issue and SOLR-5265. 
Thanks :)
One small nit - BIN_FILE_LOCATION in TestJavaBinCodec is not static

> Add missing equals implementation for SolrDocument, SolrInputDocument and 
> SolrInputField.
> -
>
> Key: SOLR-5837
> URL: https://issues.apache.org/jira/browse/SOLR-5837
> Project: Solr
>  Issue Type: Improvement
>Reporter: Varun Thacker
>Assignee: Mark Miller
> Fix For: 4.8, 5.0
>
> Attachments: SOLR-5837.patch, SOLR-5837.patch
>
>
> While working on SOLR-5265 I tried comparing objects of SolrDocument, 
> SolrInputDocument and SolrInputField. These classes did not override the 
> equals implementation. 
> This issue adds equals and hashCode methods to the 3 classes.
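
The pattern being added is the standard field-by-field equals with a consistent hashCode; a minimal sketch (the class name is a stand-in, not the real SolrInputField):

```java
// Stand-in illustrating the equals/hashCode pattern the issue adds;
// not the actual SolrInputField implementation.
import java.util.Objects;

public class FieldLike {
    private final String name;
    private final Object value;

    public FieldLike(String name, Object value) {
        this.name = name;
        this.value = value;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof FieldLike)) return false;
        FieldLike other = (FieldLike) obj;
        // Field-by-field comparison; Objects.equals handles nulls.
        return Objects.equals(name, other.name) && Objects.equals(value, other.value);
    }

    @Override
    public int hashCode() {
        // Must be consistent with equals: same fields, same order.
        return Objects.hash(name, value);
    }
}
```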



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5487) Can we separate "top scorer" from "sub scorer"?

2014-03-10 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925932#comment-13925932
 ] 

Robert Muir commented on LUCENE-5487:
-

In this case no problem, but just as an FYI: if you have trunk/ and branch/, 
you always want to run the diff from outside both (that way the patch prefixes 
are the same; this one can't be applied by any patch tool).

But this is no problem for me to review; I can just switch to your branch to 
see the context!

> Can we separate "top scorer" from "sub scorer"?
> ---
>
> Key: LUCENE-5487
> URL: https://issues.apache.org/jira/browse/LUCENE-5487
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 4.8, 5.0
>
> Attachments: LUCENE-5487.patch, LUCENE-5487.patch, LUCENE-5487.patch
>
>
> This is just an exploratory patch ... still many nocommits, but I
> think it may be promising.
> I find the two booleans we pass to Weight.scorer confusing, because
> they really only apply to whoever will call score(Collector) (just
> IndexSearcher and BooleanScorer).
> The params are pointless for the vast majority of scorers, because
> very, very few query scorers really need to change how top-scoring is
> done, and those scorers can *only* score top-level (throw UOE
> from nextDoc/advance).  It seems like these two types of scorers
> should be separately typed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5495) Boolean Filter does not handle FilterClauses with only bits() implemented

2014-03-10 Thread John Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925923#comment-13925923
 ] 

John Wang commented on LUCENE-5495:
---

Hi Uwe:

Looking at the Filter doc, I don't see a stated contract to always check 
iterator() before bits(). It does say, however, that bits() is not always 
implemented, and that if it is, it indicates random access.

The current BooleanFilter implementation essentially converts the iterators 
from Filters into a FixedBitSet by iterating. So for Filters backed by a 
forward index, it scans the entire index, i.e. 0 to maxDoc, for each filter 
clause. This patch checks for bits() and treats such filters differently; if 
bits() returns null, it falls back to iterator(). That logic follows the 
contract as stated: bits() is not always implemented, but iterator() must be. 
So IMO, even if the facet filters did have iterator() implemented, this would 
still be an optimization.

 -John
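
The access pattern described above can be sketched with simplified stand-ins for the Lucene types (the real DocIdSet/Bits interfaces differ):

```java
// Sketch of the "prefer bits(), fall back to iterator()" pattern the patch
// proposes. SimpleDocIdSet is a simplified stand-in, not a real Lucene class.
import java.util.BitSet;

public class BitsFirstSketch {
    public interface SimpleDocIdSet {
        /** Random-access view, or null if not supported. */
        BitSet bits();
        /** Iterator view (sorted doc ids); in Lucene this must always exist. */
        int[] iterator();
    }

    /** Membership test that avoids scanning 0..maxDoc when bits() exists. */
    public static boolean matches(SimpleDocIdSet set, int doc) {
        BitSet bits = set.bits();
        if (bits != null) {
            return bits.get(doc);       // O(1) random access, no full scan
        }
        for (int d : set.iterator()) {  // fall back to iteration
            if (d == doc) return true;
            if (d > doc) return false;  // doc ids are sorted, can stop early
        }
        return false;
    }

    /** Helper building a set that does or does not expose random access. */
    public static SimpleDocIdSet fromDocs(boolean randomAccess, int... docs) {
        BitSet b = new BitSet();
        for (int d : docs) b.set(d);
        return new SimpleDocIdSet() {
            public BitSet bits() { return randomAccess ? b : null; }
            public int[] iterator() { return docs; }
        };
    }
}
```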

> Boolean Filter does not handle FilterClauses with only bits() implemented
> -
>
> Key: LUCENE-5495
> URL: https://issues.apache.org/jira/browse/LUCENE-5495
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 4.6.1
>Reporter: John Wang
> Attachments: LUCENE-5495.patch, LUCENE-5495.patch
>
>
> Some Filter implementations produce DocIdSets without the iterator() 
> implementation, such as o.a.l.facet.range.Range.getFilter().
> Currently, such filters cannot be added to a BooleanFilter because 
> BooleanFilter expects all FilterClauses with Filters that have iterator() 
> implemented.
> This patch improves the behavior by taking Filters with bits() implemented 
> and treating them separately.
> This behavior would be faster in the case for Filters with a forward index as 
> the underlying data structure, where there would be no need to scan the index 
> to build an iterator.
> See attached unit test, which fails without this patch.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-1632) Distributed IDF

2014-03-10 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925920#comment-13925920
 ] 

Mark Miller commented on SOLR-1632:
---

bq.  I tried your and few older patches again but docCounts are no longer the 
sum of the cluster size. 

Do you see what is missing in the tests to catch this?

> Distributed IDF
> ---
>
> Key: SOLR-1632
> URL: https://issues.apache.org/jira/browse/SOLR-1632
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.5
>Reporter: Andrzej Bialecki 
>Assignee: Mark Miller
> Fix For: 4.7, 5.0
>
> Attachments: 3x_SOLR-1632_doesntwork.patch, SOLR-1632.patch, 
> SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, 
> SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, 
> SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, distrib-2.patch, 
> distrib.patch
>
>
> Distributed IDF is a valuable enhancement for distributed search across 
> non-uniform shards. This issue tracks the proposed implementation of an API 
> to support this functionality in Solr.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5502) equals method of TermsFilter might equate two different filters

2014-03-10 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925918#comment-13925918
 ] 

Adrien Grand commented on LUCENE-5502:
--

Thank you Igor, the patch looks good to me! [~simonw], I think you worked on 
this filter in the past, so you might want to give a look at this patch?

> equals method of TermsFilter might equate two different filters
> ---
>
> Key: LUCENE-5502
> URL: https://issues.apache.org/jira/browse/LUCENE-5502
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/query/scoring
>Affects Versions: 4.7
>Reporter: Igor Motov
> Attachments: LUCENE-5502.patch, LUCENE-5502.patch, LUCENE-5502.patch
>
>
> If two terms filters have 1) the same number of terms, 2) use the same field 
> in all these terms and 3) term values happened to have the same hash codes, 
> these two filters are considered to be equal as long as the first term is the 
> same in both filters.
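
The failure mode is easy to reproduce with a known String hash collision ("Aa" and "BB" hash identically). A simplified sketch of the buggy comparison versus the full array-equality fix (stand-in methods, not the real TermsFilter code):

```java
// Illustrates why comparing (length, hashCode, first element) can equate two
// different term lists; the fix compares element by element, in the spirit of
// ArrayUtil.equals. Simplified stand-ins, not the actual TermsFilter code.
import java.util.Arrays;

public class TermsEqualsSketch {
    /** Buggy comparison in the spirit of the report. */
    public static boolean buggyEquals(String[] a, String[] b) {
        return a.length == b.length
            && Arrays.hashCode(a) == Arrays.hashCode(b)  // hashes can collide
            && a[0].equals(b[0]);                        // only first term checked
    }

    /** Fixed comparison: full element-by-element equality. */
    public static boolean fixedEquals(String[] a, String[] b) {
        return Arrays.equals(a, b);
    }
}
```

"Aa" and "BB" have the same String.hashCode(), so two term lists differing only in that slot collide under the buggy comparison while the fixed one tells them apart.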



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5502) equals method of TermsFilter might equate two different filters

2014-03-10 Thread Igor Motov (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Motov updated LUCENE-5502:
---

Attachment: LUCENE-5502.patch

Updated patch with ArrayUtil.equals

> equals method of TermsFilter might equate two different filters
> ---
>
> Key: LUCENE-5502
> URL: https://issues.apache.org/jira/browse/LUCENE-5502
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/query/scoring
>Affects Versions: 4.7
>Reporter: Igor Motov
> Attachments: LUCENE-5502.patch, LUCENE-5502.patch, LUCENE-5502.patch
>
>
> If two terms filters have 1) the same number of terms, 2) use the same field 
> in all these terms and 3) term values happened to have the same hash codes, 
> these two filters are considered to be equal as long as the first term is the 
> same in both filters.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-5837) Add missing equals implementation for SolrDocument, SolrInputDocument and SolrInputField.

2014-03-10 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller resolved SOLR-5837.
---

Resolution: Fixed

> Add missing equals implementation for SolrDocument, SolrInputDocument and 
> SolrInputField.
> -
>
> Key: SOLR-5837
> URL: https://issues.apache.org/jira/browse/SOLR-5837
> Project: Solr
>  Issue Type: Improvement
>Reporter: Varun Thacker
>Assignee: Mark Miller
> Fix For: 4.8, 5.0
>
> Attachments: SOLR-5837.patch, SOLR-5837.patch
>
>
> While working on SOLR-5265 I tried comparing objects of SolrDocument, 
> SolrInputDocument and SolrInputField. These classes did not override the 
> equals implementation. 
> This issue adds equals and hashCode methods to the 3 classes.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5837) Add missing equals implementation for SolrDocument, SolrInputDocument and SolrInputField.

2014-03-10 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925908#comment-13925908
 ] 

ASF subversion and git services commented on SOLR-5837:
---

Commit 1576005 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1576005 ]

SOLR-5837: Clean up issue: Add hashCode/equals to SolrDocument, 
SolrInputDocument and SolrInputField for testing purposes.

> Add missing equals implementation for SolrDocument, SolrInputDocument and 
> SolrInputField.
> -
>
> Key: SOLR-5837
> URL: https://issues.apache.org/jira/browse/SOLR-5837
> Project: Solr
>  Issue Type: Improvement
>Reporter: Varun Thacker
>Assignee: Mark Miller
> Fix For: 4.8, 5.0
>
> Attachments: SOLR-5837.patch, SOLR-5837.patch
>
>
> While working on SOLR-5265 I tried comparing objects of SolrDocument, 
> SolrInputDocument and SolrInputField. These classes did not override the 
> equals implementation. 
> This issue adds equals and hashCode methods to the 3 classes.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5837) Add missing equals implementation for SolrDocument, SolrInputDocument and SolrInputField.

2014-03-10 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925904#comment-13925904
 ] 

ASF subversion and git services commented on SOLR-5837:
---

Commit 1576004 from [~markrmil...@gmail.com] in branch 'dev/trunk'
[ https://svn.apache.org/r1576004 ]

SOLR-5837: Clean up issue: Add hashCode/equals to SolrDocument, 
SolrInputDocument and SolrInputField for testing purposes.

> Add missing equals implementation for SolrDocument, SolrInputDocument and 
> SolrInputField.
> -
>
> Key: SOLR-5837
> URL: https://issues.apache.org/jira/browse/SOLR-5837
> Project: Solr
>  Issue Type: Improvement
>Reporter: Varun Thacker
>Assignee: Mark Miller
> Fix For: 4.8, 5.0
>
> Attachments: SOLR-5837.patch, SOLR-5837.patch
>
>
> While working on SOLR-5265 I tried comparing objects of SolrDocument, 
> SolrInputDocument and SolrInputField. These classes did not override the 
> equals implementation. 
> This issue adds equals and hashCode methods to the 3 classes.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5512) Remove redundant typing (diamond operator) in trunk

2014-03-10 Thread Furkan KAMACI (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925900#comment-13925900
 ] 

Furkan KAMACI commented on LUCENE-5512:
---

Solr module is OK. I will test it and attach the whole patch.

> Remove redundant typing (diamond operator) in trunk
> ---
>
> Key: LUCENE-5512
> URL: https://issues.apache.org/jira/browse/LUCENE-5512
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Robert Muir
> Attachments: LUCENE-5512.patch, LUCENE-5512.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5800) Admin UI - Analysis form doesn't render results correctly when a CharFilter is used.

2014-03-10 Thread Stefan Matheis (steffkes) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925890#comment-13925890
 ] 

Stefan Matheis (steffkes) commented on SOLR-5800:
-

[~tim.potter] did you have a chance? Otherwise I would commit this one tomorrow.

> Admin UI - Analysis form doesn't render results correctly when a CharFilter 
> is used.
> 
>
> Key: SOLR-5800
> URL: https://issues.apache.org/jira/browse/SOLR-5800
> Project: Solr
>  Issue Type: Bug
>  Components: web gui
>Affects Versions: 4.7
>Reporter: Timothy Potter
>Assignee: Stefan Matheis (steffkes)
>Priority: Minor
> Fix For: 4.8, 5.0
>
> Attachments: SOLR-5800-sample.json, SOLR-5800.patch
>
>
> I have an example in Solr In Action that uses the
> PatternReplaceCharFilterFactory and now it doesn't work in 4.7.0.
> Specifically, the fieldType is:
> <fieldType name="text_microblog" class="solr.TextField"
> positionIncrementGap="100">
>   <analyzer>
> <charFilter class="solr.PatternReplaceCharFilterFactory"
> pattern="([a-zA-Z])\1+"
> replacement="$1$1"/>
> <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> <filter class="solr.WordDelimiterFilterFactory"
> generateWordParts="1"
> splitOnCaseChange="0"
> splitOnNumerics="0"
> stemEnglishPossessive="1"
> preserveOriginal="0"
> catenateWords="1"
> generateNumberParts="1"
> catenateNumbers="0"
> catenateAll="0"
> types="wdfftypes.txt"/>
> <filter class="solr.StopFilterFactory"
> ignoreCase="true"
> words="lang/stopwords_en.txt"
> />
> 
>   </analyzer>
> </fieldType>
> The PatternReplaceCharFilterFactory (PRCF) is used to collapse
> repeated letters in a term down to a max of 2, such as #yummm would
> become #yumm.
> When I run some text through this analyzer using the Analysis form,
> the output is as if the resulting text is unavailable to the
> tokenizer. In other words, the only results displayed in the
> output on the form are for the PRCF.
> This example stopped working in 4.7.0 and I've verified it worked
> correctly in 4.6.1.
> Initially, I thought this might be an issue with the actual analysis,
> but the analyzer actually works when indexing / querying. Then,
> looking at the JSON response in the Developer console with Chrome, I
> see the JSON that comes back includes output for all the components in
> my chain (see below) ... so looks like a UI rendering issue to me?
> {"responseHeader":{"status":0,"QTime":24},"analysis":{"field_types":{"text_microblog":{"index":["org.apache.lucene.analysis.pattern.PatternReplaceCharFilter","#Yumm
> :) Drinking a latte at Caffe Grecco in SF's historic North Beach...
> Learning text analysis with #SolrInAction by @ManningBooks on my i-Pad
> foo5","org.apache.lucene.analysis.core.WhitespaceTokenizer",[{"text":"#Yumm","raw_bytes":"[23
> 59 75 6d 
> 6d]","start":0,"end":6,"position":1,"positionHistory":[1],"type":"word"},{"text":":)","raw_bytes":"[3a
> 29]","start":7,"end":9,"position":2,"positionHistory":[2],"type":"word"},{"text":"Drinking","raw_bytes":"[44
> 72 69 6e 6b 69 6e
> 67]","start":10,"end":18,"position":3,"positionHistory":[3],"type":"word"},{"text":"a","raw_bytes":"[61]","start":19,"end":20,"position":4,"positionHistory":[4],"type":"word"},{"text":"latte","raw_bytes":"[6c
>  ...
> the JSON returned to the browser has evidence that the full analysis chain 
> was applied, so this seems to just be a rendering issue.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-3178) Native MMapDir

2014-03-10 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-3178:
---

Attachment: LUCENE-3178.patch

bq. I do think it'd be interesting to pair up a NativeMMapDir with a custom 
postings format that instead uses IndexInput.readLong (via Unsafe.getLong) to 
pull longs from disk

I was curious about this so I coded up a prototype patch.  It's a
NativeMMapDirectory.java/cpp that does the mmap/munmap in C, and then
a new postings format (NativeMMapPostingsFormat) which requires this
Directory impl and then uses Unsafe.getLong to read the longs for
packed int decode.

This bypasses the extra step we do today of first reading into a
byte[], and then decoding from that, and instead pulls long directly
from the map and decodes from that.  It requires that the byte-order
in the index matches the CPU; e.g. for x86 (little-endian) it's
opposite from the big-endian order that DataInput.write/readLong
expect.

It does not align the long reads; doing so would increase the index
size somewhat because we'd need to insert pad bytes to align the long
reads to every 8 bytes.  But I think on recent x86 CPUs unaligned
reads are not adding much of a penalty...
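
The byte-order point can be illustrated safely with ByteBuffer instead of Unsafe (a sketch of the concern, not the patch's code): DataInput.readLong decodes big-endian, so a native little-endian read of the same 8 bytes on x86 yields the byte-reversed value.

```java
// Safe analogue of the byte-order issue: the same 8 bytes decode differently
// depending on endianness. ByteBuffer stands in for Unsafe.getLong here.
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class ByteOrderSketch {
    /** Big-endian read, as DataInput.readLong would decode the bytes. */
    public static long readBigEndian(byte[] src, int offset) {
        return ByteBuffer.wrap(src).order(ByteOrder.BIG_ENDIAN).getLong(offset);
    }

    /** Native little-endian read, as an x86 Unsafe.getLong would see them. */
    public static long readLittleEndian(byte[] src, int offset) {
        return ByteBuffer.wrap(src).order(ByteOrder.LITTLE_ENDIAN).getLong(offset);
    }
}
```

This is why the prototype writes the index in the CPU's byte order: so the direct long read needs no byte swap at decode time.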

The patch is very unsafe / tons of nocommits, but seems to work
correctly.  Here's the results:

{noformat}
            Task    QPS base   StdDev    QPS comp   StdDev   Pct diff
          Fuzzy2       47.61   (3.1%)       46.98   (2.9%)      -1.3% (  -7% -    4%)
    HighSpanNear        8.34   (5.8%)        8.42   (5.9%)       0.9% ( -10% -   13%)
         Respell       48.79   (4.1%)       50.00   (3.3%)       2.5% (  -4% -   10%)
          IntNRQ        3.68   (1.5%)        3.78   (7.8%)       2.7% (  -6% -   12%)
    OrHighNotMed       37.79   (3.8%)       38.90   (2.8%)       3.0% (  -3% -    9%)
    OrHighNotLow       31.19   (4.2%)       32.13   (3.3%)       3.0% (  -4% -   10%)
         Prefix3       91.92   (1.9%)       95.11   (6.2%)       3.5% (  -4% -   11%)
       OrHighMed       32.99   (4.0%)       34.15   (3.1%)       3.5% (  -3% -   11%)
          Fuzzy1       60.40   (3.3%)       62.56   (3.4%)       3.6% (  -3% -   10%)
   OrNotHighHigh       11.17   (3.9%)       11.57   (2.7%)       3.6% (  -2% -   10%)
        HighTerm       69.60  (11.2%)       72.19  (15.5%)       3.7% ( -20% -   34%)
       LowPhrase       13.17   (2.1%)       13.67   (2.7%)       3.8% (   0% -    8%)
      AndHighMed       34.52   (1.0%)       35.85   (1.5%)       3.8% (   1% -    6%)
    OrNotHighLow       25.04   (3.5%)       26.00   (0.4%)       3.8% (   0% -    8%)
       OrHighLow       23.60   (4.2%)       24.50   (3.3%)       3.8% (  -3% -   11%)
        Wildcard       19.93   (2.8%)       20.73   (5.0%)       4.0% (  -3% -   12%)
 MedSloppyPhrase        3.52   (3.8%)        3.67   (4.5%)       4.2% (  -3% -   12%)
   OrHighNotHigh       13.88   (3.7%)       14.46   (2.5%)       4.2% (  -1% -   10%)
      OrHighHigh       10.23   (3.9%)       10.68   (3.1%)       4.4% (  -2% -   11%)
         LowTerm      330.50   (6.7%)      345.35   (8.9%)       4.5% ( -10% -   21%)
     AndHighHigh       28.53   (1.1%)       29.82   (1.4%)       4.5% (   2% -    7%)
    OrNotHighMed       24.13   (3.4%)       25.23   (0.5%)       4.6% (   0% -    8%)
     LowSpanNear       10.55   (2.7%)       11.06   (3.6%)       4.8% (  -1% -   11%)
      HighPhrase        4.30   (6.7%)        4.55   (6.2%)       5.9% (  -6% -   20%)
         MedTerm      106.81   (9.0%)      113.26  (12.9%)       6.0% ( -14% -   30%)
HighSloppyPhrase        3.41   (4.2%)        3.67   (7.2%)       7.7% (  -3% -   19%)
     MedSpanNear       31.66   (3.0%)       34.15   (3.8%)       7.9% (   0% -   15%)
       MedPhrase      212.86   (6.1%)      233.14   (6.1%)       9.5% (  -2% -   23%)
 LowSloppyPhrase       44.91   (2.4%)       49.77   (2.3%)      10.8% (   6% -   15%)
      AndHighLow      404.75   (2.5%)      506.81   (3.6%)      25.2% (  18% -   32%)
{noformat}

Net net, a very minor improvement!  I think this is good news: it
means that the extra abstractions here, which are useful so we can be
safe (not use Unsafe) and agnostic to byte-order, are not costing us
too much.


> Native MMapDir
> --
>
> Key: LUCENE-3178
> URL: https://issues.apache.org/jira/browse/LUCENE-3178
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/store
>Reporter: Michael McCandless
>  Labels: gsoc201

[jira] [Assigned] (SOLR-5837) Add missing equals implementation for SolrDocument, SolrInputDocument and SolrInputField.

2014-03-10 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller reassigned SOLR-5837:
-

Assignee: Mark Miller  (was: Noble Paul)

> Add missing equals implementation for SolrDocument, SolrInputDocument and 
> SolrInputField.
> -
>
> Key: SOLR-5837
> URL: https://issues.apache.org/jira/browse/SOLR-5837
> Project: Solr
>  Issue Type: Improvement
>Reporter: Varun Thacker
>Assignee: Mark Miller
> Fix For: 4.8, 5.0
>
> Attachments: SOLR-5837.patch, SOLR-5837.patch
>
>
> While working on SOLR-5265 I tried comparing objects of SolrDocument, 
> SolrInputDocument and SolrInputField. These classes did not override the 
> equals implementation. 
> This issue adds equals and hashCode overrides to the three classes.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5837) Add missing equals implementation for SolrDocument, SolrInputDocument and SolrInputField.

2014-03-10 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925887#comment-13925887
 ] 

Mark Miller commented on SOLR-5837:
---

The following also should be addressed:

{noformat}
+ } catch (IOException e) { 
+ //TODO fail test? 
+ } 
{noformat}
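Mark's point is that an empty catch can make a test pass even when the I/O it exercises failed. A minimal, hypothetical illustration (not Solr's actual test code; the class and method names here are invented) of the difference between swallowing and surfacing the exception:

```java
import java.io.IOException;

public class CatchExample {
    // simulated I/O operation that fails on demand
    static void doIo(boolean fail) throws IOException {
        if (fail) throw new IOException("disk full");
    }

    // bad: the failure is swallowed, so a test using this passes spuriously
    static boolean swallowed(boolean fail) {
        try { doIo(fail); }
        catch (IOException e) { /* TODO fail test? <- the problem */ }
        return true;
    }

    // better: surface the exception so the test framework reports a failure
    static boolean rethrown(boolean fail) {
        try { doIo(fail); }
        catch (IOException e) { throw new AssertionError(e); }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(swallowed(true)); // prints true even though I/O failed
        try { rethrown(true); }
        catch (AssertionError expected) { System.out.println("failure surfaced"); }
    }
}
```

In a JUnit test the equivalent would be rethrowing, or calling `fail(e.toString())`, instead of leaving the catch body empty.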

> Add missing equals implementation for SolrDocument, SolrInputDocument and 
> SolrInputField.
> -
>
> Key: SOLR-5837
> URL: https://issues.apache.org/jira/browse/SOLR-5837
> Project: Solr
>  Issue Type: Improvement
>Reporter: Varun Thacker
>Assignee: Noble Paul
> Fix For: 4.8, 5.0
>
> Attachments: SOLR-5837.patch, SOLR-5837.patch
>
>
> While working on SOLR-5265 I tried comparing objects of SolrDocument, 
> SolrInputDocument and SolrInputField. These classes did not override the 
> equals implementation. 
> This issue adds equals and hashCode overrides to the three classes.






[jira] [Reopened] (SOLR-5837) Add missing equals implementation for SolrDocument, SolrInputDocument and SolrInputField.

2014-03-10 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller reopened SOLR-5837:
---


> Add missing equals implementation for SolrDocument, SolrInputDocument and 
> SolrInputField.
> -
>
> Key: SOLR-5837
> URL: https://issues.apache.org/jira/browse/SOLR-5837
> Project: Solr
>  Issue Type: Improvement
>Reporter: Varun Thacker
>Assignee: Noble Paul
> Fix For: 4.8, 5.0
>
> Attachments: SOLR-5837.patch, SOLR-5837.patch
>
>
> While working on SOLR-5265 I tried comparing objects of SolrDocument, 
> SolrInputDocument and SolrInputField. These classes did not override the 
> equals implementation. 
> This issue adds equals and hashCode overrides to the three classes.






[jira] [Commented] (SOLR-5837) Add missing equals implementation for SolrDocument, SolrInputDocument and SolrInputField.

2014-03-10 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925884#comment-13925884
 ] 

Mark Miller commented on SOLR-5837:
---

Yonik's comment still needs to be addressed - the doc for the new hashCode / 
equals methods needs to explain their limitations and that they are just for 
testing purposes.
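For illustration, a minimal stand-in class (NOT Solr's actual SolrDocument or SolrInputDocument; the class name is invented) showing the shape such an equals/hashCode override might take, including the testing-only caveat Yonik asked to have documented:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DocSketch {
    // field name -> value, like a document's backing map
    private final Map<String, Object> fields = new LinkedHashMap<>();

    public void setField(String name, Object value) { fields.put(name, value); }

    /** For testing purposes only: compares raw field maps and ignores
     *  subtleties such as numeric-type coercion or field-order semantics. */
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof DocSketch)) return false;
        return fields.equals(((DocSketch) o).fields);
    }

    /** Must be consistent with equals: equal docs yield equal hashes. */
    @Override
    public int hashCode() { return fields.hashCode(); }
}
```

Documenting the limitations in the javadoc, as above, keeps production code from relying on a comparison that was only meant to make assertions in tests possible.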

> Add missing equals implementation for SolrDocument, SolrInputDocument and 
> SolrInputField.
> -
>
> Key: SOLR-5837
> URL: https://issues.apache.org/jira/browse/SOLR-5837
> Project: Solr
>  Issue Type: Improvement
>Reporter: Varun Thacker
>Assignee: Noble Paul
> Fix For: 4.8, 5.0
>
> Attachments: SOLR-5837.patch, SOLR-5837.patch
>
>
> While working on SOLR-5265 I tried comparing objects of SolrDocument, 
> SolrInputDocument and SolrInputField. These classes did not override the 
> equals implementation. 
> This issue adds equals and hashCode overrides to the three classes.






[jira] [Commented] (SOLR-5827) Add boosting functionality to MoreLikeThisHandler

2014-03-10 Thread Tommaso Teofili (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925883#comment-13925883
 ] 

Tommaso Teofili commented on SOLR-5827:
---

I've had a look at the patch, and it generally looks good.
One concern about backward compatibility: the patch changes the constructor of 
MoreLikeThisHelper from MoreLikeThisHelper(SolrParams, SolrIndexSearcher) to 
MoreLikeThisHelper(SolrParams, SolrIndexSearcher, SolrQueryRequest); while 
that's a helper class, it's public (it's used by MoreLikeThisComponent and so 
cannot be made package-local / protected) and therefore could be used 
externally, so this change may break existing client code.

> Add boosting functionality to MoreLikeThisHandler
> -
>
> Key: SOLR-5827
> URL: https://issues.apache.org/jira/browse/SOLR-5827
> Project: Solr
>  Issue Type: Improvement
>  Components: MoreLikeThis
>Reporter: Upayavira
>Assignee: Tommaso Teofili
> Fix For: 4.8
>
> Attachments: SOLR-5827.patch, SOLR-5827.patch
>
>
> The MoreLikeThisHandler facilitates the creation of a very simple yet 
> powerful recommendation engine. 
> It is possible to constrain the result set using filter queries. However, it 
> isn't possible to influence the scoring using function queries. Adding 
> function query boosting would allow for including such things as recency in 
> the relevancy calculations.
> Unfortunately, the boost= parameter is already in use, meaning we cannot 
> replicate the edismax boost/bf for additive/multiplicative boostings.
> My patch only touches the MoreLikeThisHandler, so the only really contentious 
> thing is to decide the parameters to configure it.
> I have a prototype working, and will upload a patch shortly. 






[jira] [Commented] (SOLR-5783) Can we stop opening a new searcher when the index hasn't changed?

2014-03-10 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925878#comment-13925878
 ] 

Yonik Seeley commented on SOLR-5783:


bq. I'm not really in a position to commit anything over the next few days

No worries...
There are definitely two different bugs caused by this that I see, but they 
should both be easy to fix (and I think that's probably the easiest way forward 
at this point).  I'll try to get to it soon.


> Can we stop opening a new searcher when the index hasn't changed?
> -
>
> Key: SOLR-5783
> URL: https://issues.apache.org/jira/browse/SOLR-5783
> Project: Solr
>  Issue Type: Improvement
>Reporter: Hoss Man
> Fix For: 4.8, 5.0
>
> Attachments: SOLR-5783.patch, SOLR-5783.patch, SOLR-5783.patch, 
> SOLR-5783.patch
>
>
> I've been thinking recently about how/when we re-open searchers -- and what 
> the overhead of that is in terms of caches and what not -- even if the 
> underlying index hasn't changed.  
> The particular real world case that got me thinking about this recently is 
> when a deleteByQuery gets forwarded to all shards in a collection, and then 
> the subsequent (soft)Commit (either auto or explicit) opens a new searcher -- 
> even if that shard was completely unaffected by the delete.
> It got me wondering: why don't we re-use the same searcher when the index is 
> unchanged?
> From what I can tell, we're basically 99% of the way there (in 
> {{}})...
> * IndexWriter.commit is already smart enough to short-circuit if there's 
> nothing to commit
> * SolrCore.openNewSearcher already uses DirectoryReader.openIfChanged to see 
> if the reader can be re-used.
> * for "realtime" purposes, SolrCore.openNewSearcher will return the existing 
> searcher if it exists and the DirectoryReader hasn't changed
> ...The only reason I could think of for not _always_ re-using the same 
> searcher when the underlying DirectoryReader is identical (ie: that last 
> bullet above) is in the situation where the "live" schema has changed -- but 
> that seems pretty trivial to account for.
> Is there any other reason why this wouldn't be a good idea for improving 
> performance?






[jira] [Commented] (LUCENE-5422) Postings lists deduplication

2014-03-10 Thread Vishmi Money (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925873#comment-13925873
 ] 

Vishmi Money commented on LUCENE-5422:
--

I'm sharing the draft of my proposal with you. I will be grateful if you can 
review it and give me feedback.

link : 
https://docs.google.com/document/d/1CWw_mCD9Qv7VcskFbZg4PpRHG_Trh4GNPuCt9pwPGKg/edit?usp=sharing

> Postings lists deduplication
> 
>
> Key: LUCENE-5422
> URL: https://issues.apache.org/jira/browse/LUCENE-5422
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/codecs, core/index
>Reporter: Dmitry Kan
>  Labels: gsoc2014
>
> The context:
> http://markmail.org/thread/tywtrjjcfdbzww6f
> Robert Muir and I have discussed what Robert eventually named "postings
> lists deduplication" at Berlin Buzzwords 2013 conference.
> The idea is to allow multiple terms to point to the same postings list to
> save space. This can be achieved by new index codec implementation, but this 
> jira is open to other ideas as well.
> The application / impact of this is positive for synonyms, exact / inexact
> terms, leading wildcard support via storing reversed term etc.
> For example, at the moment, when supporting exact (unstemmed) and inexact 
> (stemmed)
> searches, we store both unstemmed and stemmed variant of a word form and
> that leads to index bloating. That is why we had to remove the leading
> wildcard support via reversing a token on index and query time because of
> the same index size considerations.
> Comment from Mike McCandless:
> Neat idea!
> Would this idea allow a single term to point to (the union of) N other
> posting lists?  It seems like that's necessary e.g. to handle the
> exact/inexact case.
> And then, to produce the Docs/AndPositionsEnum you'd need to do the
> merge sort across those N posting lists?
> Such a thing might also be do-able as runtime only wrapper around the
> postings API (FieldsProducer), if you could at runtime do the reverse
> expansion (e.g. stem -> all of its surface forms).
> Comment from Robert Muir:
> I think the exact/inexact is trickier (detecting it would be the hard
> part), and you are right, another solution might work better.
> but for the reverse wildcard and synonyms situation, it seems we could even
> detect it on write if we created some hash of the previous terms postings.
> if the hash matches for the current term, we know it might be a "duplicate"
> and would have to actually do the costly check they are the same.
> maybe there are better ways to do it, but it might be a fun postingformat
> experiment to try.
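A toy sketch of the write-time detection Robert describes, with postings modelled as plain int[] doc-id arrays rather than a real Lucene codec (the class name and API are invented for illustration): hash each term's postings, and on a hash match do the costly full comparison before letting two terms share one list.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class PostingsDedup {
    // hash of a postings list -> the canonical array for that hash;
    // a real implementation would chain on collisions instead of overwriting
    private final Map<Integer, int[]> byHash = new HashMap<>();
    private int shared = 0;

    /** Returns the canonical postings array this term should point at. */
    public int[] add(int[] postings) {
        int h = Arrays.hashCode(postings);
        int[] prev = byHash.get(h);
        // hash match is only a hint: verify with the full (costly) comparison
        if (prev != null && Arrays.equals(prev, postings)) {
            shared++;
            return prev;     // duplicate: reuse the existing postings list
        }
        byHash.put(h, postings);
        return postings;     // new (or colliding) list: store it
    }

    public int sharedCount() { return shared; }
}
```

The interesting property is that the expensive equality check only runs when hashes collide, which is exactly the trade-off suggested in the comment above.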






[jira] [Comment Edited] (SOLR-5517) Return HTTP error on POST requests with no Content-Type

2014-03-10 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925869#comment-13925869
 ] 

Uwe Schindler edited comment on SOLR-5517 at 3/10/14 4:38 PM:
--

Hi,
can you open a new issue to fix the Admin UI? The admin UI may use POST, but it 
must send content-type. I think [~steffkes] should fix this, by passing the 
"application/json" content type.
Ideally it should (as you say) also send something in the body; otherwise POST 
is not the right HTTP method to use.


was (Author: thetaphi):
Hi,
can you open a new issue to fix the Admin UI? The admin UI may use POST, but it 
must send content-type. I think [~steffkes] should fix this, by passing the 
"application/json" content type.

> Return HTTP error on POST requests with no Content-Type
> ---
>
> Key: SOLR-5517
> URL: https://issues.apache.org/jira/browse/SOLR-5517
> Project: Solr
>  Issue Type: Improvement
>Reporter: Ryan Ernst
>Assignee: Ryan Ernst
> Fix For: 4.7, 5.0
>
> Attachments: SOLR-5517.patch, SOLR-5517.patch, SOLR-5517.patch, 
> SOLR-5517.patch, SOLR-5517.patch
>
>
> While the http spec states requests without a content-type should be treated 
> as application/octet-stream, the html spec says instead that post requests 
> without a content-type should be treated as a form 
> (http://www.w3.org/MarkUp/html-spec/html-spec_8.html#SEC8.2.1).  It would be 
> nice to allow large search requests from html forms, and not have to rely on 
> the browser to set the content type (since the spec says it doesn't have to).






[jira] [Commented] (SOLR-5517) Return HTTP error on POST requests with no Content-Type

2014-03-10 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925869#comment-13925869
 ] 

Uwe Schindler commented on SOLR-5517:
-

Hi,
can you open a new issue to fix the Admin UI? The admin UI may use POST, but it 
must send content-type. I think [~steffkes] should fix this, by passing the 
"application/json" content type.
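For comparison with the admin UI's jQuery, the same fix in plain Java 11+ (the URL and JSON body here are placeholders, and the request is only built, never sent): a POST should carry an explicit Content-Type and, ideally, a body.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class PostWithContentType {
    public static HttpRequest build() {
        return HttpRequest.newBuilder(URI.create("http://localhost:8983/solr/admin"))
            // the header whose absence triggers the error this issue adds
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString("{\"indent\":\"true\"}"))
            .build();
    }

    public static void main(String[] args) {
        HttpRequest req = build();
        System.out.println(req.method() + " "
            + req.headers().firstValue("Content-Type").orElse("none"));
    }
}
```

In the admin UI itself the equivalent is passing `contentType: 'application/json'` (and a `data` body) to `$.ajax`.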

> Return HTTP error on POST requests with no Content-Type
> ---
>
> Key: SOLR-5517
> URL: https://issues.apache.org/jira/browse/SOLR-5517
> Project: Solr
>  Issue Type: Improvement
>Reporter: Ryan Ernst
>Assignee: Ryan Ernst
> Fix For: 4.7, 5.0
>
> Attachments: SOLR-5517.patch, SOLR-5517.patch, SOLR-5517.patch, 
> SOLR-5517.patch, SOLR-5517.patch
>
>
> While the http spec states requests without a content-type should be treated 
> as application/octet-stream, the html spec says instead that post requests 
> without a content-type should be treated as a form 
> (http://www.w3.org/MarkUp/html-spec/html-spec_8.html#SEC8.2.1).  It would be 
> nice to allow large search requests from html forms, and not have to rely on 
> the browser to set the content type (since the spec says it doesn't have to).






[jira] [Assigned] (SOLR-5827) Add boosting functionality to MoreLikeThisHandler

2014-03-10 Thread Tommaso Teofili (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili reassigned SOLR-5827:
-

Assignee: Tommaso Teofili

> Add boosting functionality to MoreLikeThisHandler
> -
>
> Key: SOLR-5827
> URL: https://issues.apache.org/jira/browse/SOLR-5827
> Project: Solr
>  Issue Type: Improvement
>  Components: MoreLikeThis
>Reporter: Upayavira
>Assignee: Tommaso Teofili
> Fix For: 4.8
>
> Attachments: SOLR-5827.patch, SOLR-5827.patch
>
>
> The MoreLikeThisHandler facilitates the creation of a very simple yet 
> powerful recommendation engine. 
> It is possible to constrain the result set using filter queries. However, it 
> isn't possible to influence the scoring using function queries. Adding 
> function query boosting would allow for including such things as recency in 
> the relevancy calculations.
> Unfortunately, the boost= parameter is already in use, meaning we cannot 
> replicate the edismax boost/bf for additive/multiplicative boostings.
> My patch only touches the MoreLikeThisHandler, so the only really contentious 
> thing is to decide the parameters to configure it.
> I have a prototype working, and will upload a patch shortly. 






[jira] [Commented] (LUCENE-5515) Improve TopDocs#merge for pagination

2014-03-10 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925856#comment-13925856
 ] 

Michael McCandless commented on LUCENE-5515:


+1

It's nice that ElasticSearch is trying to use TopDocs.merge here :)

Seems like this:

bq. if (availHitCount < start) {

Could be <= instead?  Ie, the == case is still 0 hits returned?

Maybe move the entire while loop into the "else"?  And move
numIterOnHits into the else too.

The javadocs state that the returned scoreDocs will have length always
equal to size, but that's only true if there were enough hits right?
Maybe change it to "at most size"?
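The boundary being discussed can be sketched as follows; this mirrors the intended from/size behavior (zero hits when the merged hit count is at most `from`, otherwise at most `size` hits), not LUCENE-5515's actual code, and the class and method names are invented.

```java
public class MergeSlice {
    /** Number of hits a merge should return for the page [from, from + size). */
    public static int returnedHits(int availHitCount, int from, int size) {
        if (availHitCount <= from) {  // '<' would miss the == case: still 0 hits
            return 0;
        }
        // "at most size": fewer when the last page is only partially filled
        return Math.min(size, availHitCount - from);
    }

    public static void main(String[] args) {
        System.out.println(returnedHits(10, 10, 5)); // == case: prints 0
        System.out.println(returnedHits(12, 10, 5)); // partial page: prints 2
    }
}
```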


> Improve TopDocs#merge for pagination
> 
>
> Key: LUCENE-5515
> URL: https://issues.apache.org/jira/browse/LUCENE-5515
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Martijn van Groningen
>Assignee: Martijn van Groningen
>Priority: Minor
> Fix For: 4.8
>
> Attachments: LUCENE-5515.patch
>
>
> If TopDocs#merge takes from and size into account it can be optimized to 
> create a hits ScoreDoc array equal to size instead of from+size what is now 
> the case.






[jira] [Comment Edited] (SOLR-5517) Return HTTP error on POST requests with no Content-Type

2014-03-10 Thread Paco Garcia (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925849#comment-13925849
 ] 

Paco Garcia edited comment on SOLR-5517 at 3/10/14 4:26 PM:


Quick hack:
go to:
 webapps\solr\js\scripts\dataimport.js 
and change POST by GET :

  $.ajax
  (
{
  url : handler_url + '?command=abort&wt=json',
  dataType : 'json',
  type: 'GET',   // changed from POST
  context: $( this ),
  beforeSend : function( xhr, settings )
  {
span_element
  .addClass( 'loader' );
  },
 Or put something inside the data, like:

$.ajax
  (
{
  url : handler_url + '?command=abort&wt=json',
  data : {
indent : 'true'
  },
  dataType : 'json',
  type: 'POST',
  context: $( this ),
  beforeSend : function( xhr, settings )


Regards


was (Author: pacoge36):
Quick hack:
go to:
 webapps\solr\js\scripts\dataimport.js 
and change POST by GET :

  $.ajax
  (
{
  url : handler_url + '?command=abort&wt=json',
  dataType : 'json',
  type: 'GET',   // changed from POST
  context: $( this ),
  beforeSend : function( xhr, settings )
  {
span_element
  .addClass( 'loader' );
  },
 
Regards

> Return HTTP error on POST requests with no Content-Type
> ---
>
> Key: SOLR-5517
> URL: https://issues.apache.org/jira/browse/SOLR-5517
> Project: Solr
>  Issue Type: Improvement
>Reporter: Ryan Ernst
>Assignee: Ryan Ernst
> Fix For: 4.7, 5.0
>
> Attachments: SOLR-5517.patch, SOLR-5517.patch, SOLR-5517.patch, 
> SOLR-5517.patch, SOLR-5517.patch
>
>
> While the http spec states requests without a content-type should be treated 
> as application/octet-stream, the html spec says instead that post requests 
> without a content-type should be treated as a form 
> (http://www.w3.org/MarkUp/html-spec/html-spec_8.html#SEC8.2.1).  It would be 
> nice to allow large search requests from html forms, and not have to rely on 
> the browser to set the content type (since the spec says it doesn't have to).






[jira] [Commented] (SOLR-5517) Return HTTP error on POST requests with no Content-Type

2014-03-10 Thread Paco Garcia (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925849#comment-13925849
 ] 

Paco Garcia commented on SOLR-5517:
---

Quick hack:
go to:
 webapps\solr\js\scripts\dataimport.js 
and change POST by GET :

  $.ajax
  (
{
  url : handler_url + '?command=abort&wt=json',
  dataType : 'json',
  type: 'GET',   // changed from POST
  context: $( this ),
  beforeSend : function( xhr, settings )
  {
span_element
  .addClass( 'loader' );
  },
 
Regards

> Return HTTP error on POST requests with no Content-Type
> ---
>
> Key: SOLR-5517
> URL: https://issues.apache.org/jira/browse/SOLR-5517
> Project: Solr
>  Issue Type: Improvement
>Reporter: Ryan Ernst
>Assignee: Ryan Ernst
> Fix For: 4.7, 5.0
>
> Attachments: SOLR-5517.patch, SOLR-5517.patch, SOLR-5517.patch, 
> SOLR-5517.patch, SOLR-5517.patch
>
>
> While the http spec states requests without a content-type should be treated 
> as application/octet-stream, the html spec says instead that post requests 
> without a content-type should be treated as a form 
> (http://www.w3.org/MarkUp/html-spec/html-spec_8.html#SEC8.2.1).  It would be 
> nice to allow large search requests from html forms, and not have to rely on 
> the browser to set the content type (since the spec says it doesn't have to).






[jira] [Reopened] (SOLR-5783) Can we stop opening a new searcher when the index hasn't changed?

2014-03-10 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man reopened SOLR-5783:



* I'm not really understanding Yonik's alternative suggestion, but I don't have 
the code in front of me -- if there is a better way to accomplish the same 
thing, then great.
* I'm also not really understanding what problems Yonik & Mark are saying exist 
/ may exist with what got committed as part of this issue -- but it should have 
just been an optimization; if it's causing problems we should definitely roll 
back.
* I'm not really in a position to commit anything over the next few days, and 
then I'm going to be completely offline for over a week -- so if one of you two 
([~yo...@apache.org], [~markrmil...@gmail.com]) who understand why it's 
problematic could please revert this ASAP, I'd really appreciate it.
* If you guys could attach patches with test cases (or pseudo-code 
descriptions showing how to create test cases) demonstrating the problems you 
see with the current code, that would be really helpful when I finally get a 
chance to revisit this in a few weeks.



> Can we stop opening a new searcher when the index hasn't changed?
> -
>
> Key: SOLR-5783
> URL: https://issues.apache.org/jira/browse/SOLR-5783
> Project: Solr
>  Issue Type: Improvement
>Reporter: Hoss Man
> Fix For: 4.8, 5.0
>
> Attachments: SOLR-5783.patch, SOLR-5783.patch, SOLR-5783.patch, 
> SOLR-5783.patch
>
>
> I've been thinking recently about how/when we re-open searchers -- and what 
> the overhead of that is in terms of caches and what not -- even if the 
> underlying index hasn't changed.  
> The particular real world case that got me thinking about this recently is 
> when a deleteByQuery gets forwarded to all shards in a collection, and then 
> the subsequent (soft)Commit (either auto or explicit) opens a new searcher -- 
> even if that shard was completely unaffected by the delete.
> It got me wondering: why don't we re-use the same searcher when the index is 
> unchanged?
> From what I can tell, we're basically 99% of the way there (in 
> {{}})...
> * IndexWriter.commit is already smart enough to short-circuit if there's 
> nothing to commit
> * SolrCore.openNewSearcher already uses DirectoryReader.openIfChanged to see 
> if the reader can be re-used.
> * for "realtime" purposes, SolrCore.openNewSearcher will return the existing 
> searcher if it exists and the DirectoryReader hasn't changed
> ...The only reason I could think of for not _always_ re-using the same 
> searcher when the underlying DirectoryReader is identical (ie: that last 
> bullet above) is in the situation where the "live" schema has changed -- but 
> that seems pretty trivial to account for.
> Is there any other reason why this wouldn't be a good idea for improving 
> performance?






[jira] [Updated] (SOLR-5846) EnumField docValues functionality

2014-03-10 Thread Elran Dvir (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elran Dvir updated SOLR-5846:
-

Attachment: SOLR-5846.patch

> EnumField docValues functionality
> 
>
> Key: SOLR-5846
> URL: https://issues.apache.org/jira/browse/SOLR-5846
> Project: Solr
>  Issue Type: Improvement
>Reporter: Elran Dvir
> Attachments: SOLR-5846.patch
>
>
> I have added docValues functionality to EnumField.
> Please review the patch attached.
> If there is any problem with it, please let me know.
>  






[jira] [Created] (SOLR-5846) EnumField docValues functionality

2014-03-10 Thread Elran Dvir (JIRA)
Elran Dvir created SOLR-5846:


 Summary: EnumField docValues functionality
 Key: SOLR-5846
 URL: https://issues.apache.org/jira/browse/SOLR-5846
 Project: Solr
  Issue Type: Improvement
Reporter: Elran Dvir


I have added docValues functionality to EnumField.
Please review the patch attached.
If there is any problem with it, please let me know.
 






[jira] [Updated] (SOLR-5653) Create a RESTManager to provide REST API endpoints for reconfigurable plugins

2014-03-10 Thread Steve Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe updated SOLR-5653:
-

Attachment: SOLR-5653.patch

Patch, building on Tim's patch. 

 Left to do:
* init args should be moved to {{NamedList}} (typed nested values) instead of 
the current String->String map, to support {{solrconfig.xml}} plugin init args
* javadocs should be added where there are none

This patch has some minor cleanups, as well as the following changes:

* Renamed {{SolrRestApi}} -> {{SolrSchemaRestApi}}
* Enabled short-form {{"solr.classname"}} class lookup for 
{{o.a.s.rest.schema.analysis}} (e.g. {{"solr.ManagedWordSetResource"}})
* Finished the {{BaseSchemaResource}} -> {{BaseSolrResource}} renaming by 
executing {{svn mv \[...\]/BaseSchemaResource \[...\]/BaseSolrResource}} (to 
retain svn history) and making all classes extending {{BaseSchemaResource}} 
extend {{BaseSolrResource}} instead
* Removed {{DefaultSchemaResource.java}}; unknown URI paths under {{/schema}} 
and {{/config}} are now handled by {{RestManager.ManagedEndpoint}}
* {{RestManager.Registry}} now protects against registration of resourceId-s 
that are already in use by the Schema REST API - protecting {{/config/managed}} 
and {{/schema/managed}} is now handled via this general mechanism
* {{TestRestManager}}:
** added tests that already-spoken-for REST API endpoints can't be registered
** added tests for switching {{ignoreCase}} of {{ManagedWordSetResource}}
** added XML response format test

* {{ManagedWordSetResource.updateInitArgs()}}:
** compare current/updated {{ignoreCase}} vals as booleans, instead of as 
string args
** throw an exception if current {{ignoreCase}} = true and updated 
{{ignoreCase}} = false, since this change is not permitted
* In {{RestManager.addManagedResource()}}, now {{assert}}'ing that the 
resourceId validation result from {{matches()}} is true, rather than throwing 
away the result; {{registry.registerManagedResource()}}, called earlier in 
{{addManagedResource()}}, already ensures that the regex matches against the 
resourceId.


> Create a RESTManager to provide REST API endpoints for reconfigurable plugins
> -
>
> Key: SOLR-5653
> URL: https://issues.apache.org/jira/browse/SOLR-5653
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Steve Rowe
> Attachments: SOLR-5653.patch, SOLR-5653.patch, SOLR-5653.patch, 
> SOLR-5653.patch
>
>
> It should be possible to reconfigure Solr plugins' resources and init params 
> without directly editing the serialized schema or {{solrconfig.xml}} (see 
> Hoss's arguments about this in the context of the schema, which also apply to 
> {{solrconfig.xml}}, in the description of SOLR-4658)
> The RESTManager should allow plugins declared in either the schema or in 
> {{solrconfig.xml}} to register one or more REST endpoints, one endpoint per 
> reconfigurable resource, including init params.  To allow for multiple plugin 
> instances, registering plugins will need to provide a handle of some form to 
> distinguish the instances.
> This RESTManager should also be able to create new instances of plugins that 
> it has been configured to allow.  The RESTManager will need its own 
> serialized configuration to remember these plugin declarations.
> Example endpoints:
> * SynonymFilterFactory
> ** init params: {{/solr/collection1/config/syns/myinstance/options}}
> ** synonyms resource: 
> {{/solr/collection1/config/syns/myinstance/synonyms-list}}
> * "/select" request handler
> ** init params: {{/solr/collection1/config/requestHandlers/select/options}}
> We should aim for full CRUD over init params and structured resources.  The 
> plugins will bear responsibility for handling resource modification requests, 
> though we should provide utility methods to make this easy.
> However, since we won't be directly modifying the serialized schema and 
> {{solrconfig.xml}}, anything configured in those two places can't be 
> invalidated by configuration serialized elsewhere.  As a result, it won't be 
> possible to remove plugins declared in the serialized schema or 
> {{solrconfig.xml}}.  Similarly, any init params declared in either place 
> won't be modifiable.  Instead, there should be some form of init param that 
> declares that the plugin is reconfigurable, maybe using something like 
> "managed" - note that request handlers already provide a "handle" - the 
> request handler name - and so don't need that to be separately specified:
> {code:xml}
> 
>
> 
> {code}
> and in the serialized schema - a handle needs to be specified here:
> {code:xml}
>  positionIncrementGap="100">
> ...
>   
> 
> 
> ...
> {code}
> All of the above examples use the existing plugin factory class names, but 
> we'll have to create new RESTManager-aware classes to ha

[jira] [Commented] (SOLR-2894) Implement distributed pivot faceting

2014-03-10 Thread Elran Dvir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925787#comment-13925787
 ] 

Elran Dvir commented on SOLR-2894:
--

Hi,

I don't know exactly where I should put the test in DistributedFacetPivotTest, 
but this is the test:
1) Index more than 100 docs (you can index docs with only an id).
2) Run the following query:
this.query("q", "*:*",
           "rows", "0",
           "facet", "true",
           "facet.pivot", "id",
           "f.id.facet.limit", "-1");

You expect to get as many ids as you indexed, but you will get only 100.

Thanks.
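As a rough sketch (a plain Java map, not the actual SolrJ API), the parameters from this reproduction look like the following; the per-field `f.<field>.facet.limit` form is the one reported to be ignored in distributed pivot requests, while the global `facet.limit=-1` works:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PivotFacetParams {
    // Build the request parameters from the reproduction above as a plain
    // map (illustration only; this is not the SolrJ API).
    public static Map<String, String> params(String field) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("q", "*:*");
        p.put("rows", "0");
        p.put("facet", "true");
        p.put("facet.pivot", field);
        // Per-field override syntax; the globally scoped alternative would
        // be p.put("facet.limit", "-1") instead.
        p.put("f." + field + ".facet.limit", "-1");
        return p;
    }

    public static void main(String[] args) {
        Map<String, String> p = params("id");
        if (!"-1".equals(p.get("f.id.facet.limit"))) throw new AssertionError();
        System.out.println(p);
    }
}
```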

> Implement distributed pivot faceting
> 
>
> Key: SOLR-2894
> URL: https://issues.apache.org/jira/browse/SOLR-2894
> Project: Solr
>  Issue Type: Improvement
>Reporter: Erik Hatcher
> Fix For: 4.7
>
> Attachments: SOLR-2894-reworked.patch, SOLR-2894.patch, 
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
> dateToObject.patch
>
>
> Following up on SOLR-792, pivot faceting currently only supports 
> undistributed mode.  Distributed pivot faceting needs to be implemented.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2894) Implement distributed pivot faceting

2014-03-10 Thread Brett Lucey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925768#comment-13925768
 ] 

Brett Lucey commented on SOLR-2894:
---

Elran - Can you give me an example test case or query for which the -1 facet 
limit fails?  I'll be glad to take a look and fix it if I can reproduce an 
issue with it.

> Implement distributed pivot faceting
> 
>
> Key: SOLR-2894
> URL: https://issues.apache.org/jira/browse/SOLR-2894
> Project: Solr
>  Issue Type: Improvement
>Reporter: Erik Hatcher
> Fix For: 4.7
>
> Attachments: SOLR-2894-reworked.patch, SOLR-2894.patch, 
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
> dateToObject.patch
>
>
> Following up on SOLR-792, pivot faceting currently only supports 
> undistributed mode.  Distributed pivot faceting needs to be implemented.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [VOTE] Move to Java 7 in Lucene/Solr 4.8, use Java 8 in trunk (once officially released)

2014-03-10 Thread Mark Miller
[.] Move Lucene/Solr 4.8 (means branch_4x) to Java 7 and backport all Java
7-related issues (FileChannel improvements, diamond operator,...).

+1

[.] Move Lucene/Solr trunk to Java 8 and allow closures in source code.
This would make some APIs much nicer. Our infrastructure mostly supports
this, only ECJ Javadoc linting is not yet possible, but forbidden-apis
supports Java 8 with all its crazy new stuff.

-1 this soon.

Mark


-- 
- Mark

http://about.me/markrmiller


RE: [VOTE] Move to Java 7 in Lucene/Solr 4.8, use Java 8 in trunk (once officially released)

2014-03-10 Thread Uwe Schindler
Hi Yonik,
 
> Excuse me?  I assume that's aimed at me?

No, that was not aimed at you. It was a general comment. There are many 
examples of companies/users that fork trunk. Lucid Imagination was one of them 
(the product LucidWorks was initially based on trunk, as far as I remember). 
Also P.S. has a forked Solr distribution. And I think HelioSearch, too.

So it was not against you. It was just a comment that those people should not 
fork trunk and release it, but instead fork branch_4x and do their releases 
from there. Trunk is a playground.

> You're insinuating I voted against moving to Java8 right now because of my
> corporate interests?

The whole discussion with Java 8 was clearly about corporate interests; every 
reply had some wording around "company" in it. Trunk is a playground and we are 
free to use Java 8 there, without taking care of users. Trunk is for 
developers, not the users. If we have a feature for release, we may backport to 
4.x (at the moment). Unfortunately, we quite often do the backport too early, 
releasing unbaked APIs.

> I'd appreciate it if you didn't try to disparage my character every time we
> happen to disagree.

Sorry, wasn't my intention. I like you very much as a character, we met several 
times and had great discussions. I have no personal problem with you.

Uwe



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5512) Remove redundant typing (diamond operator) in trunk

2014-03-10 Thread Furkan KAMACI (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Furkan KAMACI updated LUCENE-5512:
--

Attachment: LUCENE-5512.patch

The Lucene part is OK. I will apply the same procedure to the Solr module too.

> Remove redundant typing (diamond operator) in trunk
> ---
>
> Key: LUCENE-5512
> URL: https://issues.apache.org/jira/browse/LUCENE-5512
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Robert Muir
> Attachments: LUCENE-5512.patch, LUCENE-5512.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Assigned] (SOLR-5844) Backward Compatibility Has Broken For deleteById() at Solrj

2014-03-10 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul reassigned SOLR-5844:


Assignee: Noble Paul

> Backward Compatibility Has Broken For deleteById() at Solrj
> ---
>
> Key: SOLR-5844
> URL: https://issues.apache.org/jira/browse/SOLR-5844
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.6, 4.6.1, 4.7
>Reporter: Furkan KAMACI
>Assignee: Noble Paul
> Fix For: 4.8
>
>
> I have started up a SolrCloud of 4.5.1 
> * When I use deleteById method of CloudSolrServer via 4.5.1 Solrj it works.
> * When I use deleteById method of CloudSolrServer via 4.6.0 Solrj it does not 
> work and does not throw error.
> * When I use deleteById method of CloudSolrServer via 4.6.1 Solrj it does not 
> work and does not throw error.
> * When I use deleteById method of CloudSolrServer via 4.7.0 Solrj it does not 
> work and does not throw error.
> So it seems that backward compatibility has been broken since 4.6.0 



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Stalled unit tests

2014-03-10 Thread Terry Smith
Shalin: That makes sense. Both the machines I used for testing have SSDs.



On Mon, Mar 10, 2014 at 9:35 AM, Shalin Shekhar Mangar <
shalinman...@gmail.com> wrote:

> In my experience, the test suite is much faster on an SSD. Around 18
> minutes on my mac book pro and 12 minutes on my PC for just the Solr
> tests with -Dtests.slow=true (both have SSDs)
>
> On Mon, Mar 10, 2014 at 7:02 PM, Terry Smith  wrote:
> > Oops, the second set of timings on the Mid 2012 MacBook Pro were for JUST
> > the solr tests.
> >
> >
> >
> > On Mon, Mar 10, 2014 at 9:31 AM, Terry Smith  wrote:
> >>
> >> Dawid: Boy, those are some large timeouts!
> >>
> >> Mike: The build.properties suggestion resolved my issue. I can now run
> the
> >> test to completion.
> >>
> >> On a Mid 2009 MacBook Pro running Mavericks and using Java 6 executing
> ant
> >> from the top level of the lucene-solr project I get the following
> timings:
> >>
> >> ant clean compile -- 3 minutes
> >> ant clean test (tests.disableHdfs=true, tests.slow=false) -- 55 minutes
> >> ant clean test (tests.disableHdfs=true) -- 88 minutes
> >>
> >> On a Mid 2012 MacBook Pro with the same software stack:
> >>
> >> ant clean compile -- 1 minute
> >> ant clean test (tests.disableHdfs=true, tests.slow=false) -- 8 minutes
> >>
> >> All running from the same git commit mentioned at the top of this
> thread.
> >>
> >> The tests make great use of multiple CPU/cores so a faster machine
> makes a
> >> huge difference to the total runtime.
> >>
> >> Do the HDFS tests fail due to test bugs or implementation issues?
> >>
> >> How do you feel about changing the default value of tests.disableHdfs to
> >> true versus updating the wiki documentation to let new contributors
> know
> >> how to work around this?
> >>
> >> --Terry
> >>
> >>
> >>
> >>
> >> On Fri, Mar 7, 2014 at 12:46 PM, Michael McCandless
> >>  wrote:
> >>>
> >>> I just ran "ant test" under Solr; it took 4 minutes 25 seconds.
> >>>
> >>> But, in my ~/build.properties I have:
> >>>
> >>> tests.disableHdfs=true
> >>> tests.slow=false
> >>>
> >>> Which makes things substantially faster, and also [seems to] sidestep
> >>> the Solr tests that falsely fail.
> >>>
> >>> Mike McCandless
> >>>
> >>> http://blog.mikemccandless.com
> >>>
> >>>
> >>> On Fri, Mar 7, 2014 at 9:04 AM, Terry Smith  wrote:
> >>> > Mike,
> >>> >
> >>> > Fair enough. I'll let them run for more than 30 minutes and see what
> >>> > happens.
> >>> >
> >>> > How long does it take on your machine? I'm happy to signup for the
> wiki
> >>> > and
> >>> > add some extra information to
> >>> > http://wiki.apache.org/lucene-java/HowToContribute for folks
> wanting to
> >>> > tinker with Lucene.
> >>> >
> >>> > Do the Lucene developers typically run a subset of the test suite to
> >>> > make
> >>> > committing cheaper?
> >>> >
> >>> > Thanks,
> >>> >
> >>> > --Terry
> >>> >
> >>> >
> >>> >
> >>> > On Fri, Mar 7, 2014 at 5:52 AM, Michael McCandless
> >>> >  wrote:
> >>> >>
> >>> >> Unfortunately, some tests take a very long time, and the test infra
> >>> >> will print these HEARTBEAT messages notifying you that they are
> still
> >>> >> running.  They should eventually finish?
> >>> >>
> >>> >> Mike McCandless
> >>> >>
> >>> >> http://blog.mikemccandless.com
> >>> >>
> >>> >>
> >>> >> On Thu, Mar 6, 2014 at 5:09 PM, Terry Smith 
> wrote:
> >>> >> > I'm sure that I'm just missing something obvious but I'm having
> >>> >> > trouble
> >>> >> > getting the unit tests to run to completion on my laptop and was
> >>> >> > hoping
> >>> >> > that
> >>> >> > someone would be kind enough to point me in the right direction.
> >>> >> >
> >>> >> > I've cloned the repository from GitHub
> >>> >> > (http://git.apache.org/lucene-solr.git) and checked out the
> latest
> >>> >> > commit on
> >>> >> > branch_4x.
> >>> >> >
> >>> >> > commit 6e06247cec1410f32592bfd307c1020b814def06
> >>> >> >
> >>> >> > Author: Robert Muir 
> >>> >> >
> >>> >> > Date:   Thu Mar 6 19:54:07 2014 +
> >>> >> >
> >>> >> >
> >>> >> > disable slow solr tests in smoketester
> >>> >> >
> >>> >> >
> >>> >> >
> >>> >> > git-svn-id:
> >>> >> >
> >>> >> >
> https://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x@1575025
> >>> >> > 13f79535-47bb-0310-9956-ffa450edef68
> >>> >> >
> >>> >> >
> >>> >> > Executing "ant clean test" from the top level directory of the
> >>> >> > project
> >>> >> > shows
> >>> >> > the tests running but they seem to get stuck in a loop with some
> >>> >> > stalled
> >>> >> > heartbeat messages. If I run the tests directly from lucene/ then
> >>> >> > they
> >>> >> > complete successfully after about 10 minutes.
> >>> >> >
> >>> >> > I'm using Java 6 under OS X (10.9.2).
> >>> >> >
> >>> >> > $ java -version
> >>> >> >
> >>> >> > java version "1.6.0_65"
> >>> >> >
> >>> >> > Java(TM) SE Runtime Environment (build 1.6.0_65-b14-462-11M4609)
> >>> >> >
> >>> >> > Java HotSpot(TM) 64-Bit Server VM (build 20.65-b04-462, mixed
> mode)
> >>> >> >
> >>> >> >
> >>> >> > My terminal

Re: Stalled unit tests

2014-03-10 Thread Shalin Shekhar Mangar
In my experience, the test suite is much faster on an SSD. Around 18
minutes on my mac book pro and 12 minutes on my PC for just the Solr
tests with -Dtests.slow=true (both have SSDs)

On Mon, Mar 10, 2014 at 7:02 PM, Terry Smith  wrote:
> Oops, the second set of timings on the Mid 2012 MacBook Pro were for JUST
> the solr tests.
>
>
>
> On Mon, Mar 10, 2014 at 9:31 AM, Terry Smith  wrote:
>>
>> Dawid: Boy, those are some large timeouts!
>>
>> Mike: The build.properties suggestion resolved my issue. I can now run the
>> test to completion.
>>
>> On a Mid 2009 MacBook Pro running Mavericks and using Java 6 executing ant
>> from the top level of the lucene-solr project I get the following timings:
>>
>> ant clean compile -- 3 minutes
>> ant clean test (tests.disableHdfs=true, tests.slow=false) -- 55 minutes
>> ant clean test (tests.disableHdfs=true) -- 88 minutes
>>
>> On a Mid 2012 MacBook Pro with the same software stack:
>>
>> ant clean compile -- 1 minute
>> ant clean test (tests.disableHdfs=true, tests.slow=false) -- 8 minutes
>>
>> All running from the same git commit mentioned at the top of this thread.
>>
>> The tests make great use of multiple CPU/cores so a faster machine makes a
>> huge difference to the total runtime.
>>
>> Do the HDFS tests fail due to test bugs or implementation issues?
>>
>> How do you feel about changing the default value of tests.disableHdfs to
> true versus updating the wiki documentation to let new contributors know
>> how to work around this?
>>
>> --Terry
>>
>>
>>
>>
>> On Fri, Mar 7, 2014 at 12:46 PM, Michael McCandless
>>  wrote:
>>>
>>> I just ran "ant test" under Solr; it took 4 minutes 25 seconds.
>>>
>>> But, in my ~/build.properties I have:
>>>
>>> tests.disableHdfs=true
>>> tests.slow=false
>>>
>>> Which makes things substantially faster, and also [seems to] sidestep
>>> the Solr tests that falsely fail.
>>>
>>> Mike McCandless
>>>
>>> http://blog.mikemccandless.com
>>>
>>>
>>> On Fri, Mar 7, 2014 at 9:04 AM, Terry Smith  wrote:
>>> > Mike,
>>> >
>>> > Fair enough. I'll let them run for more than 30 minutes and see what
>>> > happens.
>>> >
>>> > How long does it take on your machine? I'm happy to signup for the wiki
>>> > and
>>> > add some extra information to
>>> > http://wiki.apache.org/lucene-java/HowToContribute for folks wanting to
>>> > tinker with Lucene.
>>> >
>>> > Do the Lucene developers typically run a subset of the test suite to
>>> > make
>>> > committing cheaper?
>>> >
>>> > Thanks,
>>> >
>>> > --Terry
>>> >
>>> >
>>> >
>>> > On Fri, Mar 7, 2014 at 5:52 AM, Michael McCandless
>>> >  wrote:
>>> >>
>>> >> Unfortunately, some tests take a very long time, and the test infra
>>> >> will print these HEARTBEAT messages notifying you that they are still
>>> >> running.  They should eventually finish?
>>> >>
>>> >> Mike McCandless
>>> >>
>>> >> http://blog.mikemccandless.com
>>> >>
>>> >>
>>> >> On Thu, Mar 6, 2014 at 5:09 PM, Terry Smith  wrote:
>>> >> > I'm sure that I'm just missing something obvious but I'm having
>>> >> > trouble
>>> >> > getting the unit tests to run to completion on my laptop and was
>>> >> > hoping
>>> >> > that
>>> >> > someone would be kind enough to point me in the right direction.
>>> >> >
>>> >> > I've cloned the repository from GitHub
>>> >> > (http://git.apache.org/lucene-solr.git) and checked out the latest
>>> >> > commit on
>>> >> > branch_4x.
>>> >> >
>>> >> > commit 6e06247cec1410f32592bfd307c1020b814def06
>>> >> >
>>> >> > Author: Robert Muir 
>>> >> >
>>> >> > Date:   Thu Mar 6 19:54:07 2014 +
>>> >> >
>>> >> >
>>> >> > disable slow solr tests in smoketester
>>> >> >
>>> >> >
>>> >> >
>>> >> > git-svn-id:
>>> >> >
>>> >> > https://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x@1575025
>>> >> > 13f79535-47bb-0310-9956-ffa450edef68
>>> >> >
>>> >> >
>>> >> > Executing "ant clean test" from the top level directory of the
>>> >> > project
>>> >> > shows
>>> >> > the tests running but they seem to get stuck in a loop with some
>>> >> > stalled
>>> >> > heartbeat messages. If I run the tests directly from lucene/ then
>>> >> > they
>>> >> > complete successfully after about 10 minutes.
>>> >> >
>>> >> > I'm using Java 6 under OS X (10.9.2).
>>> >> >
>>> >> > $ java -version
>>> >> >
>>> >> > java version "1.6.0_65"
>>> >> >
>>> >> > Java(TM) SE Runtime Environment (build 1.6.0_65-b14-462-11M4609)
>>> >> >
>>> >> > Java HotSpot(TM) 64-Bit Server VM (build 20.65-b04-462, mixed mode)
>>> >> >
>>> >> >
>>> >> > My terminal lists repeating stalled heartbeat messages like so:
>>> >> >
>>> >> > HEARTBEAT J2 PID(20104@onyx.local): 2014-03-06T16:53:35, stalled for
>>> >> > 2111s
>>> >> > at: HdfsLockFactoryTest.testBasic
>>> >> >
>>> >> > HEARTBEAT J0 PID(20106@onyx.local): 2014-03-06T16:53:47, stalled for
>>> >> > 2108s
>>> >> > at: TestSurroundQueryParser.testQueryParser
>>> >> >
>>> >> > HEARTBEAT J1 PID(20103@onyx.local): 2014-03-06T16:54:11, stalled for
>>> >> > 2167s
>>> >> > at: TestRecoveryHdfs.te

Re: Stalled unit tests

2014-03-10 Thread Terry Smith
Oops, the second set of timings on the Mid 2012 MacBook Pro were for JUST
the solr tests.



On Mon, Mar 10, 2014 at 9:31 AM, Terry Smith  wrote:

> Dawid: Boy, those are some large timeouts!
>
> Mike: The build.properties suggestion resolved my issue. I can now run the
> test to completion.
>
> On a Mid 2009 MacBook Pro running Mavericks and using Java 6 executing ant
> from the top level of the lucene-solr project I get the following timings:
>
> ant clean compile -- 3 minutes
> ant clean test (tests.disableHdfs=true, tests.slow=false) -- 55 minutes
> ant clean test (tests.disableHdfs=true) -- 88 minutes
>
> On a Mid 2012 MacBook Pro with the same software stack:
>
> ant clean compile -- 1 minute
> ant clean test (tests.disableHdfs=true, tests.slow=false) -- 8 minutes
>
> All running from the same git commit mentioned at the top of this thread.
>
> The tests make great use of multiple CPU/cores so a faster machine makes a
> huge difference to the total runtime.
>
> Do the HDFS tests fail due to test bugs or implementation issues?
>
> How do you feel about changing the default value of tests.disableHdfs to
> true versus updating the wiki documentation to let new contributors know
> how to work around this?
>
> --Terry
>
>
>
>
> On Fri, Mar 7, 2014 at 12:46 PM, Michael McCandless <
> luc...@mikemccandless.com> wrote:
>
>> I just ran "ant test" under Solr; it took 4 minutes 25 seconds.
>>
>> But, in my ~/build.properties I have:
>>
>> tests.disableHdfs=true
>> tests.slow=false
>>
>> Which makes things substantially faster, and also [seems to] sidestep
>> the Solr tests that falsely fail.
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>>
>> On Fri, Mar 7, 2014 at 9:04 AM, Terry Smith  wrote:
>> > Mike,
>> >
>> > Fair enough. I'll let them run for more than 30 minutes and see what
>> > happens.
>> >
>> > How long does it take on your machine? I'm happy to signup for the wiki
>> and
>> > add some extra information to
>> > http://wiki.apache.org/lucene-java/HowToContribute for folks wanting to
>> > tinker with Lucene.
>> >
>> > Do the Lucene developers typically run a subset of the test suite to
>> make
>> > committing cheaper?
>> >
>> > Thanks,
>> >
>> > --Terry
>> >
>> >
>> >
>> > On Fri, Mar 7, 2014 at 5:52 AM, Michael McCandless
>> >  wrote:
>> >>
>> >> Unfortunately, some tests take a very long time, and the test infra
>> >> will print these HEARTBEAT messages notifying you that they are still
>> >> running.  They should eventually finish?
>> >>
>> >> Mike McCandless
>> >>
>> >> http://blog.mikemccandless.com
>> >>
>> >>
>> >> On Thu, Mar 6, 2014 at 5:09 PM, Terry Smith  wrote:
>> >> > I'm sure that I'm just missing something obvious but I'm having
>> trouble
>> >> > getting the unit tests to run to completion on my laptop and was
>> hoping
>> >> > that
>> >> > someone would be kind enough to point me in the right direction.
>> >> >
>> >> > I've cloned the repository from GitHub
>> >> > (http://git.apache.org/lucene-solr.git) and checked out the latest
>> >> > commit on
>> >> > branch_4x.
>> >> >
>> >> > commit 6e06247cec1410f32592bfd307c1020b814def06
>> >> >
>> >> > Author: Robert Muir 
>> >> >
>> >> > Date:   Thu Mar 6 19:54:07 2014 +
>> >> >
>> >> >
>> >> > disable slow solr tests in smoketester
>> >> >
>> >> >
>> >> >
>> >> > git-svn-id:
>> >> >
>> https://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x@1575025
>> >> > 13f79535-47bb-0310-9956-ffa450edef68
>> >> >
>> >> >
>> >> > Executing "ant clean test" from the top level directory of the
>> project
>> >> > shows
>> >> > the tests running but they seem to get stuck in a loop with some
>> stalled
>> >> > heartbeat messages. If I run the tests directly from lucene/ then
>> they
>> >> > complete successfully after about 10 minutes.
>> >> >
>> >> > I'm using Java 6 under OS X (10.9.2).
>> >> >
>> >> > $ java -version
>> >> >
>> >> > java version "1.6.0_65"
>> >> >
>> >> > Java(TM) SE Runtime Environment (build 1.6.0_65-b14-462-11M4609)
>> >> >
>> >> > Java HotSpot(TM) 64-Bit Server VM (build 20.65-b04-462, mixed mode)
>> >> >
>> >> >
>> >> > My terminal lists repeating stalled heartbeat messages like so:
>> >> >
>> >> > HEARTBEAT J2 PID(20104@onyx.local): 2014-03-06T16:53:35, stalled for
>> >> > 2111s
>> >> > at: HdfsLockFactoryTest.testBasic
>> >> >
>> >> > HEARTBEAT J0 PID(20106@onyx.local): 2014-03-06T16:53:47, stalled for
>> >> > 2108s
>> >> > at: TestSurroundQueryParser.testQueryParser
>> >> >
>> >> > HEARTBEAT J1 PID(20103@onyx.local): 2014-03-06T16:54:11, stalled for
>> >> > 2167s
>> >> > at: TestRecoveryHdfs.testBuffering
>> >> >
>> >> > HEARTBEAT J3 PID(20105@onyx.local): 2014-03-06T16:54:23, stalled for
>> >> > 2165s
>> >> > at: HdfsDirectoryTest.testEOF
>> >> >
>> >> >
>> >> > My machine does have 3 java processes chewing CPU, see attached
>> jstack
>> >> > dumps
>> >> > for more information.
>> >> >
>> >> > Should I expect the tests to complete on my platform? Do I need to
>> >> > specify
>> >> > any sp

Re: Stalled unit tests

2014-03-10 Thread Terry Smith
Dawid: Boy, those are some large timeouts!

Mike: The build.properties suggestion resolved my issue. I can now run the
test to completion.

On a Mid 2009 MacBook Pro running Mavericks and using Java 6 executing ant
from the top level of the lucene-solr project I get the following timings:

ant clean compile -- 3 minutes
ant clean test (tests.disableHdfs=true, tests.slow=false) -- 55 minutes
ant clean test (tests.disableHdfs=true) -- 88 minutes

On a Mid 2012 MacBook Pro with the same software stack:

ant clean compile -- 1 minute
ant clean test (tests.disableHdfs=true, tests.slow=false) -- 8 minutes

All running from the same git commit mentioned at the top of this thread.

The tests make great use of multiple CPU/cores so a faster machine makes a
huge difference to the total runtime.

Do the HDFS tests fail due to test bugs or implementation issues?

How do you feel about changing the default value of tests.disableHdfs to
true versus updating the wiki documentation to let new contributors know
how to work around this?
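As an aside, the two flags quoted from Mike's ~/build.properties can be checked with java.util.Properties, which reads the same key=value format Ant loads (illustration only; the property names come from the messages above):

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class BuildProps {
    public static void main(String[] args) throws IOException {
        // The two flags from ~/build.properties discussed in this thread,
        // parsed with java.util.Properties (the format Ant's loader reads).
        Properties p = new Properties();
        p.load(new StringReader("tests.disableHdfs=true\ntests.slow=false\n"));
        if (!"true".equals(p.getProperty("tests.disableHdfs"))) throw new AssertionError();
        if (!"false".equals(p.getProperty("tests.slow"))) throw new AssertionError();
        System.out.println(p.stringPropertyNames());
    }
}
```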

--Terry




On Fri, Mar 7, 2014 at 12:46 PM, Michael McCandless <
luc...@mikemccandless.com> wrote:

> I just ran "ant test" under Solr; it took 4 minutes 25 seconds.
>
> But, in my ~/build.properties I have:
>
> tests.disableHdfs=true
> tests.slow=false
>
> Which makes things substantially faster, and also [seems to] sidestep
> the Solr tests that falsely fail.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Fri, Mar 7, 2014 at 9:04 AM, Terry Smith  wrote:
> > Mike,
> >
> > Fair enough. I'll let them run for more than 30 minutes and see what
> > happens.
> >
> > How long does it take on your machine? I'm happy to signup for the wiki
> and
> > add some extra information to
> > http://wiki.apache.org/lucene-java/HowToContribute for folks wanting to
> > tinker with Lucene.
> >
> > Do the Lucene developers typically run a subset of the test suite to make
> > committing cheaper?
> >
> > Thanks,
> >
> > --Terry
> >
> >
> >
> > On Fri, Mar 7, 2014 at 5:52 AM, Michael McCandless
> >  wrote:
> >>
> >> Unfortunately, some tests take a very long time, and the test infra
> >> will print these HEARTBEAT messages notifying you that they are still
> >> running.  They should eventually finish?
> >>
> >> Mike McCandless
> >>
> >> http://blog.mikemccandless.com
> >>
> >>
> >> On Thu, Mar 6, 2014 at 5:09 PM, Terry Smith  wrote:
> >> > I'm sure that I'm just missing something obvious but I'm having
> trouble
> >> > getting the unit tests to run to completion on my laptop and was
> hoping
> >> > that
> >> > someone would be kind enough to point me in the right direction.
> >> >
> >> > I've cloned the repository from GitHub
> >> > (http://git.apache.org/lucene-solr.git) and checked out the latest
> >> > commit on
> >> > branch_4x.
> >> >
> >> > commit 6e06247cec1410f32592bfd307c1020b814def06
> >> >
> >> > Author: Robert Muir 
> >> >
> >> > Date:   Thu Mar 6 19:54:07 2014 +
> >> >
> >> >
> >> > disable slow solr tests in smoketester
> >> >
> >> >
> >> >
> >> > git-svn-id:
> >> >
> https://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x@1575025
> >> > 13f79535-47bb-0310-9956-ffa450edef68
> >> >
> >> >
> >> > Executing "ant clean test" from the top level directory of the project
> >> > shows
> >> > the tests running but they seem to get stuck in a loop with some
> stalled
> >> > heartbeat messages. If I run the tests directly from lucene/ then they
> >> > complete successfully after about 10 minutes.
> >> >
> >> > I'm using Java 6 under OS X (10.9.2).
> >> >
> >> > $ java -version
> >> >
> >> > java version "1.6.0_65"
> >> >
> >> > Java(TM) SE Runtime Environment (build 1.6.0_65-b14-462-11M4609)
> >> >
> >> > Java HotSpot(TM) 64-Bit Server VM (build 20.65-b04-462, mixed mode)
> >> >
> >> >
> >> > My terminal lists repeating stalled heartbeat messages like so:
> >> >
> >> > HEARTBEAT J2 PID(20104@onyx.local): 2014-03-06T16:53:35, stalled for
> >> > 2111s
> >> > at: HdfsLockFactoryTest.testBasic
> >> >
> >> > HEARTBEAT J0 PID(20106@onyx.local): 2014-03-06T16:53:47, stalled for
> >> > 2108s
> >> > at: TestSurroundQueryParser.testQueryParser
> >> >
> >> > HEARTBEAT J1 PID(20103@onyx.local): 2014-03-06T16:54:11, stalled for
> >> > 2167s
> >> > at: TestRecoveryHdfs.testBuffering
> >> >
> >> > HEARTBEAT J3 PID(20105@onyx.local): 2014-03-06T16:54:23, stalled for
> >> > 2165s
> >> > at: HdfsDirectoryTest.testEOF
> >> >
> >> >
> >> > My machine does have 3 java processes chewing CPU, see attached jstack
> >> > dumps
> >> > for more information.
> >> >
> >> > Should I expect the tests to complete on my platform? Do I need to
> >> > specify
> >> > any special flags to give them more memory or to avoid any bad apples?
> >> >
> >> > Thanks in advance,
> >> >
> >> > --Terry
> >> >
> >> >
> >> >
> >> >
> >> > -
> >> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> >> > For additional commands, e-mail: dev-h...

[jira] [Commented] (LUCENE-5512) Remove redundant typing (diamond operator) in trunk

2014-03-10 Thread Furkan KAMACI (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925730#comment-13925730
 ] 

Furkan KAMACI commented on LUCENE-5512:
---

I'm running the Lucene tests one last time. If all tests pass I will add the 
patch. When I finish the Solr part I will start on try-with-resources.
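For reference, a minimal sketch of the two Java 7 constructs involved here, the diamond operator and try-with-resources:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class Java7Syntax {
    public static void main(String[] args) throws IOException {
        // Diamond operator: the element type is inferred from the left side,
        // replacing the redundant "new ArrayList<String>()".
        List<String> lines = new ArrayList<>();

        // try-with-resources: the reader is closed automatically when the
        // block exits, even on an exception.
        try (BufferedReader r = new BufferedReader(new StringReader("a\nb"))) {
            String line;
            while ((line = r.readLine()) != null) {
                lines.add(line);
            }
        }
        if (lines.size() != 2) throw new AssertionError(lines);
        System.out.println(lines);
    }
}
```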

> Remove redundant typing (diamond operator) in trunk
> ---
>
> Key: LUCENE-5512
> URL: https://issues.apache.org/jira/browse/LUCENE-5512
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Robert Muir
> Attachments: LUCENE-5512.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [VOTE] Move to Java 7 in Lucene/Solr 4.8, use Java 8 in trunk (once officially released)

2014-03-10 Thread Simon Willnauer
I agree with you on the fact that we should always try to innovate.
Moving to Java8 is innovation and we should do it rather sooner than
later. I just don't think we should move there before it's even
released. I can totally see this vote coming in again in a couple of months
once we have fixed the bugs that come with such a huge thing and then
move. It might also help us in the future to think more about what we
should make trunk only etc.
I can totally see that a large user base might have problems with
using 1.8 in production but that is still a year out or so anyways
(Lucene 5.0 I mean). It's controversial, but we should rethink more
often than we do now, and moving forward on 4.x is good I think!
progress over perfection :)

simon

On Mon, Mar 10, 2014 at 12:57 PM, Robert Muir  wrote:
> It's too sad this decision isn't about what is best for attracting new
> developers, but instead corrupted by corporate policies around JVM
> versions and the like.
>
> what a shame, open source isn't supposed to be like that.
>
> On Mon, Mar 10, 2014 at 5:46 AM, Uwe Schindler  wrote:
>> Hi,
>>
>> it looks like we all agree on the same:
>>
>> +1 for Lucene 4.x requirement on Java 7.
>> -1 to not change trunk (keep it on Java 7,too).
>>
>> I will keep this vote open until this evening, but I don't expect any other 
>> change. Indeed, there are no real technical reasons to not move.
>>
>> I was expecting that the majority would -1 on trunk with Java 8. Simon 
>> said, that we may provide closures in the API in the future, but for our 
>> public API that’s still not a must to actually be on Java 8: If we define 
>> our interfaces nicely (using 1-method functional *interface*, no abstract 
>> classes, only interfaces!), everybody on Java 8 can use closures although 
>> Lucene is on Java 7. Maybe in the future we can have a TokenStream variant 
>> with push-semantics using closures!
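Uwe's point about 1-method interfaces can be sketched like this: a callback interface compiled for Java 7 has a single abstract method, so a Java 8 caller can pass a lambda for it even though the library itself stays on Java 7. All names below are invented for illustration, not Lucene API:

```java
public class PushTokens {
    // Java-7-compatible "functional interface": a single abstract method.
    interface TokenSink {
        void accept(String token);
    }

    // Library-side code, compiled for Java 7: pushes tokens into the sink.
    static void tokenize(String text, TokenSink sink) {
        for (String t : text.split("\\s+")) {
            sink.accept(t);
        }
    }

    public static void main(String[] args) {
        final int[] count = {0};
        // A Java 8 caller could simply write:
        //   tokenize("hello world again", t -> count[0]++);
        // On Java 7 the equivalent is an anonymous class:
        tokenize("hello world again", new TokenSink() {
            @Override public void accept(String token) { count[0]++; }
        });
        if (count[0] != 3) throw new AssertionError("expected 3 tokens, got " + count[0]);
        System.out.println("tokens: " + count[0]);
    }
}
```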
>>
>> I opened https://issues.apache.org/jira/browse/LUCENE-5514 to manage the 
>> backport. The initial patch covering many commits is already ready to 
>> commit. I just have to take the time until this vote finishes, to check that 
>> all stuff like smoke tester, javadocs linting,... work as expected.
>>
>> Theoretically, we might also only change Lucene 4.x's build to Java 7 
>> without any code change, but we should also provide some real reason for the 
>> move! Otherwise people will start to complain and "patch" Lucene 4.8 to 
>> still support Java 6 and Android mobile phones :-)
>>
>> The backported issues bring real improvements to the user and make usage 
>> with Java 6 impossible:
>> - Use of FileChannel's new open method (this allows deleting files while 
>> open on Windows)
>> - Use of Long.compare(long,long) and Integer.compare(int,int) instead of the 
>> hacks with Long.signum() or 3 way branches. Hotspot aggressively handles 
>> those methods and they may get intrinsics in the future. So we should really 
>> use them.
>> The above issue has primarily focused on backporting these changes and 
>> reverting "quick fix commits in 4.x" (after failed Jenkins builds).
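The Long.compare/Integer.compare point above can be checked with a small sketch: the JDK methods (Java 7+) agree in sign with the hand-rolled three-way branch they replace, and unlike the subtraction trick they cannot overflow:

```java
public class CompareDemo {
    // The Java-6-era hack being replaced: a branchy three-way compare.
    static int threeWay(long a, long b) {
        if (a < b) return -1;
        else if (a > b) return 1;
        else return 0;
    }

    public static void main(String[] args) {
        // Includes an extreme pair where the "a - b" subtraction trick
        // would overflow and give the wrong sign.
        long[][] pairs = { {1, 2}, {2, 1}, {7, 7}, {Long.MIN_VALUE, Long.MAX_VALUE} };
        for (long[] p : pairs) {
            if (Integer.signum(Long.compare(p[0], p[1])) != threeWay(p[0], p[1]))
                throw new AssertionError(p[0] + " vs " + p[1]);
        }
        System.out.println("Long.compare agrees with the three-way branch");
    }
}
```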
>>
>> In the future we have now only one supported Java version, so backports are 
>> very easy. Also releasing 4.x is much easier now, because Javadocs look fine 
>> now by default. We can now also proceed with using diamond operator and 
>> try-with-resources (much more important than diamond), without the need for 
>> backports being hard. So feel free to commit any Java 7 syntax once 
>> LUCENE-5514 is resolved!
>>
>> Uwe
>>
>> -
>> Uwe Schindler
>> H.-H.-Meier-Allee 63, D-28213 Bremen
>> http://www.thetaphi.de
>> eMail: u...@thetaphi.de
>>
>>> -Original Message-
>>> From: Uwe Schindler [mailto:u...@thetaphi.de]
>>> Sent: Saturday, March 08, 2014 5:17 PM
>>> To: dev@lucene.apache.org
>>> Subject: [VOTE] Move to Java 7 in Lucene/Solr 4.8, use Java 8 in trunk (once
>>> officially released)
>>>
>>> Hi all,
>>>
>>> Java 8 will get released (hopefully, but I trust the release plan!) on 
>>> March 18,
>>> 2014. Because of this, lots of developers will move to Java 8, too. This 
>>> makes
>>> maintaining 3 versions for developing Lucene 4.x not easy anymore (unless
>>> you have cool JAVA_HOME "cmd" launcher scripts using StExBar available for
>>> your Windows Explorer - or similar stuff in Linux/Mäc).
>>>
>>> We already discussed in another thread about moving to release trunk as 5.0,
>>> but people disagreed and preferred to release 4.8 with a minimum of Java 7.
>>> This is perfectly fine, as nobody should run Lucene or Solr on an 
>>> unsupported
>>> platform anymore. If they upgrade to 4.8, they should also upgrade their
>>> infrastructure - this is a no-brainer. In Lucene trunk we switch to Java 8 
>>> as
>>> soon as it is released (in 10 days).
>>>
>>> Now the good things: We don't need to support JRockit anymore, no need to
>>> support IBM J9 in trunk (unless they release a new version based on Java 8).
>>>
>>> So the vote here is about:
>>>
>>> [.] Move Luc

[JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.8.0-fcs) - Build # 1398 - Failure!

2014-03-10 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/1398/
Java: 64bit/jdk1.8.0-fcs -XX:+UseCompressedOops -XX:+UseG1GC

1 tests failed.
REGRESSION:  org.apache.solr.core.TestNonNRTOpen.testReaderIsNotNRT

Error Message:
SOLR-5815? : wrong maxDoc: core=org.apache.solr.core.SolrCore@5d1fbdfd 
searcher=Searcher@7b5d59d5[collection1] 
main{StandardDirectoryReader(segments_8:16 _4(5.0):C1 _5(5.0):C1)} expected:<3> 
but was:<2>

Stack Trace:
java.lang.AssertionError: SOLR-5815? : wrong maxDoc: 
core=org.apache.solr.core.SolrCore@5d1fbdfd 
searcher=Searcher@7b5d59d5[collection1] 
main{StandardDirectoryReader(segments_8:16 _4(5.0):C1 _5(5.0):C1)} expected:<3> 
but was:<2>
at 
__randomizedtesting.SeedInfo.seed([16318555406A7EAC:A3B7E4D2FFABCC58]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at org.junit.Assert.assertEquals(Assert.java:472)
at 
org.apache.solr.core.TestNonNRTOpen.assertNotNRT(TestNonNRTOpen.java:142)
at 
org.apache.solr.core.TestNonNRTOpen.testReaderIsNotNRT(TestNonNRTOpen.java:100)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleI

Re: [VOTE] Move to Java 7 in Lucene/Solr 4.8, use Java 8 in trunk (once officially released)

2014-03-10 Thread Noble Paul നോബിള്‍ नोब्ळ्
Solr/Lucene 4.8 -> Java 7
+1

I'm not too sure about moving trunk to java 8 . Let's keep it at java 7 and
make a call when we are closer to Lucene-Solr 5.0. Organizations move to
newer versions of java very slowly.


On Mon, Mar 10, 2014 at 6:30 PM, Uwe Schindler  wrote:

> Hi Robert,
>
> > the vote must be held open for 72 hours. I haven't even had a chance to
> > formulate my VOTE+reasoning yet, and i dont agree with this crap here.
>
> Indeed, there is no need to hurry! I just wanted more discussions coming
> in.
> The merges I prepared already are stable and pass all tests, smokers,...
> So no problem to wait 2 more days, it is not urgent to commit my branch_4x
> checkout.
>
> As said in the thread already, I expected the reaction from our
> company-users/company-committers. I disagree, too, but it looks like more
> people are against this and that won't change anymore.
> I agree with you: "trunk" is our development branch, I see no problem with
> making it Java 8 only. From the other issue, we have no important news to
> actually release this as 5.0 soon, so we can for sure play with it for long
> time. To me it looks like some of our committers have forks off trunk they
> want to sell to their customers.
>
> Uwe
>
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>
> > -Original Message-
> > From: Robert Muir [mailto:rcm...@gmail.com]
> > Sent: Monday, March 10, 2014 1:34 PM
> > To: dev@lucene.apache.org
> > Subject: Re: [VOTE] Move to Java 7 in Lucene/Solr 4.8, use Java 8 in
> trunk
> > (once officially released)
> >
> > On Mon, Mar 10, 2014 at 5:46 AM, Uwe Schindler 
> > wrote:
> > > Hi,
> > >
> > > it looks like we all agree on the same:
> > >
> > > +1 for Lucene 4.x requirement on Java 7.
> > > -1 to not change trunk (keep it on Java 7, too).
> > >
> > > I will keep this vote open until this evening, but I don't expect any
> other
> > change. Indeed, there are no real technical reasons to not move.
> >
> > the vote must be held open for 72 hours. I haven't even had a chance to
> > formulate my VOTE+reasoning yet, and i dont agree with this crap here.
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional
> > commands, e-mail: dev-h...@lucene.apache.org
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>


-- 
-
Noble Paul


[jira] [Commented] (LUCENE-5475) add required attribute bugUrl to @BadApple

2014-03-10 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925725#comment-13925725
 ] 

Robert Muir commented on LUCENE-5475:
-

Thanks Dawid!

> add required attribute bugUrl to @BadApple
> --
>
> Key: LUCENE-5475
> URL: https://issues.apache.org/jira/browse/LUCENE-5475
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: general/test
>Reporter: Robert Muir
> Fix For: 4.8, 5.0
>
> Attachments: LUCENE-5475.patch
>
>
> This makes it impossible to tag a test as a badapple without a pointer to a 
> JIRA issue.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [jira] [Commented] (SOLR-5762) SOLR-5658 broke backward compatibility of Javabin format

2014-03-10 Thread Erick Erickson
Hmmm, scanning just Noble's comment it's even worse since we have custom
components that may define their own params that other components know
nothing about (and can't).

But I'm glancing at this out of context so may be off in the weeds.
On Mar 10, 2014 4:41 AM, "Noble Paul (JIRA)"  wrote:

>
> [
> https://issues.apache.org/jira/browse/SOLR-5762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925564#comment-13925564]
>
> Noble Paul commented on SOLR-5762:
> --
>
> Solr behaves a lot like a web container where the various components
> determine the parameters accepted. In any given request various components
> can participate in fulfilling the request. Erroring out for unexpected
> params means that we will have to keep a database of all parameters from
> all components. It is going to cause a lot of problems for us devs as well
> as the users
>
> > SOLR-5658 broke backward compatibility of Javabin format
> > 
> >
> > Key: SOLR-5762
> > URL: https://issues.apache.org/jira/browse/SOLR-5762
> > Project: Solr
> >  Issue Type: Bug
> >Affects Versions: 4.6.1, 4.7
> >Reporter: Noble Paul
> > Fix For: 4.7, 4.8, 5.0
> >
> > Attachments: SOLR-5672.patch, SOLR-5762-test.patch,
> SOLR-5762.patch, updateReq_4_5.bin
> >
> >
> > In SOLR-5658 the docsMap entry was changed from a Map to List this
> broke  back compat of older clients with 4.6.1 and later
> > {noformat}
> > ERROR - 2014-02-20 21:28:36.332; org.apache.solr.common.SolrException;
> java.lang.ClassCastException: java.util.LinkedHashMap cannot be cast to
> java.util.List
> > at
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:188)
> > at
> org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:106)
> > at
> org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:58)
> > at
> org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
> > at
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
> > at
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
> > at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)
> > at
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:721)
> > at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:417)
> > at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:201)
> > at
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
> > at
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
> > at
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
> > at
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
> > at
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
> > at
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
> > at
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
> > at
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
> > at
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
> > at
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
> > at
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
> > at
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
> > at
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
> > at org.eclipse.jetty.server.Server.handle(Server.java:368)
> > at
> org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
> > at
> org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
> > at
> org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953)
> > at
> org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)
> > at
> org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:953)
> > at
> org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
> > at
> org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
> > at
> org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
> > at
> org.eclipse.jetty.util.t

Re: [VOTE] Move to Java 7 in Lucene/Solr 4.8, use Java 8 in trunk (once officially released)

2014-03-10 Thread Yonik Seeley
On Mon, Mar 10, 2014 at 9:00 AM, Uwe Schindler  wrote:
> I agree with you: "trunk" is our development branch, I see no problem with 
> making it Java 8 only. From the other issue, we have no important news to 
> actually release this as 5.0 soon, so we can for sure play with it for long 
> time. To me it looks like some of our committers have forks off trunk they 
> want to sell to their customers.

Excuse me?  I assume that's aimed at me?
You're insinuating I voted against moving to Java 8 right now because
of my corporate interests?
I'd appreciate it if you didn't try to disparage my character every
time we happen to disagree.

-Yonik
http://heliosearch.org - native off-heap filters and fieldcache for solr




[jira] [Commented] (LUCENE-5512) Remove redundant typing (diamond operator) in trunk

2014-03-10 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925724#comment-13925724
 ] 

Robert Muir commented on LUCENE-5512:
-

{quote}
I've finished it. Compilation and tests did not give any error. I will check it 
one more time and attach the patch. On the other hand I will apply changes for 
"lucene" module. Will anybody open a Jira issue for Solr module too or I can 
apply same things for Solr module too?
{quote}

You can just supply one patch here. You can also separate it, if its easier on 
you. Either way.

{quote}
Robert Muir if you want I can do same thing for "try-with resources" at another 
Jira issue?
{quote}

Yes, we should, that one is more complicated, but there are a lot of cleanups 
to be done.
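For readers unfamiliar with the two cleanups being discussed, here is a minimal sketch (not Lucene code) of what the diamond operator and try-with-resources look like in practice:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class Java7SyntaxDemo {
    public static void main(String[] args) throws IOException {
        // Diamond operator: redundant type arguments on the right-hand side
        // are inferred, so Map<String, List<Integer>> need not be repeated.
        Map<String, List<Integer>> counts = new HashMap<>();
        counts.put("a", new ArrayList<>());

        // try-with-resources: the reader is closed automatically when the
        // block exits, even if an exception is thrown.
        try (BufferedReader r = new BufferedReader(new StringReader("hello"))) {
            System.out.println(r.readLine() + " " + counts.size());
        }
    }
}
```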

> Remove redundant typing (diamond operator) in trunk
> ---
>
> Key: LUCENE-5512
> URL: https://issues.apache.org/jira/browse/LUCENE-5512
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Robert Muir
> Attachments: LUCENE-5512.patch
>
>







[jira] [Commented] (SOLR-5265) Add backward compatibility tests to JavaBinCodec's format.

2014-03-10 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925722#comment-13925722
 ] 

ASF subversion and git services commented on SOLR-5265:
---

Commit 1575936 from [~noble.paul] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1575936 ]

SOLR-5265 Add backward compatibility tests to JavaBinCodec's format

> Add backward compatibility tests to JavaBinCodec's format.
> --
>
> Key: SOLR-5265
> URL: https://issues.apache.org/jira/browse/SOLR-5265
> Project: Solr
>  Issue Type: Test
>Reporter: Adrien Grand
>Assignee: Noble Paul
>Priority: Blocker
> Fix For: 4.7
>
> Attachments: SOLR-5265.patch, SOLR-5265.patch, SOLR-5265.patch, 
> javabin_backcompat.bin
>
>
> Since Solr guarantees backward compatibility of JavaBinCodec's format between 
> releases, we should have tests for it.
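The pattern behind such a back-compat test can be sketched generically. This is a hedged illustration using a stand-in codec, not Solr's actual `JavaBinCodec`: bytes written once by an old release are checked in as a "golden" file, and the test asserts both that the current writer still produces them and that the current reader still understands them.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

public class BackCompatSketch {
    // Stand-in for a wire codec (Solr's JavaBinCodec exposes marshal/unmarshal).
    static byte[] marshal(String s) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bos)) {
            out.writeUTF(s);
        }
        return bos.toByteArray();
    }

    static String unmarshal(byte[] bytes) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return in.readUTF();
        }
    }

    public static void main(String[] args) throws IOException {
        // In a real test these bytes come from a file committed by an older
        // release (like javabin_backcompat.bin); here we fabricate them.
        byte[] golden = marshal("doc-1");

        // 1) The current writer must still produce byte-identical output.
        if (!Arrays.equals(golden, marshal("doc-1"))) {
            throw new AssertionError("serialized format changed");
        }
        // 2) The current reader must still understand the old bytes.
        if (!"doc-1".equals(unmarshal(golden))) {
            throw new AssertionError("cannot read old format");
        }
        System.out.println("back-compat ok");
    }
}
```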






[jira] [Commented] (LUCENE-5475) add required attribute bugUrl to @BadApple

2014-03-10 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925718#comment-13925718
 ] 

Dawid Weiss commented on LUCENE-5475:
-

I've just released RR 2.1.0 which contains a fix for this. Will integrate it later, 
unless somebody beats me to it.

> add required attribute bugUrl to @BadApple
> --
>
> Key: LUCENE-5475
> URL: https://issues.apache.org/jira/browse/LUCENE-5475
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: general/test
>Reporter: Robert Muir
> Fix For: 4.8, 5.0
>
> Attachments: LUCENE-5475.patch
>
>
> This makes it impossible to tag a test as a badapple without a pointer to a 
> JIRA issue.






[jira] [Commented] (LUCENE-5487) Can we separate "top scorer" from "sub scorer"?

2014-03-10 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925717#comment-13925717
 ] 

Uwe Schindler commented on LUCENE-5487:
---

I think we can backport this!

> Can we separate "top scorer" from "sub scorer"?
> ---
>
> Key: LUCENE-5487
> URL: https://issues.apache.org/jira/browse/LUCENE-5487
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 4.8, 5.0
>
> Attachments: LUCENE-5487.patch, LUCENE-5487.patch, LUCENE-5487.patch
>
>
> This is just an exploratory patch ... still many nocommits, but I
> think it may be promising.
> I find the two booleans we pass to Weight.scorer confusing, because
> they really only apply to whoever will call score(Collector) (just
> IndexSearcher and BooleanScorer).
> The params are pointless for the vast majority of scorers, because
> very, very few query scorers really need to change how top-scoring is
> done, and those scorers can *only* score top-level (throw UOE 
> from nextDoc/advance).  It seems like these two types of scorers
> should be separately typed.
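The type split argued for above can be sketched with hypothetical interfaces (the names and shapes here are invented for illustration, not Lucene's actual API): a doc-at-a-time scorer exposes iteration, while a top-level scorer only knows how to push hits to a collector and never exposes nextDoc/advance.

```java
import java.util.ArrayList;
import java.util.List;

public class ScorerSplitSketch {
    /** Doc-at-a-time scorer: supports per-document iteration. Hypothetical name. */
    interface SubScorer {
        int nextDoc();   // returns -1 when exhausted
        float score();
    }

    /** Top-level scorer: only scores a whole segment into a collector. */
    interface TopScorer {
        void scoreAll(List<Integer> collected);
    }

    public static void main(String[] args) {
        int[] docs = {2, 5, 9};
        SubScorer sub = new SubScorer() {
            int i = -1;
            public int nextDoc() { return ++i < docs.length ? docs[i] : -1; }
            public float score() { return 1.0f; }
        };
        // A TopScorer can wrap a SubScorer, but a "top only" scorer would
        // implement TopScorer alone -- the separation the issue argues for,
        // instead of passing booleans to Weight.scorer.
        TopScorer top = collected -> {
            for (int d = sub.nextDoc(); d != -1; d = sub.nextDoc()) {
                collected.add(d);
            }
        };
        List<Integer> hits = new ArrayList<>();
        top.scoreAll(hits);
        System.out.println(hits);
    }
}
```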






[jira] [Commented] (SOLR-5265) Add backward compatibility tests to JavaBinCodec's format.

2014-03-10 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925714#comment-13925714
 ] 

ASF subversion and git services commented on SOLR-5265:
---

Commit 1575932 from [~noble.paul] in branch 'dev/trunk'
[ https://svn.apache.org/r1575932 ]

SOLR-5265 Add backward compatibility tests to JavaBinCodec's format

> Add backward compatibility tests to JavaBinCodec's format.
> --
>
> Key: SOLR-5265
> URL: https://issues.apache.org/jira/browse/SOLR-5265
> Project: Solr
>  Issue Type: Test
>Reporter: Adrien Grand
>Assignee: Noble Paul
>Priority: Blocker
> Fix For: 4.7
>
> Attachments: SOLR-5265.patch, SOLR-5265.patch, SOLR-5265.patch, 
> javabin_backcompat.bin
>
>
> Since Solr guarantees backward compatibility of JavaBinCodec's format between 
> releases, we should have tests for it.






[jira] [Commented] (LUCENE-5487) Can we separate "top scorer" from "sub scorer"?

2014-03-10 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925713#comment-13925713
 ] 

Robert Muir commented on LUCENE-5487:
-

{quote}
I think this (Weight.scorer) is an expert enough API that we can fix it in 4.8 
as well?
{quote}

Definitely, I havent had a chance to review the patch but this seems good to do.

> Can we separate "top scorer" from "sub scorer"?
> ---
>
> Key: LUCENE-5487
> URL: https://issues.apache.org/jira/browse/LUCENE-5487
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 4.8, 5.0
>
> Attachments: LUCENE-5487.patch, LUCENE-5487.patch, LUCENE-5487.patch
>
>
> This is just an exploratory patch ... still many nocommits, but I
> think it may be promising.
> I find the two booleans we pass to Weight.scorer confusing, because
> they really only apply to whoever will call score(Collector) (just
> IndexSearcher and BooleanScorer).
> The params are pointless for the vast majority of scorers, because
> very, very few query scorers really need to change how top-scoring is
> done, and those scorers can *only* score top-level (throw UOE
> from nextDoc/advance).  It seems like these two types of scorers
> should be separately typed.






RE: [VOTE] Move to Java 7 in Lucene/Solr 4.8, use Java 8 in trunk (once officially released)

2014-03-10 Thread Uwe Schindler
Hi Robert,

> the vote must be held open for 72 hours. I haven't even had a chance to
> formulate my VOTE+reasoning yet, and i dont agree with this crap here.

Indeed, there is no need to hurry! I just wanted more discussions coming in.
The merges I prepared already are stable and pass all tests, smokers,... So no 
problem to wait 2 more days, it is not urgent to commit my branch_4x checkout.

As said in the thread already, I expected the reaction from our 
company-users/company-committers. I disagree, too, but it looks like more 
people are against this and that won't change anymore.
I agree with you: "trunk" is our development branch, I see no problem with 
making it Java 8 only. From the other issue, we have no important news to 
actually release this as 5.0 soon, so we can for sure play with it for long 
time. To me it looks like some of our committers have forks off trunk they want 
to sell to their customers.

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


> -Original Message-
> From: Robert Muir [mailto:rcm...@gmail.com]
> Sent: Monday, March 10, 2014 1:34 PM
> To: dev@lucene.apache.org
> Subject: Re: [VOTE] Move to Java 7 in Lucene/Solr 4.8, use Java 8 in trunk
> (once officially released)
> 
> On Mon, Mar 10, 2014 at 5:46 AM, Uwe Schindler 
> wrote:
> > Hi,
> >
> > it looks like we all agree on the same:
> >
> > +1 for Lucene 4.x requirement on Java 7.
> > -1 to not change trunk (keep it on Java 7, too).
> >
> > I will keep this vote open until this evening, but I don't expect any other
> change. Indeed, there are no real technical reasons to not move.
> 
> the vote must be held open for 72 hours. I haven't even had a chance to
> formulate my VOTE+reasoning yet, and i dont agree with this crap here.
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional
> commands, e-mail: dev-h...@lucene.apache.org




