Re: [VOTE] Release PyLucene 3.6.0 rc2
+1 to release. Downloaded on OS X, verified sig/md5, built jcc and pylucene and ran tests. On Mon, May 7, 2012 at 8:20 PM, Andi Vajda va...@apache.org wrote: Please vote to release these artifacts as PyLucene 3.6.0-2. Thanks ! Andi.. ps: the KEYS file for PyLucene release signing is at: http://svn.apache.org/repos/asf/lucene/pylucene/dist/KEYS http://people.apache.org/~vajda/staging_area/KEYS pps: here is my +1 -- lucidimagination.com
Re: [VOTE] Release PyLucene 3.6.0 rc2
On Fri, 11 May 2012, Robert Muir wrote: +1 to release. Downloaded on OS X, verified sig/md5, built jcc and pylucene and ran tests. This vote has passed. Thank you all who voted ! Andi.. On Mon, May 7, 2012 at 8:20 PM, Andi Vajda va...@apache.org wrote: Please vote to release these artifacts as PyLucene 3.6.0-2. Thanks ! Andi.. ps: the KEYS file for PyLucene release signing is at: http://svn.apache.org/repos/asf/lucene/pylucene/dist/KEYS http://people.apache.org/~vajda/staging_area/KEYS pps: here is my +1 -- lucidimagination.com
[jira] [Created] (SOLR-3449) QueryComponent.doFieldSortValues throws ArrayIndexOutOfBoundsException when the index has a maxDoc=0 segment
Linbin Chen created SOLR-3449:
-
Summary: QueryComponent.doFieldSortValues throws ArrayIndexOutOfBoundsException when the index has a maxDoc=0 segment
Key: SOLR-3449
URL: https://issues.apache.org/jira/browse/SOLR-3449
Project: Solr
Issue Type: Bug
Components: search
Affects Versions: 3.5, 3.6
Reporter: Linbin Chen
Fix For: 3.6.1
Attachments: SOLR-3449.patch

The index has these segments:
{code}
Segment name=_9, offset=[docBase=0, maxDoc=245] idx=0
Segment name=_a, offset=[docBase=245, maxDoc=3] idx=1
Segment name=_b, offset=[docBase=248, maxDoc=0] idx=2
Segment name=_c, offset=[docBase=248, maxDoc=1] idx=3
Segment name=_d, offset=[docBase=249, maxDoc=0] idx=4
Segment name=_e, offset=[docBase=249, maxDoc=1] idx=5
Segment name=_f, offset=[docBase=250, maxDoc=0] idx=6
Segment name=_g, offset=[docBase=250, maxDoc=3] idx=7
Segment name=_h, offset=[docBase=253, maxDoc=0] idx=8
{code}
A segment with maxDoc=0 may be created by mergeIndexes. (You can make sure a maxDoc=0 segment is not produced by a merge you control, but not when the merging of indexes is outside your control.) When fsv=true is used to get sort values and docId=249 is hit, an ArrayIndexOutOfBoundsException is thrown:
{code}
2012-5-11 14:28:28 org.apache.solr.common.SolrException log
ERROR: java.lang.ArrayIndexOutOfBoundsException: 0
at org.apache.lucene.search.FieldComparator$LongComparator.copy(FieldComparator.java:600)
at org.apache.solr.handler.component.QueryComponent.doFieldSortValues(QueryComponent.java:463)
at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:400)
{code}
Reason:
{code}
// idx:             0    1    2    3    4    5    6    7    8
// int[] maxDocs = {245,   3,   0,   1,   0,   1,   0,   3,   0};
int[] offsets =    {0,  245, 248, 248, 249, 249, 250, 250, 253};
{code}
org.apache.solr.search.SolrIndexReader.readerIndex(249, offsets) returns idx=4, not 5. The correct idx is 5.
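To make the failure above concrete, here is a minimal standalone sketch (hypothetical demo class; not the actual org.apache.solr.search.SolrIndexReader source) of a binary search over the segment docBase offsets. Because the maxDoc=0 segments duplicate the docBase of the segment that follows them, an exact match can land on the empty segment:

{code}
public class ReaderIndexDemo {
  // Plain binary search over segment docBase offsets, in the spirit of
  // SolrIndexReader.readerIndex; on an exact match it returns whatever
  // index the search happens to probe, which may be a maxDoc=0 segment.
  static int readerIndex(int doc, int[] offsets) {
    int low = 0;
    int high = offsets.length - 1;
    while (low <= high) {
      int mid = (low + high) >>> 1;
      int midValue = offsets[mid];
      if (doc < midValue) {
        high = mid - 1;
      } else if (doc > midValue) {
        low = mid + 1;
      } else {
        return mid; // exact match: may be the empty segment _d (idx=4)
      }
    }
    return high; // doc falls inside the segment starting at offsets[high]
  }

  public static void main(String[] args) {
    int[] offsets = {0, 245, 248, 248, 249, 249, 250, 250, 253};
    // Prints 4, but segment idx=4 (_d) has maxDoc=0; doc 249 really
    // belongs to idx=5 (_e) -- exactly the mismatch reported above.
    System.out.println(readerIndex(249, offsets));
  }
}
{code}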
[jira] [Updated] (SOLR-3449) QueryComponent.doFieldSortValues throws ArrayIndexOutOfBoundsException when the index has a maxDoc=0 segment
[ https://issues.apache.org/jira/browse/SOLR-3449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Linbin Chen updated SOLR-3449:
--
Attachment: SOLR-3449.patch

QueryComponent.doFieldSortValues throws ArrayIndexOutOfBoundsException when the index has a maxDoc=0 segment
---
Key: SOLR-3449
URL: https://issues.apache.org/jira/browse/SOLR-3449
Project: Solr
Issue Type: Bug
Components: search
Affects Versions: 3.5, 3.6
Reporter: Linbin Chen
Fix For: 3.6.1
Attachments: SOLR-3449.patch

The index has these segments:
{code}
Segment name=_9, offset=[docBase=0, maxDoc=245] idx=0
Segment name=_a, offset=[docBase=245, maxDoc=3] idx=1
Segment name=_b, offset=[docBase=248, maxDoc=0] idx=2
Segment name=_c, offset=[docBase=248, maxDoc=1] idx=3
Segment name=_d, offset=[docBase=249, maxDoc=0] idx=4
Segment name=_e, offset=[docBase=249, maxDoc=1] idx=5
Segment name=_f, offset=[docBase=250, maxDoc=0] idx=6
Segment name=_g, offset=[docBase=250, maxDoc=3] idx=7
Segment name=_h, offset=[docBase=253, maxDoc=0] idx=8
{code}
A segment with maxDoc=0 may be created by mergeIndexes. (You can make sure a maxDoc=0 segment is not produced by a merge you control, but not when the merging of indexes is outside your control.) When fsv=true is used to get sort values and docId=249 is hit, an ArrayIndexOutOfBoundsException is thrown:
{code}
2012-5-11 14:28:28 org.apache.solr.common.SolrException log
ERROR: java.lang.ArrayIndexOutOfBoundsException: 0
at org.apache.lucene.search.FieldComparator$LongComparator.copy(FieldComparator.java:600)
at org.apache.solr.handler.component.QueryComponent.doFieldSortValues(QueryComponent.java:463)
at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:400)
{code}
Reason:
{code}
// idx:             0    1    2    3    4    5    6    7    8
// int[] maxDocs = {245,   3,   0,   1,   0,   1,   0,   3,   0};
int[] offsets =    {0,  245, 248, 248, 249, 249, 250, 250, 253};
{code}
org.apache.solr.search.SolrIndexReader.readerIndex(249, offsets) returns idx=4, not 5. The correct idx is 5.
[jira] [Updated] (SOLR-3449) QueryComponent.doFieldSortValues throws ArrayIndexOutOfBoundsException when the index has a maxDoc=0 segment
[ https://issues.apache.org/jira/browse/SOLR-3449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Linbin Chen updated SOLR-3449:
--
Description:

The index has these segments:
{code}
Segment name=_9, offset=[docBase=0, maxDoc=245] idx=0
Segment name=_a, offset=[docBase=245, maxDoc=3] idx=1
Segment name=_b, offset=[docBase=248, maxDoc=0] idx=2
Segment name=_c, offset=[docBase=248, maxDoc=1] idx=3
Segment name=_d, offset=[docBase=249, maxDoc=0] idx=4
Segment name=_e, offset=[docBase=249, maxDoc=1] idx=5
Segment name=_f, offset=[docBase=250, maxDoc=0] idx=6
Segment name=_g, offset=[docBase=250, maxDoc=3] idx=7
Segment name=_h, offset=[docBase=253, maxDoc=0] idx=8
{code}
A segment with maxDoc=0 may be created by mergeIndexes. (You can make sure a maxDoc=0 segment is not produced by a merge you control, but not when the merging of indexes is outside your control.) When fsv=true is used to get sort values and docId=249 is hit, an ArrayIndexOutOfBoundsException is thrown:
{code}
2012-5-11 14:28:28 org.apache.solr.common.SolrException log
ERROR: java.lang.ArrayIndexOutOfBoundsException: 0
at org.apache.lucene.search.FieldComparator$LongComparator.copy(FieldComparator.java:600)
at org.apache.solr.handler.component.QueryComponent.doFieldSortValues(QueryComponent.java:463)
at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:400)
{code}
Reason:
{code}
// idx:             0    1    2    3    4    5    6    7    8
// int[] maxDocs = {245,   3,   0,   1,   0,   1,   0,   3,   0};
int[] offsets =    {0,  245, 248, 248, 249, 249, 250, 250, 253};
{code}
org.apache.solr.search.SolrIndexReader.readerIndex(249, offsets) returns idx=4, not 5. The correct idx is 5.

Patch:
{code}
Index: solr/core/src/java/org/apache/solr/search/SolrIndexReader.java
===
--- solr/core/src/java/org/apache/solr/search/SolrIndexReader.java (revision 1337028)
+++ solr/core/src/java/org/apache/solr/search/SolrIndexReader.java (working copy)
@@ -138,6 +138,16 @@
     } else {
       // exact match on the offset.
+      // skip equal offsets
+      for (int i = mid + 1; i <= high; i++) {
+        if (doc == offsets[i]) {
+          // skip offsets[i] == doc
+          mid = i;
+        } else {
+          // stop skipping offsets
+          break;
+        }
+      }
       return mid;
     }
   }
{code}

was:

The index has these segments:
{code}
Segment name=_9, offset=[docBase=0, maxDoc=245] idx=0
Segment name=_a, offset=[docBase=245, maxDoc=3] idx=1
Segment name=_b, offset=[docBase=248, maxDoc=0] idx=2
Segment name=_c, offset=[docBase=248, maxDoc=1] idx=3
Segment name=_d, offset=[docBase=249, maxDoc=0] idx=4
Segment name=_e, offset=[docBase=249, maxDoc=1] idx=5
Segment name=_f, offset=[docBase=250, maxDoc=0] idx=6
Segment name=_g, offset=[docBase=250, maxDoc=3] idx=7
Segment name=_h, offset=[docBase=253, maxDoc=0] idx=8
{code}
A segment with maxDoc=0 may be created by mergeIndexes. (You can make sure a maxDoc=0 segment is not produced by a merge you control, but not when the merging of indexes is outside your control.) When fsv=true is used to get sort values and docId=249 is hit, an ArrayIndexOutOfBoundsException is thrown:
{code}
2012-5-11 14:28:28 org.apache.solr.common.SolrException log
ERROR: java.lang.ArrayIndexOutOfBoundsException: 0
at org.apache.lucene.search.FieldComparator$LongComparator.copy(FieldComparator.java:600)
at org.apache.solr.handler.component.QueryComponent.doFieldSortValues(QueryComponent.java:463)
at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:400)
{code}
Reason:
{code}
// idx:             0    1    2    3    4    5    6    7    8
// int[] maxDocs = {245,   3,   0,   1,   0,   1,   0,   3,   0};
int[] offsets =    {0,  245, 248, 248, 249, 249, 250, 250, 253};
{code}
org.apache.solr.search.SolrIndexReader.readerIndex(249, offsets) returns idx=4, not 5. The correct idx is 5.

QueryComponent.doFieldSortValues throws ArrayIndexOutOfBoundsException when the index has a maxDoc=0 segment
---
Key: SOLR-3449
URL: https://issues.apache.org/jira/browse/SOLR-3449
Project: Solr
Issue Type: Bug
Components: search
Affects Versions: 3.5, 3.6
Reporter: Linbin Chen
Fix For: 3.6.1
Attachments: SOLR-3449.patch
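The patch's skip-equal-offsets idea, applied to the standalone sketch from above (a hedged illustration, not the committed Solr code):

{code}
// Same binary search as before, but on an exact match we skip forward
// past every following entry that shares this docBase (those are the
// maxDoc=0 segments), so we land on the segment that owns the doc.
static int readerIndexFixed(int doc, int[] offsets) {
  int low = 0;
  int high = offsets.length - 1;
  while (low <= high) {
    int mid = (low + high) >>> 1;
    int midValue = offsets[mid];
    if (doc < midValue) {
      high = mid - 1;
    } else if (doc > midValue) {
      low = mid + 1;
    } else {
      for (int i = mid + 1; i <= high && offsets[i] == doc; i++) {
        mid = i; // skip offsets[i] == doc
      }
      return mid;
    }
  }
  return high;
}

// readerIndexFixed(249, new int[]{0, 245, 248, 248, 249, 249, 250, 250, 253})
// now returns 5 (segment _e), matching the expected idx in the report.
{code}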
[jira] [Commented] (SOLR-3377) eDismax: A fielded query wrapped by parens is not recognized
[ https://issues.apache.org/jira/browse/SOLR-3377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273099#comment-13273099 ]

Bernd Fehling commented on SOLR-3377:
-

Shoot me an enhanced unit test which covers your requests and I will look into this. But while looking through all the test cases, I think we really need a clear definition of the rules: define a BNF syntax description for edismax and then implement that BNF. This has two advantages: the user knows how to construct a valid query, and we can clean up the patchwork inside edismax. This could also include a fallback mechanism to always return a valid query. How about that?

eDismax: A fielded query wrapped by parens is not recognized
Key: SOLR-3377
URL: https://issues.apache.org/jira/browse/SOLR-3377
Project: Solr
Issue Type: Bug
Components: query parsers
Affects Versions: 3.6
Reporter: Jan Høydahl
Assignee: Bernd Fehling
Fix For: 4.0, 3.6.1
Attachments: SOLR-3377.patch, SOLR-3377.patch, SOLR-3377.patch, SOLR-3377.patch

As reported by bernd on the user list, a query like this {{q=(name:test)}} will yield 0 hits in 3.6 while it worked in 3.5. It works without the parens.
[jira] [Commented] (SOLR-3447) solrj cannot handle org.apache.solr.common.SolrException when the schema is not correct
[ https://issues.apache.org/jira/browse/SOLR-3447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273108#comment-13273108 ]

ESSOUSSI Jamel commented on SOLR-3447:
--

Hi, when the schema is not valid and I try to index a Solr document, I get this response from Solr:

{code}
HTTP/1.1 400 Mauvaise Requête (Bad Request)
Server: Apache-Coyote/1.1
Content-Type: text/html;charset=utf-8
Content-Length: 1125
Date: Thu, 10 May 2012 16:00:00 GMT
Connection: close
{code}

The body is a Tomcat HTML error-report page whose relevant content (translated from French) is:

{code}
HTTP Status 400 - ERROR: [doc=280304571883] unknown field 'shop_'
message: ERROR: [doc=280304571883] unknown field 'shop_'
description: The request sent by the client was syntactically incorrect
(ERROR: [doc=280304571883] unknown field 'shop_').
Apache Tomcat/6.0.28
{code}

And when the schema is valid, I get this response:

{code}
HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
Content-Type: application/xml;charset=UTF-8
Transfer-Encoding: chunked
Date: Thu, 10 May 2012 15:53:09 GMT

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">3</int></lst>
</response>
{code}

-- Is it normal that the status equals 0 when the schema is good?
-- And if the schema is not good, why is SolrJ not able to handle the SolrException?

Best Regards
-- Jamel ESSOUSSI

solrj cannot handle org.apache.solr.common.SolrException when the schema is not correct
---
Key: SOLR-3447
URL: https://issues.apache.org/jira/browse/SOLR-3447
Project: Solr
Issue Type: Bug
Components: clients - java
Affects Versions: 3.6
Environment: jdk 1.6.0_26, tomcat 6.0.35, solrj 3.6.0, solr-core 3.6.0 and solr.war (4.0)
Reporter: ESSOUSSI Jamel
Fix For: 3.6

Hi; I have an incorrect schema (a missing field), and when I add documents (UpdateResponse ur = solrServer.add(docs);), I am not able to catch the exception in SolrJ, and the UpdateResponse cannot handle the result.

Best Regards
-- Jamel ESSOUSSI
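For reference, here is a hedged sketch of the client-side handling one would expect with the SolrJ 3.6 API (the server URL, field names, and class name are placeholders). A status of 0 in the responseHeader simply means the update succeeded; a 400 from the container should surface as an exception rather than in the UpdateResponse:

{code}
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.UpdateResponse;
import org.apache.solr.common.SolrException;
import org.apache.solr.common.SolrInputDocument;

public class IndexWithErrorHandling {
  public static void main(String[] args) throws Exception {
    CommonsHttpSolrServer solrServer =
        new CommonsHttpSolrServer("http://localhost:8080/solr");
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "280304571883");
    doc.addField("shop_", "a field the schema may not define");
    try {
      UpdateResponse ur = solrServer.add(doc);
      System.out.println("status=" + ur.getStatus()); // 0 means success
    } catch (SolrException e) {
      // HTTP-level rejection, e.g. the 400 "unknown field 'shop_'" above
      System.err.println("Solr rejected the update: " + e.getMessage());
    } catch (SolrServerException e) {
      // transport or response-parsing failure; Tomcat's HTML error page
      // cannot be parsed as a Solr response, so it may end up here
      System.err.println("Request failed: " + e.getMessage());
    }
  }
}
{code}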
[jira] [Closed] (SOLR-3448) Date math in range queries does not handle plus sign
[ https://issues.apache.org/jira/browse/SOLR-3448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lance Norskog closed SOLR-3448.
---
Resolution: Invalid

Date math in range queries does not handle plus sign
Key: SOLR-3448
URL: https://issues.apache.org/jira/browse/SOLR-3448
Project: Solr
Issue Type: Bug
Components: query parsers
Affects Versions: 4.0
Reporter: Lance Norskog

This query:
{code}
facet.query=timestamp:[NOW-1YEAR/DAY%20TO%20NOW/DAY+1DAY]
{code}
gives this error:
{code}
Cannot parse '[NOW-1YEAR/DAY TO NOW/DAY 1DAY]': Encountered <RANGE_GOOP> "1DAY" at line 1, column 26. Was expecting one of: "]" ... "}" ...
{code}
Should the fix be to add a backslash in front of +1DAY? That does not work.
[jira] [Commented] (SOLR-3448) Date math in range queries does not handle plus sign
[ https://issues.apache.org/jira/browse/SOLR-3448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273116#comment-13273116 ]

Lance Norskog commented on SOLR-3448:
-

Oy. It was even worse: the field was not a date field. My Solr was magic, so it made the field anyway instead of complaining to me.

Date math in range queries does not handle plus sign
Key: SOLR-3448
URL: https://issues.apache.org/jira/browse/SOLR-3448
Project: Solr
Issue Type: Bug
Components: query parsers
Affects Versions: 4.0
Reporter: Lance Norskog

This query:
{code}
facet.query=timestamp:[NOW-1YEAR/DAY%20TO%20NOW/DAY+1DAY]
{code}
gives this error:
{code}
Cannot parse '[NOW-1YEAR/DAY TO NOW/DAY 1DAY]': Encountered <RANGE_GOOP> "1DAY" at line 1, column 26. Was expecting one of: "]" ... "}" ...
{code}
Should the fix be to add a backslash in front of +1DAY? That does not work.
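Note also what the error text reveals: the + reaches the parser as a space ('NOW/DAY 1DAY'), because a literal + in a URL query string is decoded as a space. Date math that adds an interval therefore has to be percent-encoded as %2B. A small sketch (the parameter value is taken from the report; the class name is hypothetical):

{code}
import java.net.URLEncoder;

public class DateMathParam {
  public static void main(String[] args) throws Exception {
    String fq = "timestamp:[NOW-1YEAR/DAY TO NOW/DAY+1DAY]";
    // URLEncoder turns '+' into %2B (and spaces into '+'), so the date
    // math survives the servlet container's URL decoding intact.
    System.out.println("facet.query=" + URLEncoder.encode(fq, "UTF-8"));
    // prints: facet.query=timestamp%3A%5BNOW-1YEAR%2FDAY+TO+NOW%2FDAY%2B1DAY%5D
  }
}
{code}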
[jira] [Commented] (SOLR-3159) Upgrade to Jetty 8
[ https://issues.apache.org/jira/browse/SOLR-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273124#comment-13273124 ]

Massimo Carro commented on SOLR-3159:
-

I use Solr 4 with Jetty 8: highlighting configured inside the qt definition doesn't work.

Upgrade to Jetty 8
--
Key: SOLR-3159
URL: https://issues.apache.org/jira/browse/SOLR-3159
Project: Solr
Issue Type: Task
Reporter: Ryan McKinley
Assignee: Ryan McKinley
Priority: Minor
Fix For: 4.0
Attachments: SOLR-3159-maven.patch

Solr is currently tested (and bundled) with a patched Jetty 6 version. Ideally we can release and test with a standard version. Jetty 6 (at Codehaus) is in maintenance only now; new development and improvements are hosted at Eclipse. Assuming performance is equivalent, I think we should switch to Jetty 8.
[jira] [Created] (SOLR-3450) CoreAdminHandler.handleStatusAction
Trym Møller created SOLR-3450:
-
Summary: CoreAdminHandler.handleStatusAction
Key: SOLR-3450
URL: https://issues.apache.org/jira/browse/SOLR-3450
Project: Solr
Issue Type: Bug
Components: SolrCloud
Affects Versions: 4.0
Environment: Linux version 2.6.32-29-server (buildd@allspice) (gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5) ) #58-Ubuntu SMP Fri Feb 11 21:06:51 UTC 2011 Java(TM) SE Runtime Environment (build 1.6.0_26-b03) Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)
Reporter: Trym Møller
Priority: Minor

{code}
May 8, 2012 12:49:49 PM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: Error handling 'status' action
at org.apache.solr.handler.admin.CoreAdminHandler.handleStatusAction(CoreAdminHandler.java:551)
at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:161)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:360)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:173)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: java.lang.IllegalArgumentException: /usr/lib/solr-4.0/example/dataDir/index.20120419210203/_kvon_0.frq does not exist
at org.apache.commons.io.FileUtils.sizeOf(FileUtils.java:2053)
at org.apache.commons.io.FileUtils.sizeOfDirectory(FileUtils.java:2089)
at org.apache.solr.handler.admin.CoreAdminHandler.getIndexSize(CoreAdminHandler.java:837)
at org.apache.solr.handler.admin.CoreAdminHandler.getCoreStatus(CoreAdminHandler.java:822)
at org.apache.solr.handler.admin.CoreAdminHandler.handleStatusAction(CoreAdminHandler.java:542)
... 21 more
{code}
[jira] [Created] (LUCENE-4049) PrefixQuery (or its superclass MultiTermQuery) ignores index-time boosts
Michael Wyraz created LUCENE-4049:
-
Summary: PrefixQuery (or its superclass MultiTermQuery) ignores index-time boosts
Key: LUCENE-4049
URL: https://issues.apache.org/jira/browse/LUCENE-4049
Project: Lucene - Java
Issue Type: Bug
Components: core/search
Affects Versions: 3.6, 3.5
Environment: Java
Reporter: Michael Wyraz

It is possible to set a boost on fields or documents during indexing, so certain documents can be boosted over others. This works well with TermQuery or FuzzyQuery, but not with PrefixQuery, which ignores the individual values. Test code below:

{code}
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.PrefixQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopScoreDocCollector;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class LuceneTest {
  public static void main(String[] args) throws Exception {
    Directory index = new RAMDirectory();
    StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_35);
    IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_35, analyzer);
    IndexWriter w = new IndexWriter(index, config);
    addDoc(w, "Hello 1", 1);
    addDoc(w, "Hello 2", 2);
    addDoc(w, "Hello 3", 1);
    w.close();

    IndexReader reader = IndexReader.open(index);
    IndexSearcher searcher = new IndexSearcher(reader);
    //Query q = new TermQuery(new Term("f1", "hello"));
    Query q = new PrefixQuery(new Term("f1", "hello"));
    TopScoreDocCollector collector = TopScoreDocCollector.create(10, true);
    searcher.search(q, collector);
    for (ScoreDoc hit : collector.topDocs().scoreDocs) {
      Document d = searcher.doc(hit.doc);
      System.err.println(d.get("f1") + " " + hit.score + " " + hit.doc);
    }
  }

  private static void addDoc(IndexWriter w, String value, float boost) throws IOException {
    Document doc = new Document();
    doc.add(new Field("f1", value, Field.Store.YES, Field.Index.ANALYZED));
    doc.setBoost(boost);
    w.addDocument(doc);
  }
}
{code}
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 13953 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/13953/

1 tests failed.
FAILED: junit.framework.TestSuite.org.apache.solr.cloud.BasicDistributedZkTest

Error Message: ERROR: SolrIndexSearcher opens=79 closes=77

Stack Trace:
java.lang.AssertionError: ERROR: SolrIndexSearcher opens=79 closes=77
at __randomizedtesting.SeedInfo.seed([FF8963706B7FE441]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:212)
at org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:101)
at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1961)
at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132)
at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:742)
at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:63)
at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:75)
at org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:38)
at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:69)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605)
at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132)
at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551)

Build Log (for compile errors): [...truncated 11324 lines...]
Re: svn commit: r1337005 - in /lucene/dev/trunk/lucene/test-framework/src/java/org/apache/lucene: index/AlcoholicMergePolicy.java util/LuceneTestCase.java
+ public static enum Drink {
+
+   Beer(15), Wine(17), Champagne(21), WhiteRussian(22),
+   SingleMalt(30);
+
+   public long drunk() {
+     return drunkFactor;
+   }

I think this isn't an independent value. This isn't even a Markov chain, as it doesn't depend on the last state of the observed object and the drink to follow -- the full history of drinks consumed so far would have to be considered; their order and quantities matter (i.e., beer after champagne, singlemalt after beer, etc.). Overflows (or so-called burst points) would certainly have to be empirically established, as there is no theoretical model for them known in the literature...

Dawid
Re: G1 Garbage Collector enabled for Java 7 builds
+1. I expect lots of bugs to appear soon... I tried G1 at some point in the past and just couldn't get it to work reliably (it was a longer while ago, though).

Dawid

On Thu, May 10, 2012 at 10:23 PM, Uwe Schindler u...@thetaphi.de wrote:

Hi, I enabled the G1 Garbage Collector for the Java 7 builds (TEST_JVM_FLAGS). If something goes wrong, we have another Java 7 bug... :-) It is not yet enabled by default in Java 7 (and not in u4 either -- Jenkins runs u3), but it might be in the future, so we should test this. Maybe we should have a random garbage collector selector for our builds; I am thinking about that. Currently it's passed as an env var by the Jenkins config.

Uwe
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
[jira] [Issue Comment Edited] (SOLR-3450) CoreAdminHandler.handleStatusAction
[ https://issues.apache.org/jira/browse/SOLR-3450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273166#comment-13273166 ]

Per Steffensen edited comment on SOLR-3450 at 5/11/12 10:53 AM:

My guess is that FileUtils.sizeOfDirectory starts by listing all files in the directory and afterwards works through that list, getting the size of each file and adding it to a total sum. If a file disappears between the time the directory is listed and the time the algorithm tries to find its size, you will end up like this. A file might disappear during an index merge. Only guessing. Might want to be a little more robust here.

Regards, Per Steffensen

was (Author: steff1193): My guess is that FileUtils.sizeOfDirectory starts by listing all files in the directory and afterwards works through that list, getting the size of each file and adding it to a total sum. If a file disappears between the time the directory is listed and the time the algorithm tries to find its size, you will end up like this. A file might disappear during an index merge. Only guessing. Regards, Per Steffensen

CoreAdminHandler.handleStatusAction
---
Key: SOLR-3450
URL: https://issues.apache.org/jira/browse/SOLR-3450
Project: Solr
Issue Type: Bug
Components: SolrCloud
Affects Versions: 4.0
Environment: Linux version 2.6.32-29-server (buildd@allspice) (gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5) ) #58-Ubuntu SMP Fri Feb 11 21:06:51 UTC 2011 Java(TM) SE Runtime Environment (build 1.6.0_26-b03) Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)
Reporter: Trym Møller
Priority: Minor

{code}
May 8, 2012 12:49:49 PM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: Error handling 'status' action
at org.apache.solr.handler.admin.CoreAdminHandler.handleStatusAction(CoreAdminHandler.java:551)
at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:161)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:360)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:173)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: java.lang.IllegalArgumentException: /usr/lib/solr-4.0/example/dataDir/index.20120419210203/_kvon_0.frq does not exist
at org.apache.commons.io.FileUtils.sizeOf(FileUtils.java:2053)
at org.apache.commons.io.FileUtils.sizeOfDirectory(FileUtils.java:2089)
at org.apache.solr.handler.admin.CoreAdminHandler.getIndexSize(CoreAdminHandler.java:837)
at org.apache.solr.handler.admin.CoreAdminHandler.getCoreStatus(CoreAdminHandler.java:822)
at org.apache.solr.handler.admin.CoreAdminHandler.handleStatusAction(CoreAdminHandler.java:542)
... 21 more
{code}
[jira] [Commented] (SOLR-3450) CoreAdminHandler.handleStatusAction
[ https://issues.apache.org/jira/browse/SOLR-3450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273166#comment-13273166 ]

Per Steffensen commented on SOLR-3450:
--

My guess is that FileUtils.sizeOfDirectory starts by listing all files in the directory and afterwards works through that list, getting the size of each file and adding it to a total sum. If a file disappears between the time the directory is listed and the time the algorithm tries to find its size, you will end up like this. A file might disappear during an index merge. Only guessing.

Regards, Per Steffensen

CoreAdminHandler.handleStatusAction
---
Key: SOLR-3450
URL: https://issues.apache.org/jira/browse/SOLR-3450
Project: Solr
Issue Type: Bug
Components: SolrCloud
Affects Versions: 4.0
Environment: Linux version 2.6.32-29-server (buildd@allspice) (gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5) ) #58-Ubuntu SMP Fri Feb 11 21:06:51 UTC 2011 Java(TM) SE Runtime Environment (build 1.6.0_26-b03) Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)
Reporter: Trym Møller
Priority: Minor

{code}
May 8, 2012 12:49:49 PM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: Error handling 'status' action
at org.apache.solr.handler.admin.CoreAdminHandler.handleStatusAction(CoreAdminHandler.java:551)
at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:161)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:360)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:173)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: java.lang.IllegalArgumentException: /usr/lib/solr-4.0/example/dataDir/index.20120419210203/_kvon_0.frq does not exist
at org.apache.commons.io.FileUtils.sizeOf(FileUtils.java:2053)
at org.apache.commons.io.FileUtils.sizeOfDirectory(FileUtils.java:2089)
at org.apache.solr.handler.admin.CoreAdminHandler.getIndexSize(CoreAdminHandler.java:837)
at org.apache.solr.handler.admin.CoreAdminHandler.getCoreStatus(CoreAdminHandler.java:822)
at org.apache.solr.handler.admin.CoreAdminHandler.handleStatusAction(CoreAdminHandler.java:542)
... 21 more
{code}
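Per's theory suggests an obvious hardening. Here is a hedged sketch (hypothetical helper; not the CoreAdminHandler or commons-io code) of a directory-size computation that tolerates files vanishing mid-scan, e.g. during a merge:

{code}
import java.io.File;

public class SafeDirSize {
  // Sum file sizes, tolerating entries that disappear between the
  // directory listing and the size lookup: File.length() returns 0
  // for a file that no longer exists, so no exception is thrown.
  static long sizeOfDirectorySafe(File dir) {
    long total = 0;
    File[] files = dir.listFiles();
    if (files == null) {
      return 0; // directory itself vanished, or is not a directory
    }
    for (File f : files) {
      if (f.isDirectory()) {
        total += sizeOfDirectorySafe(f);
      } else {
        total += f.length();
      }
    }
    return total;
  }
}
{code}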
[jira] [Resolved] (LUCENE-4049) PrefixQuery (or its superclass MultiTermQuery) ignores index-time boosts
[ https://issues.apache.org/jira/browse/LUCENE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir resolved LUCENE-4049.
-
Resolution: Not A Problem

setRewriteMethod(SCORING_BOOLEAN_QUERY_REWRITE)

PrefixQuery (or its superclass MultiTermQuery) ignores index-time boosts
-
Key: LUCENE-4049
URL: https://issues.apache.org/jira/browse/LUCENE-4049
Project: Lucene - Java
Issue Type: Bug
Components: core/search
Affects Versions: 3.5, 3.6
Environment: Java
Reporter: Michael Wyraz

It is possible to set a boost on fields or documents during indexing, so certain documents can be boosted over others. This works well with TermQuery or FuzzyQuery, but not with PrefixQuery, which ignores the individual values. Test code below:

{code}
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.PrefixQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopScoreDocCollector;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class LuceneTest {
  public static void main(String[] args) throws Exception {
    Directory index = new RAMDirectory();
    StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_35);
    IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_35, analyzer);
    IndexWriter w = new IndexWriter(index, config);
    addDoc(w, "Hello 1", 1);
    addDoc(w, "Hello 2", 2);
    addDoc(w, "Hello 3", 1);
    w.close();

    IndexReader reader = IndexReader.open(index);
    IndexSearcher searcher = new IndexSearcher(reader);
    //Query q = new TermQuery(new Term("f1", "hello"));
    Query q = new PrefixQuery(new Term("f1", "hello"));
    TopScoreDocCollector collector = TopScoreDocCollector.create(10, true);
    searcher.search(q, collector);
    for (ScoreDoc hit : collector.topDocs().scoreDocs) {
      Document d = searcher.doc(hit.doc);
      System.err.println(d.get("f1") + " " + hit.score + " " + hit.doc);
    }
  }

  private static void addDoc(IndexWriter w, String value, float boost) throws IOException {
    Document doc = new Document();
    doc.add(new Field("f1", value, Field.Store.YES, Field.Index.ANALYZED));
    doc.setBoost(boost);
    w.addDocument(doc);
  }
}
{code}
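The resolution in context, as a minimal sketch against the Lucene 3.x API: by default MultiTermQuery rewrites to a constant-score query, which deliberately ignores index-time boosts; switching the rewrite method makes the expanded terms score normally. Note that the scoring rewrite expands into a BooleanQuery, so a prefix matching very many terms can hit BooleanQuery.maxClauseCount.

{code}
import org.apache.lucene.index.Term;
import org.apache.lucene.search.MultiTermQuery;
import org.apache.lucene.search.PrefixQuery;
import org.apache.lucene.search.Query;

public class BoostAwarePrefix {
  // Build a PrefixQuery whose matches are scored like an expanded
  // BooleanQuery of TermQuerys, so index-time boosts take effect.
  static Query boostAwarePrefixQuery(String field, String prefix) {
    PrefixQuery q = new PrefixQuery(new Term(field, prefix));
    q.setRewriteMethod(MultiTermQuery.SCORING_BOOLEAN_QUERY_REWRITE);
    return q;
  }
}
{code}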
[jira] [Updated] (LUCENE-3842) Analyzing Suggester
[ https://issues.apache.org/jira/browse/LUCENE-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-3842:
Attachment: LUCENE-3842.patch

Updated patch: I fixed the bug in tokenStreamToAutomaton (just use lastEndPos instead).

Analyzing Suggester
---
Key: LUCENE-3842
URL: https://issues.apache.org/jira/browse/LUCENE-3842
Project: Lucene - Java
Issue Type: New Feature
Components: modules/spellchecker
Affects Versions: 3.6, 4.0
Reporter: Robert Muir
Attachments: LUCENE-3842-TokenStream_to_Automaton.patch, LUCENE-3842.patch, LUCENE-3842.patch, LUCENE-3842.patch, LUCENE-3842.patch

Since we added shortest-path wFSA search in LUCENE-3714, and generified the comparator in LUCENE-3801, I think we should look at implementing suggesters that have more capabilities than just basic prefix matching. In particular I think the most flexible approach is to integrate with Analyzer at both build and query time, such that we build a wFST with:

input: analyzed text such as ghost0christmas0past -- byte 0 here is an optional token separator
output: surface form such as the ghost of christmas past
weight: the weight of the suggestion

We make an FST with PairOutputs<weight, output>, but only do the shortest-path operation on the weight side (like the test in LUCENE-3801), at the same time accumulating the output (surface form), which will be the actual suggestion. This allows a lot of flexibility:
* Using even StandardAnalyzer means you can offer suggestions that ignore stopwords, e.g. if you type in "ghost of chr...", it will suggest "the ghost of christmas past"
* we can add support for synonyms/wdf/etc at both index and query time (there are tradeoffs here, and this is not implemented!)
* this is a basis for more complicated suggesters such as Japanese suggesters, where the analyzed form is in fact the reading, so we would add a TokenFilter that copies ReadingAttribute into term text to support that...
* other general things like offering suggestions that are more fuzzy, like using a plural stemmer or ignoring accents or whatever.

According to my benchmarks, suggestions are still very fast with the prototype (e.g. ~100,000 QPS), and the FST size does not explode (it's short of twice that of a regular wFST, but this is still far smaller than TST or JaSpell, etc.).
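To illustrate the "analyzed text with byte-0 separators" input described above, here is a hedged sketch of building such an analyzed key from a surface form with the Lucene TokenStream API (the helper name is hypothetical, and the real patch builds an automaton/FST rather than a String):

{code}
import java.io.IOException;
import java.io.StringReader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class AnalyzedKey {
  // Joins the analyzed tokens with a 0 byte, so that under a
  // stopword-removing analyzer "the ghost of christmas past" becomes
  // "ghost\u0000christmas\u0000past" -- the FST input side; the surface
  // form and weight would go into the FST outputs (PairOutputs).
  static String analyzedKey(Analyzer analyzer, String surfaceForm) throws IOException {
    TokenStream ts = analyzer.tokenStream("", new StringReader(surfaceForm));
    CharTermAttribute termAtt = ts.addAttribute(CharTermAttribute.class);
    StringBuilder sb = new StringBuilder();
    ts.reset();
    while (ts.incrementToken()) {
      if (sb.length() > 0) {
        sb.append('\u0000'); // the optional token separator
      }
      sb.append(termAtt);
    }
    ts.end();
    ts.close();
    return sb.toString();
  }
}
{code}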
Re: svn commit: r1337005 - in /lucene/dev/trunk/lucene/test-framework/src/java/org/apache/lucene: index/AlcoholicMergePolicy.java util/LuceneTestCase.java
There is room for improvement :) Yeah, we should introduce an amount and order of drinks. Also the drinking speed is important, and whether non-alcoholic beverages are consumed during the night.

Martijn

On 11 May 2012 12:33, Dawid Weiss dawid.we...@cs.put.poznan.pl wrote:

+ public static enum Drink {
+
+   Beer(15), Wine(17), Champagne(21), WhiteRussian(22),
+   SingleMalt(30);
+
+   public long drunk() {
+     return drunkFactor;
+   }

I think this isn't an independent value. This isn't even a Markov chain, as it doesn't depend on the last state of the observed object and the drink to follow -- the full history of drinks consumed so far would have to be considered; their order and quantities matter (i.e., beer after champagne, singlemalt after beer, etc.). Overflows (or so-called burst points) would certainly have to be empirically established, as there is no theoretical model for them known in the literature...

Dawid

--
With kind regards, Martijn van Groningen
[jira] [Updated] (LUCENE-4026) TestIndexWriterReader hang
[ https://issues.apache.org/jira/browse/LUCENE-4026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer updated LUCENE-4026:
Attachment: LUCENE-4026.patch

Here is a new patch with a testcase that exercises this code more. I got this test to fail 1 in 20K runs with the bug, and it didn't fail with the fix. Still not proof, but better than no test, that's for sure.

TestIndexWriterReader hang
--
Key: LUCENE-4026
URL: https://issues.apache.org/jira/browse/LUCENE-4026
Project: Lucene - Java
Issue Type: Bug
Affects Versions: 4.0
Reporter: Robert Muir
Assignee: Simon Willnauer
Attachments: LUCENE-4026.patch, LUCENE-4026.patch

Hung in Jenkins. Seed is D344294F98D3F637 (and the usual nightly flags, -Dtests.nightly=true, -Dtests.multiplier=3, -Dtests.linedocsfile=huge). Didn't try to reproduce yet.
[jira] [Created] (SOLR-3451) Solrj QueryResponse as JSON response
Pavan Kumar created SOLR-3451:
-
Summary: Solrj QueryResponse as JSON response
Key: SOLR-3451
URL: https://issues.apache.org/jira/browse/SOLR-3451
Project: Solr
Issue Type: Bug
Components: clients - java
Reporter: Pavan Kumar

Hi, I have a requirement to get the SolrJ QueryResponse as a JSON response. As per the SolrJ API, it supports BinaryResponseParser and XMLResponseParser. Is there any way to get the SolrJ QueryResponse as a JSON response?
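SolrJ 3.6 indeed ships only binary and XML response parsers, so there is no QueryResponse-as-JSON. A common workaround (a hedged sketch; the URL and query are placeholders) is to bypass SolrJ and request wt=json over plain HTTP, handling the JSON text yourself:

{code}
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLEncoder;

public class JsonQueryDemo {
  public static void main(String[] args) throws Exception {
    String q = URLEncoder.encode("*:*", "UTF-8");
    // wt=json asks Solr's JSONResponseWriter for the raw JSON text
    URL url = new URL("http://localhost:8983/solr/select?q=" + q + "&wt=json");
    BufferedReader in = new BufferedReader(
        new InputStreamReader(url.openStream(), "UTF-8"));
    StringBuilder json = new StringBuilder();
    for (String line; (line = in.readLine()) != null; ) {
      json.append(line).append('\n');
    }
    in.close();
    System.out.println(json); // the Solr response as JSON
  }
}
{code}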
[jira] [Commented] (LUCENE-4040) Improve QueryParser and supported syntax documentation
[ https://issues.apache.org/jira/browse/LUCENE-4040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273336#comment-13273336 ]

Robert Muir commented on LUCENE-4040:
-

Thanks Mike, I committed this!

Improve QueryParser and supported syntax documentation
--
Key: LUCENE-4040
URL: https://issues.apache.org/jira/browse/LUCENE-4040
Project: Lucene - Java
Issue Type: Improvement
Components: modules/queryparser
Reporter: Chris Male
Priority: Minor
Attachments: LUCENE-4040.patch, LUCENE-4040.patch

In LUCENE-4024 there were some changes to the fuzzy query syntax. Only the Classic QueryParser really documents its syntax, which makes it hard to know whether the changes affected other QPs. Compounding this issue, there are many classes which have no javadocs at all, and I found myself quite confused when I consolidated all the QPs into their module. We should make a concerted effort to improve the documentation so that it is clear what syntax is supported by which QPs, and so that at least the user-facing classes have javadocs. As part of this, I wonder whether we should give the syntax supported by the Classic QueryParser a new name (rather than just "Lucene's query syntax"), since other QPs can and do support other syntax, and then somehow add some typed control over this, so QPs have to declare programmatically that they support the syntax and we can verify that by randomly plugging in implementations into tests.
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 13960 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/13960/

1 tests failed.
FAILED: junit.framework.TestSuite.org.apache.solr.cloud.BasicDistributedZkTest

Error Message: ERROR: SolrIndexSearcher opens=79 closes=77

Stack Trace:
java.lang.AssertionError: ERROR: SolrIndexSearcher opens=79 closes=77
at __randomizedtesting.SeedInfo.seed([1DE6EE9D997C002D]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:212)
at org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:101)
at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1961)
at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132)
at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:742)
at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:63)
at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:75)
at org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:38)
at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:69)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605)
at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132)
at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551)

Build Log (for compile errors): [...truncated 11192 lines...]
[jira] [Updated] (LUCENE-3842) Analyzing Suggester
[ https://issues.apache.org/jira/browse/LUCENE-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-3842:
---
Attachment: LUCENE-3842.patch

Patch, fixing TS2A to insert holes ... this is causing the AnalyzingCompletionTest.testStandard to fail... we have to fix its query-time to insert holes too...

Analyzing Suggester
---
Key: LUCENE-3842
URL: https://issues.apache.org/jira/browse/LUCENE-3842
Project: Lucene - Java
Issue Type: New Feature
Components: modules/spellchecker
Affects Versions: 3.6, 4.0
Reporter: Robert Muir
Attachments: LUCENE-3842-TokenStream_to_Automaton.patch, LUCENE-3842.patch, LUCENE-3842.patch, LUCENE-3842.patch, LUCENE-3842.patch, LUCENE-3842.patch

Since we added shortest-path wFSA search in LUCENE-3714, and generified the comparator in LUCENE-3801, I think we should look at implementing suggesters that have more capabilities than just basic prefix matching. In particular I think the most flexible approach is to integrate with Analyzer at both build and query time, such that we build a wFST with:

input: analyzed text such as ghost0christmas0past -- byte 0 here is an optional token separator
output: surface form such as the ghost of christmas past
weight: the weight of the suggestion

We make an FST with PairOutputs<weight, output>, but only do the shortest-path operation on the weight side (like the test in LUCENE-3801), at the same time accumulating the output (surface form), which will be the actual suggestion. This allows a lot of flexibility:
* Using even StandardAnalyzer means you can offer suggestions that ignore stopwords, e.g. if you type in "ghost of chr...", it will suggest "the ghost of christmas past"
* we can add support for synonyms/wdf/etc at both index and query time (there are tradeoffs here, and this is not implemented!)
* this is a basis for more complicated suggesters such as Japanese suggesters, where the analyzed form is in fact the reading, so we would add a TokenFilter that copies ReadingAttribute into term text to support that...
* other general things like offering suggestions that are more fuzzy, like using a plural stemmer or ignoring accents or whatever.

According to my benchmarks, suggestions are still very fast with the prototype (e.g. ~100,000 QPS), and the FST size does not explode (it's short of twice that of a regular wFST, but this is still far smaller than TST or JaSpell, etc.).
[jira] [Commented] (SOLR-1535) Pre-analyzed field type
[ https://issues.apache.org/jira/browse/SOLR-1535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273408#comment-13273408 ] Neil Hooey commented on SOLR-1535: -- When I asked Hoss at Lucene Revolution yesterday, he said you could manually set _term frequency_ in a pre-analyzed field, but I couldn't find any reference to it in the JSON parser. Is there a way to specify term frequency for each term in the field? Pre-analyzed field type --- Key: SOLR-1535 URL: https://issues.apache.org/jira/browse/SOLR-1535 Project: Solr Issue Type: New Feature Affects Versions: 1.5 Reporter: Andrzej Bialecki Assignee: Andrzej Bialecki Fix For: 4.0 Attachments: SOLR-1535.patch, SOLR-1535.patch, SOLR-1535.patch, preanalyzed.patch, preanalyzed.patch PreAnalyzedFieldType provides a functionality to index (and optionally store) content that was already processed and split into tokens using some external processing chain. This implementation defines a serialization format for sending tokens with any currently supported Attributes (eg. type, posIncr, payload, ...). This data is de-serialized into a regular TokenStream that is returned in Field.tokenStreamValue() and thus added to the index as index terms, and optionally a stored part that is returned in Field.stringValue() and is then added as a stored value of the field. This field type is useful for integrating Solr with existing text-processing pipelines, such as third-party NLP systems.
[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes
[ https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273417#comment-13273417 ]

Oleg Shevelyov commented on SOLR-2155:
--

Hi David, does the new 1.0.5 version include polygon search? If not, could you please clarify where to apply the GeoHashPrefixFilter patch? It doesn't match the Solr 3.1 sources, and obviously not higher versions either. I saw you mention that you successfully implemented polygon search, but I still don't get how to make it work. Thanks

Geospatial search using geohash prefixes
Key: SOLR-2155
URL: https://issues.apache.org/jira/browse/SOLR-2155
Project: Solr
Issue Type: Improvement
Reporter: David Smiley
Assignee: David Smiley
Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, Solr2155-for-1.0.2-3.x-port.patch

{panel:title=NOTICE} The status of this issue is a plugin for Solr 3.x located here: https://github.com/dsmiley/SOLR-2155. Look at the introductory readme and download the plugin .jar file. Lucene 4's new spatial module is largely based on this code. The Solr 4 glue for it should come very soon, but as of this writing it's hosted temporarily at https://github.com/spatial4j. For more information on using SOLR-2155 with Solr 3, see http://wiki.apache.org/solr/SpatialSearch#SOLR-2155 This JIRA issue is closed because it won't be committed in its current form. {panel}

There currently isn't a solution in Solr for doing geospatial filtering on documents that have a variable number of points. This scenario occurs when location extraction (i.e. via a gazetteer) is performed on free text. None, one, or many geospatial locations might be extracted from any given document, and users want to limit their search results to those occurring in a user-specified area. I've implemented this by furthering the geohash-based work in Lucene/Solr with a geohash-prefix-based filter. A geohash refers to a lat-lon box on the earth. Each successive character added further subdivides the box into a 4x8 (or 8x4, depending on the even/odd length of the geohash) grid. The first step in this scheme is figuring out which geohash grid squares cover the user's search query. I've added various extra methods to GeoHashUtils (and added tests) to assist in this purpose. The next step is an actual Lucene Filter, GeoHashPrefixFilter, that uses these geohash prefixes in TermsEnum.seek() to skip to relevant grid squares in the index. Once a matching geohash grid is found, the points therein are compared against the user's query to see if it matches. I created an abstraction GeoShape extended by subclasses named PointDistance... and CartesianBox to support different queried shapes so that the filter need not care about these details. This work was presented at LuceneRevolution in Boston on October 8th.
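To make the subdivision property concrete, here is a hedged sketch of the standard geohash encoding algorithm (illustrative only; not the GeoHashUtils code from the patch). Five bits of alternating longitude/latitude bisections map to one base-32 character, which is why each added character narrows the box and why a point's hash always begins with the hash of any enclosing box -- the prefix property the filter exploits:

{code}
public class GeoHashDemo {
  private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

  // Standard geohash encoding: alternately bisect longitude and latitude,
  // emitting one base-32 character per 5 bits.
  static String encode(double lat, double lon, int precision) {
    double latMin = -90, latMax = 90, lonMin = -180, lonMax = 180;
    StringBuilder hash = new StringBuilder();
    boolean evenBit = true; // a geohash starts with a longitude bit
    int bit = 0, ch = 0;
    while (hash.length() < precision) {
      if (evenBit) {
        double mid = (lonMin + lonMax) / 2;
        if (lon >= mid) { ch = (ch << 1) | 1; lonMin = mid; }
        else            { ch = (ch << 1);     lonMax = mid; }
      } else {
        double mid = (latMin + latMax) / 2;
        if (lat >= mid) { ch = (ch << 1) | 1; latMin = mid; }
        else            { ch = (ch << 1);     latMax = mid; }
      }
      evenBit = !evenBit;
      if (++bit == 5) {
        hash.append(BASE32.charAt(ch));
        bit = 0;
        ch = 0;
      }
    }
    return hash.toString();
  }

  public static void main(String[] args) {
    // Each longer hash names a smaller box containing the same point,
    // and shares the shorter hash as its prefix ("d", "dr", ...).
    for (int len = 1; len <= 7; len++) {
      System.out.println(encode(42.358, -71.060, len));
    }
  }
}
{code}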
[jira] [Commented] (LUCENE-3842) Analyzing Suggester
[ https://issues.apache.org/jira/browse/LUCENE-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273419#comment-13273419 ] Robert Muir commented on LUCENE-3842: - testStandard is also bogus: it has 2 asserts. The first one should pass, but the second one should really only work if you disable position increments in the (mock) stopfilter. Analyzing Suggester --- Key: LUCENE-3842 URL: https://issues.apache.org/jira/browse/LUCENE-3842 Project: Lucene - Java Issue Type: New Feature Components: modules/spellchecker Affects Versions: 3.6, 4.0 Reporter: Robert Muir Attachments: LUCENE-3842-TokenStream_to_Automaton.patch, LUCENE-3842.patch, LUCENE-3842.patch, LUCENE-3842.patch, LUCENE-3842.patch, LUCENE-3842.patch Since we added shortest-path wFSA search in LUCENE-3714, and generified the comparator in LUCENE-3801, I think we should look at implementing suggesters that have more capabilities than just basic prefix matching. In particular I think the most flexible approach is to integrate with Analyzer at both build and query time, such that we build a wFST with: input: analyzed text such as "ghost0christmas0past" -- byte 0 here is an optional token separator output: surface form such as "the ghost of christmas past" weight: the weight of the suggestion We make an FST with PairOutputs<weight,output>, but only do the shortest-path operation on the weight side (like the test in LUCENE-3801), at the same time accumulating the output (surface form), which will be the actual suggestion. This allows a lot of flexibility: * Using even StandardAnalyzer means you can offer suggestions that ignore stopwords, e.g. if you type in "ghost of chr...", it will suggest "the ghost of christmas past" * we can add support for synonyms/wdf/etc at both index and query time (there are tradeoffs here, and this is not implemented!) * this is a basis for more complicated suggesters such as Japanese suggesters, where the analyzed form is in fact the reading, so we would add a TokenFilter that copies ReadingAttribute into term text to support that... * other general things like offering suggestions that are more fuzzy, like using a plural stemmer or ignoring accents or whatever. According to my benchmarks, suggestions are still very fast with the prototype (e.g. ~ 100,000 QPS), and the FST size does not explode (it's short of twice that of a regular wFST, but this is still far smaller than TST or JaSpell, etc). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
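As a rough illustration of the build-time input described in the issue (a hand-written sketch, not the patch's API): the analyzed key is obtained by running the surface form through the Analyzer and joining the surviving tokens with a 0-byte separator; the suggester then stores key -> (weight, surface form) in the wFST and runs shortest-path on the weight side.

{code}
import java.io.IOException;
import java.io.StringReader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class SuggestKeyDemo {
  // Joins the analyzed tokens of `surface` with \u0000, so that an analyzer
  // dropping stopwords turns "the ghost of christmas past" into
  // "ghost\u0000christmas\u0000past" -- the wFST input side of the pairing.
  static String analyzedKey(Analyzer analyzer, String surface) throws IOException {
    TokenStream ts = analyzer.tokenStream("f", new StringReader(surface));
    CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
    StringBuilder key = new StringBuilder();
    ts.reset();
    while (ts.incrementToken()) {
      if (key.length() > 0) key.append('\u0000');
      key.append(term);
    }
    ts.end();
    ts.close();
    return key.toString();
  }
}
{code}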
[JENKINS] Lucene-Solr-tests-only-trunk-java7 - Build # 2507 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk-java7/2507/ 1 tests failed. REGRESSION: org.apache.solr.cloud.RecoveryZkTest.testDistribSearch Error Message: Thread threw an uncaught exception, thread: Thread[Lucene Merge Thread #1,6,] Stack Trace: java.lang.RuntimeException: Thread threw an uncaught exception, thread: Thread[Lucene Merge Thread #1,6,] at com.carrotsearch.randomizedtesting.RunnerThreadGroup.processUncaught(RunnerThreadGroup.java:96) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:849) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:688) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:724) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:735) at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:63) at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:75) at org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:38) at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:69) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551) Caused by: org.apache.lucene.index.MergePolicy$MergeException: org.apache.lucene.store.AlreadyClosedException: this Directory is closed at __randomizedtesting.SeedInfo.seed([FB767E191B0392E0]:0) at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:507) at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:480) Caused by: org.apache.lucene.store.AlreadyClosedException: this Directory is closed at org.apache.lucene.store.Directory.ensureOpen(Directory.java:244) at org.apache.lucene.store.FSDirectory.listAll(FSDirectory.java:241) at org.apache.lucene.index.IndexFileDeleter.refresh(IndexFileDeleter.java:345) at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3019) at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:382) at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:451) Build Log (for compile errors): [...truncated 12364 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-2371) Update fileformats spec to match how flex's standard codec writes terms
[ https://issues.apache.org/jira/browse/LUCENE-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-2371. - Resolution: Fixed Update fileformats spec to match how flex's standard codec writes terms --- Key: LUCENE-2371 URL: https://issues.apache.org/jira/browse/LUCENE-2371 Project: Lucene - Java Issue Type: Bug Components: general/website Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 4.0 The standard codec changes how the terms index is written (e.g. uses packed ints, writes a whole field's terms at once, etc.)... we have to fix the file formats on the web site to match. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-3374) HttpClient jar not included in distribution
[ https://issues.apache.org/jira/browse/SOLR-3374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved SOLR-3374. -- Resolution: Fixed HttpClient jar not included in distribution --- Key: SOLR-3374 URL: https://issues.apache.org/jira/browse/SOLR-3374 Project: Solr Issue Type: Improvement Components: clients - java Affects Versions: 3.6 Reporter: Roger Håkansson Assignee: Sami Siren Priority: Minor Fix For: 3.6.1 Attachments: SOLR-3374.patch In 3.6, CommonsHttpSolrServer is deprecated in favor of HttpSolrServer; however, in the distribution under solrj-lib, none of the required jar files for HttpClient 4.x are included -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 13963 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/13963/ 1 tests failed. REGRESSION: org.apache.solr.update.SoftAutoCommitTest.testSoftAndHardCommitMaxTimeDelete Error Message: searcher529 wasn't soon enough after soft529: 1336760134608 ! 1336760134169 + 100 (fudge) Stack Trace: java.lang.AssertionError: searcher529 wasn't soon enough after soft529: 1336760134608 ! 1336760134169 + 100 (fudge) at __randomizedtesting.SeedInfo.seed([D97244FD57C53365:1E3EFC604C6DFED5]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.apache.solr.update.SoftAutoCommitTest.testSoftAndHardCommitMaxTimeDelete(SoftAutoCommitTest.java:250) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1961) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:806) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:867) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:881) at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:63) at org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:774) at org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:696) at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:69) at org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:629) at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:75) at org.apache.lucene.util.LuceneTestCase$SaveThreadAndTestNameRule$1.evaluate(LuceneTestCase.java:668) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:813) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:688) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:724) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:735) at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:63) at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:75) at org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:38) at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:69) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551) Build Log (for compile errors): [...truncated 10378 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-1535) Pre-analyzed field type
[ https://issues.apache.org/jira/browse/SOLR-1535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273501#comment-13273501 ] Andrzej Bialecki commented on SOLR-1535: - Hoss was wrong :) There is no way to do this, as there is no way to do this in TokenStream - you should view the PreAnalyzed field type as a serialized TokenStream (with the added functionality of specifying the stored part independently). Pre-analyzed field type --- Key: SOLR-1535 URL: https://issues.apache.org/jira/browse/SOLR-1535 Project: Solr Issue Type: New Feature Affects Versions: 1.5 Reporter: Andrzej Bialecki Assignee: Andrzej Bialecki Fix For: 4.0 Attachments: SOLR-1535.patch, SOLR-1535.patch, SOLR-1535.patch, preanalyzed.patch, preanalyzed.patch PreAnalyzedFieldType provides functionality to index (and optionally store) content that was already processed and split into tokens using some external processing chain. This implementation defines a serialization format for sending tokens with any currently supported Attributes (e.g. type, posIncr, payload, ...). This data is de-serialized into a regular TokenStream that is returned in Field.tokenStreamValue() and thus added to the index as index terms, and optionally a stored part that is returned in Field.stringValue() and is then added as a stored value of the field. This field type is useful for integrating Solr with existing text-processing pipelines, such as third-party NLP systems. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
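For context, the serialized form the comment refers to looks roughly like the following JSON (an illustrative, hand-written sample; the key names follow the PreAnalyzedField wiki page referenced in the edited comment below - t = term text, s/e = start/end offset, i = position increment - and should be checked against the actual parser):

{code}
{
  "v": "1",
  "str": "hello world",
  "tokens": [
    {"t": "hello", "s": 0, "e": 5, "i": 1},
    {"t": "world", "s": 6, "e": 11, "i": 1}
  ]
}
{code}

Note that there is no per-token frequency attribute in this format, which is exactly the limitation described above.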
[jira] [Commented] (SOLR-3221) Make Shard handler threadpool configurable
[ https://issues.apache.org/jira/browse/SOLR-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273514#comment-13273514 ] Markus Jelsma commented on SOLR-3221: - I would agree that latency is preferred as the default. Make Shard handler threadpool configurable -- Key: SOLR-3221 URL: https://issues.apache.org/jira/browse/SOLR-3221 Project: Solr Issue Type: Improvement Affects Versions: 3.6, 4.0 Reporter: Greg Bowyer Assignee: Erick Erickson Labels: distributed, http, shard Fix For: 3.6, 4.0 Attachments: SOLR-3221-3x_branch.patch, SOLR-3221-3x_branch.patch, SOLR-3221-3x_branch.patch, SOLR-3221-3x_branch.patch, SOLR-3221-3x_branch.patch, SOLR-3221-trunk.patch, SOLR-3221-trunk.patch, SOLR-3221-trunk.patch, SOLR-3221-trunk.patch, SOLR-3221-trunk.patch From profiling of monitor contention, as well as observations of the 95th and 99th response times for nodes that perform distributed search (or "aggregator" nodes), it would appear that the HttpShardHandler code currently does a suboptimal job of managing outgoing shard-level requests. Presently the code contained within Lucene 3.5's SearchHandler and Lucene trunk / 3x's ShardHandlerFactory creates arbitrary threads in order to service distributed search requests. This is done presently to limit the size of the threadpool such that it does not consume resources in deployment configurations that do not use distributed search. This unfortunately has two impacts on the response time if the node coordinating the distribution is under high load. The usage of the MaxConnectionsPerHost configuration option results in aggressive activity on semaphores within HttpCommons; it has been observed that the aggregator can have a response time far greater than that of the searchers. The above monitor contention would appear to suggest that in some cases it's possible for liveness issues to occur and for simple queries to be starved of resources simply due to a lack of attention from the viewpoint of context switching, with, as mentioned above, the HttpCommons connections being hotly contended. The fair, queue-based configuration eliminates this, at the cost of throughput. This patch aims to make the threadpool largely configurable, allowing those using Solr to choose the throughput vs. latency balance they desire. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
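For readers wanting the shape of the knobs involved, the configuration this issue introduces looks roughly like the snippet below in solrconfig.xml (all values are illustrative; the parameter names follow the patch and should be verified against the shipped documentation):

{code}
<shardHandlerFactory class="HttpShardHandlerFactory">
  <!-- HTTP settings for outgoing shard requests -->
  <int name="socketTimeout">1000</int>
  <int name="connTimeout">5000</int>
  <int name="maxConnectionsPerHost">20</int>
  <!-- threadpool sizing: keep no idle threads, grow on demand -->
  <int name="corePoolSize">0</int>
  <int name="maximumPoolSize">10</int>
  <int name="maxThreadIdleTime">5</int>
  <!-- -1 = direct hand-off (favors latency); a positive size queues
       requests through a fixed pool (favors throughput) -->
  <int name="sizeOfQueue">-1</int>
  <bool name="fairnessPolicy">false</bool>
</shardHandlerFactory>
{code}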
[jira] [Issue Comment Edited] (SOLR-1535) Pre-analyzed field type
[ https://issues.apache.org/jira/browse/SOLR-1535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273501#comment-13273501 ] Andrzej Bialecki edited comment on SOLR-1535 at 5/11/12 7:20 PM: -- Hoss was wrong :) There is no way to do this, as there is no way to do this in TokenStream - you should view the PreAnalyzed field type as a serialized TokenStream (with the added functionality of specifying the stored part independently). Edit: I started adding some documentation to http://wiki.apache.org/solr/PreAnalyzedField . was (Author: ab): Hoss was wrong :) There is no way to do this, as there is no way to do this in TokenStream - you should view the PreAnalyzed field type as a serialized TokenStream (with the added functionality of specifying the stored part independently). Pre-analyzed field type --- Key: SOLR-1535 URL: https://issues.apache.org/jira/browse/SOLR-1535 Project: Solr Issue Type: New Feature Affects Versions: 1.5 Reporter: Andrzej Bialecki Assignee: Andrzej Bialecki Fix For: 4.0 Attachments: SOLR-1535.patch, SOLR-1535.patch, SOLR-1535.patch, preanalyzed.patch, preanalyzed.patch PreAnalyzedFieldType provides functionality to index (and optionally store) content that was already processed and split into tokens using some external processing chain. This implementation defines a serialization format for sending tokens with any currently supported Attributes (e.g. type, posIncr, payload, ...). This data is de-serialized into a regular TokenStream that is returned in Field.tokenStreamValue() and thus added to the index as index terms, and optionally a stored part that is returned in Field.stringValue() and is then added as a stored value of the field. This field type is useful for integrating Solr with existing text-processing pipelines, such as third-party NLP systems. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4049) PrefixQuery (or it's superclass MultiTermQuery) ignores index time boosts
[ https://issues.apache.org/jira/browse/LUCENE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273525#comment-13273525 ] Michael Wyraz commented on LUCENE-4049: --- Robert, could you please explain it a bit more (maybe with the code above)? I wonder why PrefixQuery behaves unlike the other query types there. PrefixQuery (or it's superclass MultiTermQuery) ignores index time boosts - Key: LUCENE-4049 URL: https://issues.apache.org/jira/browse/LUCENE-4049 Project: Lucene - Java Issue Type: Bug Components: core/search Affects Versions: 3.5, 3.6 Environment: Java Reporter: Michael Wyraz It is possible to set a boost on fields or documents during indexing, so certain documents can be boosted over others. This works well with TermQuery or FuzzyQuery but not with PrefixQuery, which ignores the individual values. Test Code below:
import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.PrefixQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopScoreDocCollector;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;
import com.evermind.tools.calendar.StopWatch;

public class LuceneTest {
    public static void main(String[] args) throws Exception {
        Directory index = new RAMDirectory();
        StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_35);
        IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_35, analyzer);
        IndexWriter w = new IndexWriter(index, config);
        addDoc(w, "Hello 1", 1);
        addDoc(w, "Hello 2", 2);
        addDoc(w, "Hello 3", 1);
        w.close();
        StopWatch.stop();
        IndexReader reader = IndexReader.open(index);
        IndexSearcher searcher = new IndexSearcher(reader);
        //Query q = new TermQuery(new Term("f1", "hello"));
        Query q = new PrefixQuery(new Term("f1", "hello"));
        TopScoreDocCollector collector = TopScoreDocCollector.create(10, true);
        searcher.search(q, collector);
        for (ScoreDoc hit : collector.topDocs().scoreDocs) {
            Document d = searcher.doc(hit.doc);
            System.err.println(d.get("f1") + " " + hit.score + " " + hit.doc);
        }
    }

    private static void addDoc(IndexWriter w, String value, float boost) throws IOException {
        Document doc = new Document();
        doc.add(new Field("f1", value, Field.Store.YES, Field.Index.ANALYZED));
        doc.setBoost(boost);
        w.addDocument(doc);
    }
}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4049) PrefixQuery (or it's superclass MultiTermQuery) ignores index time boosts
[ https://issues.apache.org/jira/browse/LUCENE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273536#comment-13273536 ] Simon Willnauer commented on LUCENE-4049: - The default rewrite method in PrefixQuery / MTQ is ConstantScore, i.e. it will create a constant-score query returning 1.0f for all matching documents. If you want a real scoring query, change the rewrite method (MultiTermQuery#setRewriteMethod()) to MultiTermQuery#SCORING_BOOLEAN_QUERY_REWRITE. That is, btw, true for all MTQ subclasses. PrefixQuery (or it's superclass MultiTermQuery) ignores index time boosts - Key: LUCENE-4049 URL: https://issues.apache.org/jira/browse/LUCENE-4049 Project: Lucene - Java Issue Type: Bug Components: core/search Affects Versions: 3.5, 3.6 Environment: Java Reporter: Michael Wyraz It is possible to set a boost on fields or documents during indexing, so certain documents can be boosted over others. This works well with TermQuery or FuzzyQuery but not with PrefixQuery, which ignores the individual values. Test Code below:
import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.PrefixQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopScoreDocCollector;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;
import com.evermind.tools.calendar.StopWatch;

public class LuceneTest {
    public static void main(String[] args) throws Exception {
        Directory index = new RAMDirectory();
        StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_35);
        IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_35, analyzer);
        IndexWriter w = new IndexWriter(index, config);
        addDoc(w, "Hello 1", 1);
        addDoc(w, "Hello 2", 2);
        addDoc(w, "Hello 3", 1);
        w.close();
        StopWatch.stop();
        IndexReader reader = IndexReader.open(index);
        IndexSearcher searcher = new IndexSearcher(reader);
        //Query q = new TermQuery(new Term("f1", "hello"));
        Query q = new PrefixQuery(new Term("f1", "hello"));
        TopScoreDocCollector collector = TopScoreDocCollector.create(10, true);
        searcher.search(q, collector);
        for (ScoreDoc hit : collector.topDocs().scoreDocs) {
            Document d = searcher.doc(hit.doc);
            System.err.println(d.get("f1") + " " + hit.score + " " + hit.doc);
        }
    }

    private static void addDoc(IndexWriter w, String value, float boost) throws IOException {
        Document doc = new Document();
        doc.add(new Field("f1", value, Field.Store.YES, Field.Index.ANALYZED));
        doc.setBoost(boost);
        w.addDocument(doc);
    }
}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
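Applied to the test code above, Simon's suggestion amounts to one extra line before searching (a sketch; note that SCORING_BOOLEAN_QUERY_REWRITE rewrites to a BooleanQuery, so it is subject to the maxClauseCount limit on fields with many distinct terms):

{code}
import org.apache.lucene.search.MultiTermQuery;

PrefixQuery q = new PrefixQuery(new Term("f1", "hello"));
// switch from the constant-score default to a scoring BooleanQuery rewrite,
// so index-time boosts influence the hit scores again
q.setRewriteMethod(MultiTermQuery.SCORING_BOOLEAN_QUERY_REWRITE);
{code}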
[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes
[ https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273540#comment-13273540 ] David Smiley commented on SOLR-2155: Hi Oleg. No, it was stripped out a long while ago. But come to think of it, now that this issue isn't going to get committed and is also hosted somewhere outside Apache (it's on GitHub), I can re-introduce the polygon support that was formerly there. It's not a priority for me right now, but if you find the last .patch file on this issue that includes the JTS support (in a comment above I mentioned stripping it out, so you can grab the version prior to that), then you could resurrect it. There was just one source file, plus a small hook into my query parser above. JTS did all the work, really. If you want to try and bring it back, then do so and send me a pull-request on GitHub. All said and done, it's a very small amount of work; the integration was done, it just needs to be brought back. Geospatial search using geohash prefixes Key: SOLR-2155 URL: https://issues.apache.org/jira/browse/SOLR-2155 Project: Solr Issue Type: Improvement Reporter: David Smiley Assignee: David Smiley Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, Solr2155-for-1.0.2-3.x-port.patch {panel:title=NOTICE} The status of this issue is a plugin for Solr 3.x located here: https://github.com/dsmiley/SOLR-2155. Look at the introductory readme and download the plugin .jar file. Lucene 4's new spatial module is largely based on this code. The Solr 4 glue for it should come very soon but as of this writing it's hosted temporarily at https://github.com/spatial4j. For more information on using SOLR-2155 with Solr 3, see http://wiki.apache.org/solr/SpatialSearch#SOLR-2155 This JIRA issue is closed because it won't be committed in its current form. {panel} There currently isn't a solution in Solr for doing geospatial filtering on documents that have a variable number of points. This scenario occurs when there is location extraction (e.g. via a gazetteer) occurring on free text. None, one, or many geospatial locations might be extracted from any given document and users want to limit their search results to those occurring in a user-specified area. I've implemented this by furthering the GeoHash-based work in Lucene/Solr with a geohash-prefix-based filter. A geohash refers to a lat-lon box on the earth. Each successive character added further subdivides the box into a 4x8 (or 8x4, depending on the even/odd length of the geohash) grid. The first step in this scheme is figuring out which geohash grid squares cover the user's search query. I've added various extra methods to GeoHashUtils (and added tests) to assist in this purpose. The next step is an actual Lucene Filter, GeoHashPrefixFilter, that uses these geohash prefixes in TermsEnum.seek() to skip to relevant grid squares in the index. Once a matching geohash grid is found, the points therein are compared against the user's query to see if it matches. I created an abstraction GeoShape extended by subclasses named PointDistance... and CartesianBox to support different queried shapes so that the filter need not care about these details. This work was presented at LuceneRevolution in Boston on October 8th. 
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4049) PrefixQuery (or it's superclass MultiTermQuery) ignores index time boosts
[ https://issues.apache.org/jira/browse/LUCENE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273572#comment-13273572 ] Michael Wyraz commented on LUCENE-4049: --- Thank you, this solved the problem. But I had to set it explicitly for PrefixQuery, so this is not the default there. PrefixQuery (or it's superclass MultiTermQuery) ignores index time boosts - Key: LUCENE-4049 URL: https://issues.apache.org/jira/browse/LUCENE-4049 Project: Lucene - Java Issue Type: Bug Components: core/search Affects Versions: 3.5, 3.6 Environment: Java Reporter: Michael Wyraz It is possible to set a boost on fields or documents during indexing, so certain documents can be boosted over others. This works well with TermQuery or FuzzyQuery but not with PrefixQuery, which ignores the individual values. Test Code below:
import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.PrefixQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopScoreDocCollector;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;
import com.evermind.tools.calendar.StopWatch;

public class LuceneTest {
    public static void main(String[] args) throws Exception {
        Directory index = new RAMDirectory();
        StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_35);
        IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_35, analyzer);
        IndexWriter w = new IndexWriter(index, config);
        addDoc(w, "Hello 1", 1);
        addDoc(w, "Hello 2", 2);
        addDoc(w, "Hello 3", 1);
        w.close();
        StopWatch.stop();
        IndexReader reader = IndexReader.open(index);
        IndexSearcher searcher = new IndexSearcher(reader);
        //Query q = new TermQuery(new Term("f1", "hello"));
        Query q = new PrefixQuery(new Term("f1", "hello"));
        TopScoreDocCollector collector = TopScoreDocCollector.create(10, true);
        searcher.search(q, collector);
        for (ScoreDoc hit : collector.topDocs().scoreDocs) {
            Document d = searcher.doc(hit.doc);
            System.err.println(d.get("f1") + " " + hit.score + " " + hit.doc);
        }
    }

    private static void addDoc(IndexWriter w, String value, float boost) throws IOException {
        Document doc = new Document();
        doc.add(new Field("f1", value, Field.Store.YES, Field.Index.ANALYZED));
        doc.setBoost(boost);
        w.addDocument(doc);
    }
}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Issue Comment Edited] (LUCENE-4049) PrefixQuery (or it's superclass MultiTermQuery) ignores index time boosts
[ https://issues.apache.org/jira/browse/LUCENE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273572#comment-13273572 ] Michael Wyraz edited comment on LUCENE-4049 at 5/11/12 8:18 PM: Thank you, this solved the problem. was (Author: mich...@wyraz.de): Thank you, this solved the problem. But I had to set it explicitly for PrefixQuery, so this is not the default there. PrefixQuery (or it's superclass MultiTermQuery) ignores index time boosts - Key: LUCENE-4049 URL: https://issues.apache.org/jira/browse/LUCENE-4049 Project: Lucene - Java Issue Type: Bug Components: core/search Affects Versions: 3.5, 3.6 Environment: Java Reporter: Michael Wyraz It is possible to set a boost on fields or documents during indexing, so certain documents can be boosted over others. This works well with TermQuery or FuzzyQuery but not with PrefixQuery, which ignores the individual values. Test Code below:
import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.PrefixQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopScoreDocCollector;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;
import com.evermind.tools.calendar.StopWatch;

public class LuceneTest {
    public static void main(String[] args) throws Exception {
        Directory index = new RAMDirectory();
        StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_35);
        IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_35, analyzer);
        IndexWriter w = new IndexWriter(index, config);
        addDoc(w, "Hello 1", 1);
        addDoc(w, "Hello 2", 2);
        addDoc(w, "Hello 3", 1);
        w.close();
        StopWatch.stop();
        IndexReader reader = IndexReader.open(index);
        IndexSearcher searcher = new IndexSearcher(reader);
        //Query q = new TermQuery(new Term("f1", "hello"));
        Query q = new PrefixQuery(new Term("f1", "hello"));
        TopScoreDocCollector collector = TopScoreDocCollector.create(10, true);
        searcher.search(q, collector);
        for (ScoreDoc hit : collector.topDocs().scoreDocs) {
            Document d = searcher.doc(hit.doc);
            System.err.println(d.get("f1") + " " + hit.score + " " + hit.doc);
        }
    }

    private static void addDoc(IndexWriter w, String value, float boost) throws IOException {
        Document doc = new Document();
        doc.add(new Field("f1", value, Field.Store.YES, Field.Index.ANALYZED));
        doc.setBoost(boost);
        w.addDocument(doc);
    }
}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-723) SolrCore aliasing/swapping may lead to confusing JMX
[ https://issues.apache.org/jira/browse/SOLR-723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Bowyer updated SOLR-723: - Attachment: (was: SOLR-723-solr-core-swap-JMX-issues-lucene_3x.patch) SolrCore aliasing/swapping may lead to confusing JMX -- Key: SOLR-723 URL: https://issues.apache.org/jira/browse/SOLR-723 Project: Solr Issue Type: Bug Affects Versions: 1.3 Reporter: Henri Biestro Priority: Minor Attachments: SOLR-723-solr-core-swap-JMX-issues-lucene_3x.patch As mentioned by Yonik in SOLR-647, JMX registers the core with its name. After swapping or re-aliasing the core, the JMX tracking name does not correspond to the actual core anymore. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-723) SolrCore aliasing/swapping may lead to confusing JMX
[ https://issues.apache.org/jira/browse/SOLR-723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Bowyer updated SOLR-723: - Attachment: SOLR-723-solr-core-swap-JMX-issues-lucene_3x.patch SolrCore aliasing/swapping may lead to confusing JMX -- Key: SOLR-723 URL: https://issues.apache.org/jira/browse/SOLR-723 Project: Solr Issue Type: Bug Affects Versions: 1.3 Reporter: Henri Biestro Priority: Minor Attachments: SOLR-723-solr-core-swap-JMX-issues-lucene_3x.patch As mentioned by Yonik in SOLR-647, JMX registers the core with its name. After swapping or re-aliasing the core, the JMX tracking name does not correspond to the actual core anymore. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-723) SolrCore aliasing/swapping may lead to confusing JMX
[ https://issues.apache.org/jira/browse/SOLR-723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Bowyer updated SOLR-723: - Attachment: (was: SOLR-723-solr-core-swap-JMX-issues-lucene_3x.patch) SolrCore aliasing/swapping may lead to confusing JMX -- Key: SOLR-723 URL: https://issues.apache.org/jira/browse/SOLR-723 Project: Solr Issue Type: Bug Affects Versions: 1.3 Reporter: Henri Biestro Priority: Minor Attachments: SOLR-723-solr-core-swap-JMX-issues-lucene_3x.patch As mentioned by Yonik in SOLR-647, JMX registers the core with its name. After swapping or re-aliasing the core, the JMX tracking name does not correspond to the actual core anymore. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3489) Refactor test classes that use assumeFalse(codec != SimpleText, Memory) to use new annotation and move the expensive methods to separate classes
[ https://issues.apache.org/jira/browse/LUCENE-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-3489: Attachment: LUCENE-3489.patch Attached is a patch generalizing the UseNoExpensiveMemory annotation to @AvoidCodecs, which takes a list of codecs to avoid. This way, tests that cannot work with the Lucene3x codec can just avoid it, using another codec, rather than assuming (in general, it's bad that many of the tests of actual new functionality often don't run at all because of the current assumes) Refactor test classes that use assumeFalse(codec != SimpleText, Memory) to use new annotation and move the expensive methods to separate classes Key: LUCENE-3489 URL: https://issues.apache.org/jira/browse/LUCENE-3489 Project: Lucene - Java Issue Type: Test Components: general/test Affects Versions: 4.0 Reporter: Uwe Schindler Fix For: 4.1 Attachments: LUCENE-3489.patch Followup for LUCENE-3463. TODO: - Move test-methods that need the new @UseNoMemoryExpensiveCodec annotation to separate classes - Eliminate the assumeFalse-calls that check the current codec and disable the test if SimpleText or Memory is used -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 13970 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/13970/ 1 tests failed. FAILED: junit.framework.TestSuite.org.apache.solr.cloud.BasicDistributedZkTest Error Message: ERROR: SolrIndexSearcher opens=80 closes=78 Stack Trace: java.lang.AssertionError: ERROR: SolrIndexSearcher opens=80 closes=78 at __randomizedtesting.SeedInfo.seed([35A799C36F061472]:0) at org.junit.Assert.fail(Assert.java:93) at org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:212) at org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:101) at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1961) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:742) at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:63) at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:75) at org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:38) at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:69) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551) Build Log (for compile errors): [...truncated 11327 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3171) BlockJoinQuery/Collector
[ https://issues.apache.org/jira/browse/LUCENE-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273759#comment-13273759 ] David Webb commented on LUCENE-3171: Is there a wiki page on how to use this? I need to implement an index with nested docs, and an example scheme and query would be awesome. Thanks! BlockJoinQuery/Collector Key: LUCENE-3171 URL: https://issues.apache.org/jira/browse/LUCENE-3171 Project: Lucene - Java Issue Type: Improvement Components: modules/other Reporter: Michael McCandless Fix For: 3.4, 4.0 Attachments: LUCENE-3171.patch, LUCENE-3171.patch, LUCENE-3171.patch I created a single-pass Query + Collector to implement nested docs. The approach is similar to LUCENE-2454, in that the app must index documents in join order, as a block (IW.add/updateDocuments), with the parent doc at the end of the block, except that this impl is one pass. Once you join at indexing time, you can take any query that matches child docs and join it up to the parent docID space, using BlockJoinQuery. You then use BlockJoinCollector, which sorts parent docs by provided Sort, to gather results, grouped by parent; this collector finds any BlockJoinQuerys (using Scorer.visitScorers) and retains the child docs corresponding to each collected parent doc. After searching is done, you retrieve the TopGroups from a provided BlockJoinQuery. Like LUCENE-2454, this is less general than the arbitrary joins in Solr (SOLR-2272) or parent/child from ElasticSearch (https://github.com/elasticsearch/elasticsearch/issues/553), since you must do the join at indexing time as a doc block, but it should be able to handle nested joins as well as joins to multiple tables, though I don't yet have test cases for these. I put this in a new Join module (modules/join); I think as we refactor join impls we should put them here. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
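For anyone landing here before a wiki page exists, a minimal sketch of the flow described in the issue (class and method names as given there; exact signatures may differ between versions, and the "type:parent" marker field is an assumption of this example, not part of the API):

{code}
// index one block: children first, parent last (IndexWriter.addDocuments)
List<Document> block = new ArrayList<Document>();
Document child = new Document();
child.add(new Field("skill", "java", Field.Store.NO, Field.Index.NOT_ANALYZED));
block.add(child);
Document parent = new Document();
parent.add(new Field("type", "parent", Field.Store.NO, Field.Index.NOT_ANALYZED));
block.add(parent);                     // parent doc must be last in the block
writer.addDocuments(block);

// search: join a child-level query up to the parent docID space
Filter parents = new CachingWrapperFilter(
    new QueryWrapperFilter(new TermQuery(new Term("type", "parent"))));
Query childQuery = new TermQuery(new Term("skill", "java"));
BlockJoinQuery join = new BlockJoinQuery(childQuery, parents, BlockJoinQuery.ScoreMode.Max);

// collect parents sorted as desired, with their matching children retained
BlockJoinCollector collector = new BlockJoinCollector(Sort.RELEVANCE, 10, true, false);
searcher.search(join, collector);
TopGroups<Integer> groups = collector.getTopGroups(join, null, 0, 10, 0, true);
{code}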
[jira] [Commented] (SOLR-3221) Make Shard handler threadpool configurable
[ https://issues.apache.org/jira/browse/SOLR-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273799#comment-13273799 ] Greg Bowyer commented on SOLR-3221: --- I agree, I was being cowardly when I wrote it because I am not a committer :D Make Shard handler threadpool configurable -- Key: SOLR-3221 URL: https://issues.apache.org/jira/browse/SOLR-3221 Project: Solr Issue Type: Improvement Affects Versions: 3.6, 4.0 Reporter: Greg Bowyer Assignee: Erick Erickson Labels: distributed, http, shard Fix For: 3.6, 4.0 Attachments: SOLR-3221-3x_branch.patch, SOLR-3221-3x_branch.patch, SOLR-3221-3x_branch.patch, SOLR-3221-3x_branch.patch, SOLR-3221-3x_branch.patch, SOLR-3221-trunk.patch, SOLR-3221-trunk.patch, SOLR-3221-trunk.patch, SOLR-3221-trunk.patch, SOLR-3221-trunk.patch From profiling of monitor contention, as well as observations of the 95th and 99th response times for nodes that perform distributed search (or "aggregator" nodes), it would appear that the HttpShardHandler code currently does a suboptimal job of managing outgoing shard-level requests. Presently the code contained within Lucene 3.5's SearchHandler and Lucene trunk / 3x's ShardHandlerFactory creates arbitrary threads in order to service distributed search requests. This is done presently to limit the size of the threadpool such that it does not consume resources in deployment configurations that do not use distributed search. This unfortunately has two impacts on the response time if the node coordinating the distribution is under high load. The usage of the MaxConnectionsPerHost configuration option results in aggressive activity on semaphores within HttpCommons; it has been observed that the aggregator can have a response time far greater than that of the searchers. The above monitor contention would appear to suggest that in some cases it's possible for liveness issues to occur and for simple queries to be starved of resources simply due to a lack of attention from the viewpoint of context switching, with, as mentioned above, the HttpCommons connections being hotly contended. The fair, queue-based configuration eliminates this, at the cost of throughput. This patch aims to make the threadpool largely configurable, allowing those using Solr to choose the throughput vs. latency balance they desire. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3489) Refactor test classes that use assumeFalse(codec != SimpleText, Memory) to use new annotation and move the expensive methods to separate classes
[ https://issues.apache.org/jira/browse/LUCENE-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273811#comment-13273811 ] Uwe Schindler commented on LUCENE-3489: --- I like the annotation. Can we maybe change it to look like @SuppressWarnings, so it does not need codecs={}, or, if there is only one codec, no {} at all? Should not be too hard? Otherwise strong +1! Refactor test classes that use assumeFalse(codec != SimpleText, Memory) to use new annotation and move the expensive methods to separate classes Key: LUCENE-3489 URL: https://issues.apache.org/jira/browse/LUCENE-3489 Project: Lucene - Java Issue Type: Test Components: general/test Affects Versions: 4.0 Reporter: Uwe Schindler Fix For: 4.1 Attachments: LUCENE-3489.patch Followup for LUCENE-3463. TODO: - Move test-methods that need the new @UseNoMemoryExpensiveCodec annotation to separate classes - Eliminate the assumeFalse-calls that check the current codec and disable the test if SimpleText or Memory is used -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3489) Refactor test classes that use assumeFalse(codec != SimpleText, Memory) to use new annotation and move the expensive methods to separate classes
[ https://issues.apache.org/jira/browse/LUCENE-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273812#comment-13273812 ] Uwe Schindler commented on LUCENE-3489: --- It's easy, just rename codecs to String[] value and you are done. After that you can use @AvoidCodecs("SimpleText") or @AvoidCodecs({"SimpleText","Lucene3x"}) Refactor test classes that use assumeFalse(codec != SimpleText, Memory) to use new annotation and move the expensive methods to separate classes Key: LUCENE-3489 URL: https://issues.apache.org/jira/browse/LUCENE-3489 Project: Lucene - Java Issue Type: Test Components: general/test Affects Versions: 4.0 Reporter: Uwe Schindler Fix For: 4.1 Attachments: LUCENE-3489.patch Followup for LUCENE-3463. TODO: - Move test-methods that need the new @UseNoMemoryExpensiveCodec annotation to separate classes - Eliminate the assumeFalse-calls that check the current codec and disable the test if SimpleText or Memory is used -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Issue Comment Edited] (LUCENE-3489) Refactor test classes that use assumeFalse(codec != SimpleText, Memory) to use new annotation and move the expensive methods to separate classes
[ https://issues.apache.org/jira/browse/LUCENE-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273812#comment-13273812 ] Uwe Schindler edited comment on LUCENE-3489 at 5/12/12 2:21 AM: It's easy, just rename codecs to String[] value and you are done. After that you can use @AvoidCodecs("SimpleText") or @AvoidCodecs({"SimpleText","Lucene3x"}) See: http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/SuppressWarnings.html was (Author: thetaphi): It's easy, just rename codecs to String[] value and you are done. After that you can use @AvoidCodecs("SimpleText") or @AvoidCodecs({"SimpleText","Lucene3x"}) Refactor test classes that use assumeFalse(codec != SimpleText, Memory) to use new annotation and move the expensive methods to separate classes Key: LUCENE-3489 URL: https://issues.apache.org/jira/browse/LUCENE-3489 Project: Lucene - Java Issue Type: Test Components: general/test Affects Versions: 4.0 Reporter: Uwe Schindler Fix For: 4.1 Attachments: LUCENE-3489.patch Followup for LUCENE-3463. TODO: - Move test-methods that need the new @UseNoMemoryExpensiveCodec annotation to separate classes - Eliminate the assumeFalse-calls that check the current codec and disable the test if SimpleText or Memory is used -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
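Concretely, Uwe's suggestion amounts to an annotation shaped like this (a sketch of the proposed rename, not the committed patch):

{code}
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Inherited;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

@Documented
@Inherited
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
public @interface AvoidCodecs {
  String[] value(); // codec names to avoid, e.g. "SimpleText", "Lucene3x"
}
{code}

With the member named value(), usage shortens to @AvoidCodecs("SimpleText") for a single codec, or @AvoidCodecs({"SimpleText", "Lucene3x"}) for several.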
[jira] [Created] (LUCENE-4050) Change SegmentInfos format to plain text
Andrzej Bialecki created LUCENE-4050: - Summary: Change SegmentInfos format to plain text Key: LUCENE-4050 URL: https://issues.apache.org/jira/browse/LUCENE-4050 Project: Lucene - Java Issue Type: Improvement Components: core/codecs Reporter: Andrzej Bialecki Fix For: 4.0 I propose to change the format of the SegmentInfos file (segments_NN) to use plain text instead of the current binary format. The SegmentInfos file represents a commit point, and it also declares what codecs were used for writing each of the segments that the commit point consists of. However, this is a chicken-and-egg situation - in theory the format of this file is customizable via Codec.getSegmentInfosFormat, but in practice we have to first discover which codec implementation wrote this file - so the SegmentCoreReaders assumes a certain fixed binary layout of a preamble of this file that contains the codec name... and then the file is read again, only this time using the right Codec. This is ugly. Instead I propose to use a simple plain-text format, either line-oriented properties or JSON, in such a way that newer versions could easily extend it, and which wouldn't require any special Codec to read and parse. Consequently we could remove SegmentInfosFormat altogether, and instead add SegmentInfoFormat (notice the singular) to Codec to read single per-segment SegmentInfo-s in a codec-specific way. E.g. for the Lucene40 codec we could either add another file or we could extend the .fnm file (FieldInfos) to also contain this information. Then the plain-text SegmentInfos would contain just the following information: * list of global files for this commit point (if any) * list of segments for this commit point, and their corresponding codec class names * user data map -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
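To illustrate the three items listed at the end of the proposal, a plain-text commit point along these lines might look like the following JSON (purely a hand-written illustration; the file name "_global.fnx" and all values are hypothetical, not a format from the issue):

{code}
{
  "globalFiles": ["_global.fnx"],
  "segments": [
    {"name": "_0", "codec": "Lucene40"},
    {"name": "_1", "codec": "SimpleText"}
  ],
  "userData": {"commitTime": "1336761600000"}
}
{code}

Any Codec could then be located by class name before a single per-segment byte is read, which removes the preamble bootstrapping described above.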