[jira] [Commented] (SOLR-5217) CachedSqlEntity fails with stored procedure
[ https://issues.apache.org/jira/browse/SOLR-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760932#comment-13760932 ] Hardik Upadhyay commented on SOLR-5217: --- The actual use cases of nested entities are relevant to their parent entity only: if the parent entity changes, the nested entity values change too. Of course, in a few use cases the nested entity values may remain the same. But CachedSqlEntityProcessor, if I am not wrong, caches on the WHERE conditions of the queries. The same should be the case with stored procedures: as the input parameters change, it should cache a result set for each new set of parameters too. Only then can CachedSqlEntityProcessor serve its purpose. > CachedSqlEntity fails with stored procedure > --- > > Key: SOLR-5217 > URL: https://issues.apache.org/jira/browse/SOLR-5217 > Project: Solr > Issue Type: Bug > Components: contrib - DataImportHandler >Reporter: Hardik Upadhyay > Attachments: db-data-config.xml > > > When using DIH with CachedSqlEntityProcessor and importing data from MS-sql > using stored procedures, it imports data for nested entities only once and > then every call with different arguments for nested entities is only served > from the cache. My db-data-config is attached. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-5217) CachedSqlEntity fails with stored procedure
[ https://issues.apache.org/jira/browse/SOLR-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760932#comment-13760932 ] Hardik Upadhyay edited comment on SOLR-5217 at 9/7/13 6:01 AM: --- The actual use cases of nested entities are relevant to their parent entity only: if the parent entity changes, the nested entity values change too. Of course, in a few use cases the nested entity values may remain the same. But CachedSqlEntityProcessor, if I am not wrong, caches on the WHERE conditions of the queries. The same should be the case with stored procedures: as the input parameters change, it should cache a result set for each new set of parameters too. Only then can CachedSqlEntityProcessor serve its purpose. was (Author: hupadhyay): Actual use cases of nested entities are relevant to there parent entity only. if parent entity changes so do nested entity values too.off course in few use cases nested entity values may also remain same. But CachedSqlEntityProcessor should and if i am not wrong caches where conditions from queries.Same should be the case with stored procedures , as input parameters changes it should cache result set for each new parameters too.Then only CachedSqlEntityProcessor can serve its purpose. > CachedSqlEntity fails with stored procedure > --- > > Key: SOLR-5217 > URL: https://issues.apache.org/jira/browse/SOLR-5217 > Project: Solr > Issue Type: Bug > Components: contrib - DataImportHandler >Reporter: Hardik Upadhyay > Attachments: db-data-config.xml > > > When using DIH with CachedSqlEntityProcessor and importing data from MS-sql > using stored procedures, it imports data for nested entities only once and > then every call with different arguments for nested entities is only served > from the cache. My db-data-config is attached. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
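To make the expected behavior concrete, below is a minimal Java sketch of a parameter-keyed cache. This is not DataImportHandler code (the class and interface names are hypothetical); it only illustrates the contract the comment asks for: one cached result set per distinct parameter list, so a new parameter combination triggers a fresh stored-procedure call while repeated parameters are served from the cache.

{code}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: result sets cached per distinct parameter list. */
public class ParameterKeyedCache {

  /** Stand-in for the actual JDBC / stored-procedure call. */
  public interface RowSource {
    List<Map<String, Object>> fetch(List<Object> params);
  }

  private final Map<List<Object>, List<Map<String, Object>>> cache =
      new HashMap<List<Object>, List<Map<String, Object>>>();

  public List<Map<String, Object>> getRows(List<Object> params, RowSource source) {
    List<Map<String, Object>> rows = cache.get(params);
    if (rows == null) {
      rows = source.fetch(params);                    // first call for these parameters
      cache.put(new ArrayList<Object>(params), rows); // cache under a defensive copy of the key
    }
    return rows;                                      // same parameters -> served from cache
  }
}
{code}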
[JENKINS] Lucene-Solr-SmokeRelease-4.x - Build # 106 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-SmokeRelease-4.x/106/ No tests ran. Build Log: [...truncated 34242 lines...] prepare-release-no-sign: [mkdir] Created dir: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/lucene/build/fakeRelease [copy] Copying 416 files to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/lucene/build/fakeRelease/lucene [copy] Copying 194 files to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/lucene/build/fakeRelease/solr [exec] JAVA6_HOME is /home/hudson/tools/java/latest1.6 [exec] JAVA7_HOME is /home/hudson/tools/java/latest1.7 [exec] NOTE: output encoding is US-ASCII [exec] [exec] Load release URL "file:/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/lucene/build/fakeRelease/"... [exec] [exec] Test Lucene... [exec] test basics... [exec] get KEYS [exec] 0.1 MB in 0.01 sec (10.1 MB/sec) [exec] check changes HTML... [exec] download lucene-4.5.0-src.tgz... [exec] 27.1 MB in 0.04 sec (681.4 MB/sec) [exec] verify md5/sha1 digests [exec] download lucene-4.5.0.tgz... [exec] 49.0 MB in 0.07 sec (690.3 MB/sec) [exec] verify md5/sha1 digests [exec] download lucene-4.5.0.zip... [exec] 58.9 MB in 0.11 sec (516.1 MB/sec) [exec] verify md5/sha1 digests [exec] unpack lucene-4.5.0.tgz... [exec] verify JAR/WAR metadata... [exec] test demo with 1.6... [exec] got 5723 hits for query "lucene" [exec] test demo with 1.7... [exec] got 5723 hits for query "lucene" [exec] check Lucene's javadoc JAR [exec] [exec] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/lucene/build/fakeReleaseTmp/unpack/lucene-4.5.0/docs/core/org/apache/lucene/util/AttributeSource.html [exec] broken details HTML: Method Detail: addAttributeImpl: closing "" does not match opening "" [exec] broken details HTML: Method Detail: getAttribute: closing "" does not match opening "" [exec] Traceback (most recent call last): [exec] File "/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/dev-tools/scripts/smokeTestRelease.py", line 1450, in [exec] main() [exec] File "/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/dev-tools/scripts/smokeTestRelease.py", line 1394, in main [exec] smokeTest(baseURL, svnRevision, version, tmpDir, isSigned, testArgs) [exec] File "/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/dev-tools/scripts/smokeTestRelease.py", line 1431, in smokeTest [exec] unpackAndVerify('lucene', tmpDir, artifact, svnRevision, version, testArgs) [exec] File "/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/dev-tools/scripts/smokeTestRelease.py", line 607, in unpackAndVerify [exec] verifyUnpacked(project, artifact, unpackPath, svnRevision, version, testArgs) [exec] File "/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/dev-tools/scripts/smokeTestRelease.py", line 786, in verifyUnpacked [exec] checkJavadocpath('%s/docs' % unpackPath) [exec] File "/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/dev-tools/scripts/smokeTestRelease.py", line 904, in checkJavadocpath [exec] raise RuntimeError('missing javadocs package summaries!') [exec] RuntimeError: missing javadocs package summaries! 
BUILD FAILED /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/build.xml:321: exec returned: 1 Total time: 19 minutes 30 seconds Build step 'Invoke Ant' marked build as failure Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5198) Strengthen the function of Min should match, making it select BooleanClause as Occur.MUST according to the weight of query
[ https://issues.apache.org/jira/browse/LUCENE-5198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HeXin updated LUCENE-5198: -- Description: In the current version, when we use BooleanQuery for a disjunction, the top scorer selects documents that match at least mm of the sub-scorers. But in some cases we would like sub-scorers whose weight is larger than a threshold to be selected as Occur.MUST automatically. The threshold should be configurable, defaulting to the minimum integer. Any comments are welcome. was: In some case, we want the value of mm to select BooleanClause as Occur.MUST can according to the weight of query. Only if the weight larger than the threshold, it can be selected as Occur.MUST. The threshold can be configurable, equaling the minimum integer by default. Any comments is welcomed. > Strengthen the function of Min should match, making it select BooleanClause > as Occur.MUST according to the weight of query > -- > > Key: LUCENE-5198 > URL: https://issues.apache.org/jira/browse/LUCENE-5198 > Project: Lucene - Core > Issue Type: Improvement > Components: core/search >Affects Versions: 4.4 >Reporter: HeXin >Priority: Trivial > > In the current version, when we use BooleanQuery for a disjunction, the top scorer > selects documents that match > at least mm of the sub-scorers. > But in some cases we would like sub-scorers whose weight is larger than a > threshold to be selected > as Occur.MUST automatically. The threshold should be configurable, defaulting to > the minimum integer. > Any comments are welcome. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
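As a rough sketch of the proposal (not a patch, and it uses the clause's query boost as the "weight", which may not be exactly what the issue intends), the selection could be done while building the query against the Lucene 4.x API:

{code}
import org.apache.lucene.search.BooleanClause.Occur;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Query;

public class WeightedMinShouldMatch {
  /** Disjunction where any clause boosted above the threshold becomes a hard MUST. */
  public static BooleanQuery build(Iterable<Query> subQueries, float threshold, int mm) {
    BooleanQuery bq = new BooleanQuery();
    for (Query q : subQueries) {
      // High-weight clauses are promoted to required; the rest stay
      // optional and are governed by minimum-should-match.
      bq.add(q, q.getBoost() > threshold ? Occur.MUST : Occur.SHOULD);
    }
    bq.setMinimumNumberShouldMatch(mm);
    return bq;
  }
}
{code}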
[jira] [Comment Edited] (SOLR-5215) Deadlock in Solr Cloud ConnectionManager
[ https://issues.apache.org/jira/browse/SOLR-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760815#comment-13760815 ] Feihong Huang edited comment on SOLR-5215 at 9/7/13 12:45 AM: -- Thanks to Ricardo for finding the reason. I have also encountered this issue on our production application servers. was (Author: ainihong001): Thanks to Ricard to finding the reason. I also encounter this issue in our production application servers. > Deadlock in Solr Cloud ConnectionManager > > > Key: SOLR-5215 > URL: https://issues.apache.org/jira/browse/SOLR-5215 > Project: Solr > Issue Type: Bug > Components: clients - java, SolrCloud >Affects Versions: 4.2.1 > Environment: Linux 2.6.18-164.el5 #1 SMP Tue Aug 18 15:51:48 EDT 2009 > x86_64 x86_64 x86_64 GNU/Linux > java version "1.6.0_18" > Java(TM) SE Runtime Environment (build 1.6.0_18-b07) > Java HotSpot(TM) 64-Bit Server VM (build 16.0-b13, mixed mode) >Reporter: Ricardo Merizalde > > We are constantly seeing deadlocks in our production application servers. > The problem seems to be that a thread A: > - tries to process an event and acquires the ConnectionManager lock > - the update callback acquires connectionUpdateLock and invokes > waitForConnected > - waitForConnected tries to acquire the ConnectionManager lock (which it > already holds) > - waitForConnected calls wait and releases the ConnectionManager lock (but > still holds the connectionUpdateLock) > Then a thread B: > - tries to process an event and acquires the ConnectionManager lock > - the update callback tries to acquire connectionUpdateLock but gets blocked > while holding the ConnectionManager lock, preventing thread A from getting out > of the wait state. > > Here is part of the thread dump: > "http-0.0.0.0-8080-82-EventThread" daemon prio=10 tid=0x59965800 > nid=0x3e81 waiting for monitor entry [0x57169000] >java.lang.Thread.State: BLOCKED (on object monitor) > at > org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:71) > - waiting to lock <0x2aab1b0e0ce0> (a > org.apache.solr.common.cloud.ConnectionManager) > at > org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519) > at > org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495) > > "http-0.0.0.0-8080-82-EventThread" daemon prio=10 tid=0x5ad4 > nid=0x3e67 waiting for monitor entry [0x4dbd4000] >java.lang.Thread.State: BLOCKED (on object monitor) > at > org.apache.solr.common.cloud.ConnectionManager$1.update(ConnectionManager.java:98) > - waiting to lock <0x2aab1b0e0f78> (a java.lang.Object) > at > org.apache.solr.common.cloud.DefaultConnectionStrategy.reconnect(DefaultConnectionStrategy.java:46) > at > org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:91) > - locked <0x2aab1b0e0ce0> (a > org.apache.solr.common.cloud.ConnectionManager) > at > org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519) > at > org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495) > > "http-0.0.0.0-8080-82-EventThread" daemon prio=10 tid=0x2aac4c2f7000 > nid=0x3d9a waiting for monitor entry [0x42821000] >java.lang.Thread.State: BLOCKED (on object monitor) > at java.lang.Object.wait(Native Method) > - waiting on <0x2aab1b0e0ce0> (a > org.apache.solr.common.cloud.ConnectionManager) > at > org.apache.solr.common.cloud.ConnectionManager.waitForConnected(ConnectionManager.java:165) > - locked <0x2aab1b0e0ce0> (a > org.apache.solr.common.cloud.ConnectionManager) > at > 
org.apache.solr.common.cloud.ConnectionManager$1.update(ConnectionManager.java:98) > - locked <0x2aab1b0e0f78> (a java.lang.Object) > at > org.apache.solr.common.cloud.DefaultConnectionStrategy.reconnect(DefaultConnectionStrategy.java:46) > at > org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:91) > - locked <0x2aab1b0e0ce0> (a > org.apache.solr.common.cloud.ConnectionManager) > at > org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519) > at > org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495) > > Found one Java-level deadlock: > = > "http-0.0.0.0-8080-82-EventThread": > waiting to lock monitor 0x5c7694b0 (object 0x2aab1b0e0ce0, a > org.apache.solr.common.cloud.ConnectionManager), > which is held by "http-0.0.0.0-8080-82-EventThread" > "http-0.0.0.0-8080-82-EventThread": > waiting to lock moni
[jira] [Commented] (SOLR-5215) Deadlock in Solr Cloud ConnectionManager
[ https://issues.apache.org/jira/browse/SOLR-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760815#comment-13760815 ] Feihong Huang commented on SOLR-5215: - Thanks to Ricardo for finding the reason. I have also encountered this issue on our production application servers. > Deadlock in Solr Cloud ConnectionManager > > > Key: SOLR-5215 > URL: https://issues.apache.org/jira/browse/SOLR-5215 > Project: Solr > Issue Type: Bug > Components: clients - java, SolrCloud >Affects Versions: 4.2.1 > Environment: Linux 2.6.18-164.el5 #1 SMP Tue Aug 18 15:51:48 EDT 2009 > x86_64 x86_64 x86_64 GNU/Linux > java version "1.6.0_18" > Java(TM) SE Runtime Environment (build 1.6.0_18-b07) > Java HotSpot(TM) 64-Bit Server VM (build 16.0-b13, mixed mode) >Reporter: Ricardo Merizalde > > We are constantly seeing deadlocks in our production application servers. > The problem seems to be that a thread A: > - tries to process an event and acquires the ConnectionManager lock > - the update callback acquires connectionUpdateLock and invokes > waitForConnected > - waitForConnected tries to acquire the ConnectionManager lock (which it > already holds) > - waitForConnected calls wait and releases the ConnectionManager lock (but > still holds the connectionUpdateLock) > Then a thread B: > - tries to process an event and acquires the ConnectionManager lock > - the update callback tries to acquire connectionUpdateLock but gets blocked > while holding the ConnectionManager lock, preventing thread A from getting out > of the wait state. > > Here is part of the thread dump: > "http-0.0.0.0-8080-82-EventThread" daemon prio=10 tid=0x59965800 > nid=0x3e81 waiting for monitor entry [0x57169000] >java.lang.Thread.State: BLOCKED (on object monitor) > at > org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:71) > - waiting to lock <0x2aab1b0e0ce0> (a > org.apache.solr.common.cloud.ConnectionManager) > at > org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519) > at > org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495) > > "http-0.0.0.0-8080-82-EventThread" daemon prio=10 tid=0x5ad4 > nid=0x3e67 waiting for monitor entry [0x4dbd4000] >java.lang.Thread.State: BLOCKED (on object monitor) > at > org.apache.solr.common.cloud.ConnectionManager$1.update(ConnectionManager.java:98) > - waiting to lock <0x2aab1b0e0f78> (a java.lang.Object) > at > org.apache.solr.common.cloud.DefaultConnectionStrategy.reconnect(DefaultConnectionStrategy.java:46) > at > org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:91) > - locked <0x2aab1b0e0ce0> (a > org.apache.solr.common.cloud.ConnectionManager) > at > org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519) > at > org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495) > > "http-0.0.0.0-8080-82-EventThread" daemon prio=10 tid=0x2aac4c2f7000 > nid=0x3d9a waiting for monitor entry [0x42821000] >java.lang.Thread.State: BLOCKED (on object monitor) > at java.lang.Object.wait(Native Method) > - waiting on <0x2aab1b0e0ce0> (a > org.apache.solr.common.cloud.ConnectionManager) > at > org.apache.solr.common.cloud.ConnectionManager.waitForConnected(ConnectionManager.java:165) > - locked <0x2aab1b0e0ce0> (a > org.apache.solr.common.cloud.ConnectionManager) > at > org.apache.solr.common.cloud.ConnectionManager$1.update(ConnectionManager.java:98) > - locked <0x2aab1b0e0f78> (a java.lang.Object) > at > 
org.apache.solr.common.cloud.DefaultConnectionStrategy.reconnect(DefaultConnectionStrategy.java:46) > at > org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:91) > - locked <0x2aab1b0e0ce0> (a > org.apache.solr.common.cloud.ConnectionManager) > at > org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519) > at > org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495) > > Found one Java-level deadlock: > = > "http-0.0.0.0-8080-82-EventThread": > waiting to lock monitor 0x5c7694b0 (object 0x2aab1b0e0ce0, a > org.apache.solr.common.cloud.ConnectionManager), > which is held by "http-0.0.0.0-8080-82-EventThread" > "http-0.0.0.0-8080-82-EventThread": > waiting to lock monitor 0x2aac4c314978 (object 0x2aab1b0e0f78, a > java.lang.Object), > which is held by "http-0.0.0.0-8080-82-EventThread" > "http-0.0.0.0-8080-82-EventThread": > waiting to lock monitor 0x5c76
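Stripped of the Solr specifics, the cycle reported above is the classic two-monitor pattern sketched below (the names are illustrative, not the actual ConnectionManager code): thread A parks in wait() while still owning the update lock, and thread B blocks on that update lock while owning the manager monitor, so the notifyAll() that would wake A is never reached.

{code}
public class DeadlockSketch {
  private final Object managerLock = new Object(); // stands in for the ConnectionManager monitor
  private final Object updateLock  = new Object(); // stands in for connectionUpdateLock
  private boolean connected = false;

  // Thread A: takes updateLock, then wait()s on managerLock.
  // wait() releases managerLock but NOT updateLock.
  void threadA() throws InterruptedException {
    synchronized (updateLock) {
      synchronized (managerLock) {
        while (!connected) {
          managerLock.wait(); // parked here, still owning updateLock
        }
      }
    }
  }

  // Thread B: takes managerLock first, then blocks on updateLock,
  // so it never reaches the notifyAll() that would wake thread A.
  void threadB() {
    synchronized (managerLock) {
      synchronized (updateLock) { // blocked: thread A owns updateLock
        connected = true;
        managerLock.notifyAll(); // never executed
      }
    }
  }
}
{code}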
[jira] [Created] (SOLR-5220) Marking server as zombie due to 4xx response is odd
Jessica Cheng created SOLR-5220: --- Summary: Marking server as zombie due to 4xx response is odd Key: SOLR-5220 URL: https://issues.apache.org/jira/browse/SOLR-5220 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 4.4 Reporter: Jessica Cheng In LBHttpSolrServer.request, a request is retried and the server marked as zombie if the return code is 404, 403, 503, or 500, and the comment says "we retry on 404 or 403 or 503 - you can see this on solr shutdown". I think returning a 503 on shutdown is reasonable, but not a 4xx, which is supposed to be a client error. But even if this can't be fixed systematically on the server side, it seems that on the client side we can retry on another server without marking the current server as dead, because when the server returns a 403 (Forbidden) or 404 (Not Found), it is most likely not because it is dead. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
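A sketch of the status handling being proposed (illustrative only, not LBHttpSolrServer's actual code): retry on all four codes, but only treat 5xx as evidence the server itself is in trouble.

{code}
public class RetryPolicySketch {
  /** Retry the request on another server for any of the codes listed in the comment. */
  static boolean shouldRetryElsewhere(int status) {
    return status == 403 || status == 404 || status == 500 || status == 503;
  }

  /** Only mark the server as a zombie for 5xx: a 403/404 usually means the
      server is alive but the request (or the resource) is the problem. */
  static boolean shouldMarkZombie(int status) {
    return status == 500 || status == 503;
  }
}
{code}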
[JENKINS] Lucene-Solr-Tests-trunk-Java7 - Build # 4296 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-Tests-trunk-Java7/4296/ All tests passed Build Log: [...truncated 35271 lines...] BUILD FAILED /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-Java7/build.xml:396: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-Java7/build.xml:335: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-Java7/extra-targets.xml:66: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-Java7/extra-targets.xml:139: The following files are missing svn:eol-style (or binary svn:mime-type): * ./solr/contrib/clustering/src/test-files/clustering/solr/collection1/conf/clustering/carrot2/mock-external-attrs-attributes.xml * ./solr/example/solr/collection1/conf/clustering/carrot2/default-attributes.xml * ./solr/example/solr/collection1/conf/clustering/carrot2/kmeans-attributes.xml * ./solr/example/solr/collection1/conf/clustering/carrot2/stc-attributes.xml Total time: 82 minutes 19 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts Recording test results Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.7.0_25) - Build # 7347 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/7347/ Java: 32bit/jdk1.7.0_25 -client -XX:+UseConcMarkSweepGC All tests passed Build Log: [...truncated 31814 lines...] BUILD FAILED /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:396: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:335: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/extra-targets.xml:66: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/extra-targets.xml:139: The following files are missing svn:eol-style (or binary svn:mime-type): * ./solr/contrib/clustering/src/test-files/clustering/solr/collection1/conf/clustering/carrot2/mock-external-attrs-attributes.xml * ./solr/example/solr/collection1/conf/clustering/carrot2/default-attributes.xml * ./solr/example/solr/collection1/conf/clustering/carrot2/kmeans-attributes.xml * ./solr/example/solr/collection1/conf/clustering/carrot2/stc-attributes.xml Total time: 51 minutes 32 seconds Build step 'Invoke Ant' marked build as failure Description set: Java: 32bit/jdk1.7.0_25 -client -XX:+UseConcMarkSweepGC Archiving artifacts Recording test results Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5202) Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench.
[ https://issues.apache.org/jira/browse/SOLR-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760560#comment-13760560 ] ASF subversion and git services commented on SOLR-5202: --- Commit 1520677 from [~dawidweiss] in branch 'dev/trunk' [ https://svn.apache.org/r1520677 ] SOLR-5202: Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench. Polished clustering configuration examples. > Support easier overrides of Carrot2 clustering attributes via XML data sets > exported from the Workbench. > > > Key: SOLR-5202 > URL: https://issues.apache.org/jira/browse/SOLR-5202 > Project: Solr > Issue Type: New Feature >Reporter: Dawid Weiss >Assignee: Dawid Weiss > Fix For: 4.5, 5.0 > > Attachments: SOLR-5202.patch > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-5218) Unable to extend SolrJettyTestBase within a Parametrized test
[ https://issues.apache.org/jira/browse/SOLR-5218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss resolved SOLR-5218. --- Resolution: Won't Fix Assignee: Dawid Weiss > Unable to extend SolrJettyTestBase within a Parametrized test > - > > Key: SOLR-5218 > URL: https://issues.apache.org/jira/browse/SOLR-5218 > Project: Solr > Issue Type: Bug > Components: Tests >Affects Versions: 4.3.1 >Reporter: Steve Davids >Assignee: Dawid Weiss > Fix For: 4.5, 5.0 > > > I would like to create a unit test that extends SolrJettyTestBase using the > JUnit Parameterized test format. When I try to run the test I get the > following messages: > Method beforeClass() should be public & Method afterClass() should be public > at java.lang.reflect.Constructor.newInstance(Unknown Source)... > Obviously it would be great if we could make those public so I can use the > JUnit Runner. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5218) Unable to extend SolrJettyTestBase within a Parametrized test
[ https://issues.apache.org/jira/browse/SOLR-5218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760577#comment-13760577 ] Dawid Weiss commented on SOLR-5218: --- We use a runner that does not follow all of JUnit's conventions (and there are reasons why it doesn't). JUnit requires all hooks to be public methods, but this leads to accidental overrides and missed super calls. In RandomizedRunner a private hook is always called, regardless of the shadowing/override. If you want to use a parameterized test, use RandomizedRunner's factory instead, as is shown here: https://github.com/carrotsearch/randomizedtesting/blob/master/examples/maven/src/main/java/com/carrotsearch/examples/randomizedrunner/Test007ParameterizedTests.java > Unable to extend SolrJettyTestBase within a Parametrized test > - > > Key: SOLR-5218 > URL: https://issues.apache.org/jira/browse/SOLR-5218 > Project: Solr > Issue Type: Bug > Components: Tests >Affects Versions: 4.3.1 >Reporter: Steve Davids > Fix For: 4.5, 5.0 > > > I would like to create a unit test that extends SolrJettyTestBase using the > JUnit Parameterized test format. When I try to run the test I get the > following messages: > Method beforeClass() should be public & Method afterClass() should be public > at java.lang.reflect.Constructor.newInstance(Unknown Source)... > Obviously it would be great if we could make those public so I can use the > JUnit Runner. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
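Adapted from the linked example, a parameterized test against SolrJettyTestBase would look roughly like this (the parameter values and the test body are made up):

{code}
import java.util.Arrays;

import org.apache.solr.SolrJettyTestBase;

import com.carrotsearch.randomizedtesting.annotations.Name;
import com.carrotsearch.randomizedtesting.annotations.ParametersFactory;

public class ParameterizedSolrTest extends SolrJettyTestBase {
  private final String fieldType;

  // RandomizedRunner creates one test-class instance per row returned by the factory.
  public ParameterizedSolrTest(@Name("fieldType") String fieldType) {
    this.fieldType = fieldType;
  }

  @ParametersFactory
  public static Iterable<Object[]> parameters() {
    return Arrays.asList(new Object[][] {{"string"}, {"text_general"}});
  }

  public void testFieldType() throws Exception {
    // ... exercise fieldType against the embedded Jetty instance ...
  }
}
{code}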
[jira] [Commented] (SOLR-5202) Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench.
[ https://issues.apache.org/jira/browse/SOLR-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760566#comment-13760566 ] ASF subversion and git services commented on SOLR-5202: --- Commit 1520681 from [~dawidweiss] in branch 'dev/trunk' [ https://svn.apache.org/r1520681 ] SOLR-5202: follow-up to CHANGES.txt > Support easier overrides of Carrot2 clustering attributes via XML data sets > exported from the Workbench. > > > Key: SOLR-5202 > URL: https://issues.apache.org/jira/browse/SOLR-5202 > Project: Solr > Issue Type: New Feature >Reporter: Dawid Weiss >Assignee: Dawid Weiss > Fix For: 4.5, 5.0 > > Attachments: SOLR-5202.patch > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5202) Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench.
[ https://issues.apache.org/jira/browse/SOLR-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760563#comment-13760563 ] ASF subversion and git services commented on SOLR-5202: --- Commit 1520678 from [~dawidweiss] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1520678 ] SOLR-5202: Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench. Polished clustering configuration examples. > Support easier overrides of Carrot2 clustering attributes via XML data sets > exported from the Workbench. > > > Key: SOLR-5202 > URL: https://issues.apache.org/jira/browse/SOLR-5202 > Project: Solr > Issue Type: New Feature >Reporter: Dawid Weiss >Assignee: Dawid Weiss > Fix For: 4.5, 5.0 > > Attachments: SOLR-5202.patch > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-5202) Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench.
[ https://issues.apache.org/jira/browse/SOLR-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss resolved SOLR-5202. --- Resolution: Fixed > Support easier overrides of Carrot2 clustering attributes via XML data sets > exported from the Workbench. > > > Key: SOLR-5202 > URL: https://issues.apache.org/jira/browse/SOLR-5202 > Project: Solr > Issue Type: New Feature >Reporter: Dawid Weiss >Assignee: Dawid Weiss > Fix For: 4.5, 5.0 > > Attachments: SOLR-5202.patch > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5219) Refactor selection of the default clustering algorithm
Dawid Weiss created SOLR-5219: - Summary: Refactor selection of the default clustering algorithm Key: SOLR-5219 URL: https://issues.apache.org/jira/browse/SOLR-5219 Project: Solr Issue Type: Improvement Reporter: Dawid Weiss Assignee: Dawid Weiss Priority: Minor Fix For: 4.5, 5.0 This is currently quite messy: the user needs to explicitly name the 'default' algorithm. The logic should be: 1) if there's only one algorithm, it becomes the default; 2) if there's more than one algorithm, the first one becomes the default one; 3) for back-compat, if there is an algorithm called 'default', it does become the default one. The code will simplify a great deal too. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
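The three rules are simple enough to sketch in Java (hypothetical method and names, not the actual patch):

{code}
import java.util.List;

public class DefaultEngineSelection {
  /** Picks the default clustering algorithm per the three rules above. */
  static String pickDefault(List<String> engineNames) {
    if (engineNames.isEmpty()) {
      throw new IllegalArgumentException("no clustering algorithms configured");
    }
    // 3) back-compat: an algorithm explicitly named 'default' wins.
    if (engineNames.contains("default")) {
      return "default";
    }
    // 1) and 2): the only algorithm, or the first of several, becomes the default.
    return engineNames.get(0);
  }
}
{code}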
[jira] [Created] (SOLR-5218) Unable to extend SolrJettyTestBase within a Parametrized test
Steve Davids created SOLR-5218: -- Summary: Unable to extend SolrJettyTestBase within a Parametrized test Key: SOLR-5218 URL: https://issues.apache.org/jira/browse/SOLR-5218 Project: Solr Issue Type: Bug Components: Tests Affects Versions: 4.3.1 Reporter: Steve Davids Fix For: 4.5, 5.0 I would like to create a unit test that extends SolrJettyTestBase using the JUnit Parameterized test format. When I try to run the test I get the following messages: Method beforeClass() should be public & Method afterClass() should be public at java.lang.reflect.Constructor.newInstance(Unknown Source)... Obviously it would be great if we could make those public so I can use the JUnit Runner. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5202) Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench.
[ https://issues.apache.org/jira/browse/SOLR-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760569#comment-13760569 ] ASF subversion and git services commented on SOLR-5202: --- Commit 1520683 from [~dawidweiss] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1520683 ] SOLR-5202: follow-up to CHANGES.txt > Support easier overrides of Carrot2 clustering attributes via XML data sets > exported from the Workbench. > > > Key: SOLR-5202 > URL: https://issues.apache.org/jira/browse/SOLR-5202 > Project: Solr > Issue Type: New Feature >Reporter: Dawid Weiss >Assignee: Dawid Weiss > Fix For: 4.5, 5.0 > > Attachments: SOLR-5202.patch > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2548) Multithreaded faceting
[ https://issues.apache.org/jira/browse/SOLR-2548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760528#comment-13760528 ] ASF subversion and git services commented on SOLR-2548: --- Commit 1520670 from [~erickoerickson] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1520670 ] SOLR-2548, Multithread faceting > Multithreaded faceting > -- > > Key: SOLR-2548 > URL: https://issues.apache.org/jira/browse/SOLR-2548 > Project: Solr > Issue Type: Improvement > Components: search >Affects Versions: 3.1 >Reporter: Janne Majaranta >Assignee: Erick Erickson >Priority: Minor > Labels: facet > Attachments: SOLR-2548_4.2.1.patch, SOLR-2548_for_31x.patch, > SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, > SOLR-2548.patch, SOLR-2548.patch > > > Add multithreading support for faceting. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-5216) Document updates to SolrCloud can cause a distributed deadlock.
[ https://issues.apache.org/jira/browse/SOLR-5216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760516#comment-13760516 ] Tim Vaillancourt edited comment on SOLR-5216 at 9/6/13 7:01 PM: Hey guys, We tested this patch and unfortunately encountered some serious issues after a few hours of 500 update-batches/sec. Our update batch is 10 docs, so we are writing about 5000 docs/sec total, using autoCommit to commit the updates (no explicit commits). Our environment: * Solr 4.3.1 w/SOLR-5216 patch. * Jetty 9, Java 1.7. * 3 solr instances, 1 per physical server. * 1 collection. * 3 shards. * 2 replicas (each instance is a leader and a replica). * Soft autoCommit is 1000ms. * Hard autoCommit is 15000ms. After about 6 hours of stress-testing this patch, we see many of these stalled transactions (below), and the Solr instances start to see each other as down, flooding our Solr logs with "Connection Refused" exceptions, and otherwise no obviously-useful logs that I could see. I did notice some stalled transactions on both /select and /update, however. This never occurred without this patch. Stack /select seems stalled on: http://pastebin.com/Y1NCrXGC Stack /update seems stalled on: http://pastebin.com/cFLbC8Y9 Lastly, I have a summary of the ERROR-severity logs from this 24-hour soak. My script "normalizes" the ERROR-severity stack traces and returns them in order of occurrence. Summary of my solr.log: http://pastebin.com/pBdMAWeb Thanks! Tim Vaillancourt was (Author: tvaillancourt): Hey guys, We tested this patch and unfortunately encountered some serious issues a few hours of 500 update-batches/sec. Our update batch is 10 docs, so we are writing about 5000 docs/sec total, using autoCommit to commit the updates (no explicit commits). Our environment: * Solr 4.3.1 w/SOLR-5216 patch. * Jetty 9, Java 1.7. * 3 solr instances, 1 per physical server. * 1 collection. * 3 shards. * 2 replicas (each instance is a leader and a replica). * Soft autoCommit is 1000ms. * Hard autoCommit is 15000ms. After a few hours of this testing, we see many of these stalled transactions, and the solr instances start to see each other as down, flooding our solr logs with Connection Refused exceptions, and otherwise no useful logs (that I could see). Stack /select seems stalled on: http://pastebin.com/Y1NCrXGC Stack /update seems stalled on: http://pastebin.com/cFLbC8Y9 Lastly, I have a summary of the ERROR-severity logs from this 24-hour soak. My script "normalizes" the ERROR-severity stack traces and returns them in order of ocurrance. Summary of my solr.log: http://pastebin.com/pBdMAWeb Thanks! Tim Vaillancourt > Document updates to SolrCloud can cause a distributed deadlock. > --- > > Key: SOLR-5216 > URL: https://issues.apache.org/jira/browse/SOLR-5216 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Reporter: Mark Miller >Assignee: Mark Miller >Priority: Critical > Fix For: 4.5, 5.0 > > Attachments: SOLR-5216.patch > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5216) Document updates to SolrCloud can cause a distributed deadlock.
[ https://issues.apache.org/jira/browse/SOLR-5216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760516#comment-13760516 ] Tim Vaillancourt commented on SOLR-5216: Hey guys, We tested this patch and unfortunately encountered some serious issues after a few hours of 500 update-batches/sec. Our update batch is 10 docs, so we are writing about 5000 docs/sec total, using autoCommit to commit the updates (no explicit commits). Our environment: * Solr 4.3.1 w/SOLR-5216 patch. * Jetty 9, Java 1.7. * 3 solr instances, 1 per physical server. * 1 collection. * 3 shards. * 2 replicas (each instance is a leader and a replica). * Soft autoCommit is 1000ms. * Hard autoCommit is 15000ms. After a few hours of this testing, we see many of these stalled transactions, and the solr instances start to see each other as down, flooding our solr logs with Connection Refused exceptions, and otherwise no useful logs (that I could see). Stack /select seems stalled on: http://pastebin.com/Y1NCrXGC Stack /update seems stalled on: http://pastebin.com/cFLbC8Y9 Lastly, I have a summary of the ERROR-severity logs from this 24-hour soak. My script "normalizes" the ERROR-severity stack traces and returns them in order of occurrence. Summary of my solr.log: http://pastebin.com/pBdMAWeb Thanks! Tim Vaillancourt > Document updates to SolrCloud can cause a distributed deadlock. > --- > > Key: SOLR-5216 > URL: https://issues.apache.org/jira/browse/SOLR-5216 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Reporter: Mark Miller >Assignee: Mark Miller >Priority: Critical > Fix For: 4.5, 5.0 > > Attachments: SOLR-5216.patch > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
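For reference, the workload described above (batches of 10 documents, relying on autoCommit rather than explicit commits) corresponds to plain SolrJ usage along these lines; the URL and field names are placeholders:

{code}
import java.util.ArrayList;
import java.util.List;

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class BatchIndexer {
  public static void main(String[] args) throws Exception {
    HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
    List<SolrInputDocument> batch = new ArrayList<SolrInputDocument>();
    for (int i = 0; i < 10; i++) {
      SolrInputDocument doc = new SolrInputDocument();
      doc.addField("id", "doc-" + i);
      doc.addField("text", "payload " + i);
      batch.add(doc);
    }
    server.add(batch); // no explicit commit: soft/hard autoCommit makes the docs visible
    server.shutdown();
  }
}
{code}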
[jira] [Resolved] (SOLR-2548) Multithreaded faceting
[ https://issues.apache.org/jira/browse/SOLR-2548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson resolved SOLR-2548. -- Resolution: Fixed Fix Version/s: 5.0 4.5 Thanks Janne and Gun! > Multithreaded faceting > -- > > Key: SOLR-2548 > URL: https://issues.apache.org/jira/browse/SOLR-2548 > Project: Solr > Issue Type: Improvement > Components: search >Affects Versions: 3.1 >Reporter: Janne Majaranta >Assignee: Erick Erickson >Priority: Minor > Labels: facet > Fix For: 4.5, 5.0 > > Attachments: SOLR-2548_4.2.1.patch, SOLR-2548_for_31x.patch, > SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, > SOLR-2548.patch, SOLR-2548.patch > > > Add multithreading support for faceting. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-NightlyTests-4.x - Build # 368 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-4.x/368/ All tests passed Build Log: [...truncated 3511 lines...] [javac] Compiling 20 source files to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-4.x/lucene/build/misc/classes/java [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-4.x/lucene/misc/src/java/org/apache/lucene/misc/HighFreqTerms.java:140: cannot find symbol [javac] symbol : method compare(int,int) [javac] location: class java.lang.Long [javac] int res = Long.compare(a.docFreq, b.docFreq); [javac] ^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-4.x/lucene/misc/src/java/org/apache/lucene/misc/HighFreqTerms.java:158: cannot find symbol [javac] symbol : method compare(long,long) [javac] location: class java.lang.Long [javac] int res = Long.compare(a.totalTermFreq, b.totalTermFreq); [javac] ^ [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] 2 errors BUILD FAILED /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-4.x/build.xml:409: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-4.x/build.xml:382: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-4.x/build.xml:39: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-4.x/lucene/build.xml:551: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-4.x/lucene/common-build.xml:1887: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-4.x/lucene/module-build.xml:58: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-4.x/lucene/common-build.xml:477: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-4.x/lucene/common-build.xml:1625: Compile failed; see the compiler error output for details. Total time: 39 minutes 17 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts Recording test results Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2548) Multithreaded faceting
[ https://issues.apache.org/jira/browse/SOLR-2548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760427#comment-13760427 ] ASF subversion and git services commented on SOLR-2548: --- Commit 1520645 from [~erickoerickson] in branch 'dev/trunk' [ https://svn.apache.org/r1520645 ] SOLR-2548, Multithread faceting > Multithreaded faceting > -- > > Key: SOLR-2548 > URL: https://issues.apache.org/jira/browse/SOLR-2548 > Project: Solr > Issue Type: Improvement > Components: search >Affects Versions: 3.1 >Reporter: Janne Majaranta >Assignee: Erick Erickson >Priority: Minor > Labels: facet > Attachments: SOLR-2548_4.2.1.patch, SOLR-2548_for_31x.patch, > SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, > SOLR-2548.patch, SOLR-2548.patch > > > Add multithreading support for faceting. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5197) Add a method to SegmentReader to get the current index heap memory size
[ https://issues.apache.org/jira/browse/LUCENE-5197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-5197: Attachment: LUCENE-5197.patch Some minor cleanups / improvements: Fixed calculations for all-in-ram DV impls: for the esoteric/deprecated ones, it just uses RUE rather than making the code complicated. Facet42 is easy though and accounts correctly now. Added missing null check for VariableGapReader's FST (it can happen when there are no terms). > Add a method to SegmentReader to get the current index heap memory size > --- > > Key: LUCENE-5197 > URL: https://issues.apache.org/jira/browse/LUCENE-5197 > Project: Lucene - Core > Issue Type: Improvement > Components: core/codecs, core/index >Reporter: Areek Zillur > Attachments: LUCENE-5197.patch, LUCENE-5197.patch, LUCENE-5197.patch, > LUCENE-5197.patch, LUCENE-5197.patch > > > It would be useful to at least estimate the index heap size being used by > Lucene. Ideally a method exposing this information at the SegmentReader level. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
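"RUE" above is RamUsageEstimator. As a minimal example of the reflective fallback it refers to (the measured object here is an arbitrary stand-in, not an actual doc-values implementation):

{code}
import org.apache.lucene.util.RamUsageEstimator;

public class HeapSizeDemo {
  public static void main(String[] args) {
    Object measured = new long[1024]; // stand-in for a doc-values structure
    // Reflective fallback: walks the object graph and sums estimated sizes.
    // Simple and accurate, but slower than hand-computed accounting.
    long bytes = RamUsageEstimator.sizeOf(measured);
    System.out.println(RamUsageEstimator.humanReadableUnits(bytes));
  }
}
{code}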
[jira] [Commented] (LUCENE-5200) HighFreqTerms has confusing behavior with -t option
[ https://issues.apache.org/jira/browse/LUCENE-5200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760294#comment-13760294 ] ASF subversion and git services commented on LUCENE-5200: - Commit 1520615 from [~rcmuir] in branch 'dev/trunk' [ https://svn.apache.org/r1520615 ] LUCENE-5200: HighFreqTerms has confusing behavior with -t option > HighFreqTerms has confusing behavior with -t option > --- > > Key: LUCENE-5200 > URL: https://issues.apache.org/jira/browse/LUCENE-5200 > Project: Lucene - Core > Issue Type: Bug > Components: modules/other >Reporter: Robert Muir > Attachments: LUCENE-5200.patch > > > {code} > * HighFreqTerms class extracts the top n most frequent terms > * (by document frequency) from an existing Lucene index and reports their > * document frequency. > * > * If the -t flag is given, both document frequency and total tf (total > * number of occurrences) are reported, ordered by descending total tf. > {code} > Problem #1: > Its tricky what happens with -t: if you ask for the top-100 terms, it > requests the top-100 terms (by docFreq), then resorts the top-N by > totalTermFreq. > So its not really the top 100 most frequently occurring terms. > Problem #2: > Using the -t option can be confusing and slow: the reported docFreq includes > deletions, but totalTermFreq does not (it actually walks postings lists if > there is even one deletion). > I think this is a relic from 3.x days when lucene did not support this > statistic. I think we should just always output both TermsEnum.docFreq() and > TermsEnum.totalTermFreq(), and -t just determines the comparator of the PQ. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
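The suggested behavior (always report both statistics, with -t only choosing the comparator) amounts to collecting both values per term; a sketch against the Lucene 4.x API:

{code}
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.MultiFields;
import org.apache.lucene.index.Terms;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.util.BytesRef;

public class TermStatsDump {
  /** Prints docFreq and totalTermFreq for every term of one field. */
  public static void dump(IndexReader reader, String field) throws Exception {
    Terms terms = MultiFields.getTerms(reader, field);
    if (terms == null) return; // field absent from this index
    TermsEnum te = terms.iterator(null);
    BytesRef term;
    while ((term = te.next()) != null) {
      // Note: docFreq includes deleted documents; totalTermFreq is -1
      // when the codec does not record it.
      System.out.println(term.utf8ToString()
          + " docFreq=" + te.docFreq()
          + " totalTermFreq=" + te.totalTermFreq());
    }
  }
}
{code}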
[JENKINS] Solr-Artifacts-4.x - Build # 402 - Failure
Build: https://builds.apache.org/job/Solr-Artifacts-4.x/402/ No tests ran. Build Log: [...truncated 8808 lines...] [javac] Compiling 20 source files to /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/build/misc/classes/java [javac] /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/misc/src/java/org/apache/lucene/misc/HighFreqTerms.java:140: cannot find symbol [javac] symbol : method compare(int,int) [javac] location: class java.lang.Long [javac] int res = Long.compare(a.docFreq, b.docFreq); [javac] ^ [javac] /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/misc/src/java/org/apache/lucene/misc/HighFreqTerms.java:158: cannot find symbol [javac] symbol : method compare(long,long) [javac] location: class java.lang.Long [javac] int res = Long.compare(a.totalTermFreq, b.totalTermFreq); [javac] ^ [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] 2 errors BUILD FAILED /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/solr/common-build.xml:374: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/module-build.xml:573: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/module-build.xml:507: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/common-build.xml:477: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/common-build.xml:1625: Compile failed; see the compiler error output for details. Total time: 1 minute 14 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts Publishing Javadoc Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS-MAVEN] Lucene-Solr-Maven-4.x #439: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-4.x/439/ No tests ran. Build Log: [...truncated 3328 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [JENKINS] Solr-Artifacts-4.x - Build # 402 - Failure
Java 6 doesn't have this: I committed a fix. On Fri, Sep 6, 2013 at 10:12 AM, Apache Jenkins Server wrote: > Build: https://builds.apache.org/job/Solr-Artifacts-4.x/402/ > > No tests ran. > > Build Log: > [...truncated 8808 lines...] > [javac] Compiling 20 source files to > /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/build/misc/classes/java > [javac] > /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/misc/src/java/org/apache/lucene/misc/HighFreqTerms.java:140: > cannot find symbol > [javac] symbol : method compare(int,int) > [javac] location: class java.lang.Long > [javac] int res = Long.compare(a.docFreq, b.docFreq); > [javac] ^ > [javac] > /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/misc/src/java/org/apache/lucene/misc/HighFreqTerms.java:158: > cannot find symbol > [javac] symbol : method compare(long,long) > [javac] location: class java.lang.Long > [javac] int res = Long.compare(a.totalTermFreq, b.totalTermFreq); > [javac] ^ > [javac] Note: Some input files use or override a deprecated API. > [javac] Note: Recompile with -Xlint:deprecation for details. > [javac] 2 errors > > BUILD FAILED > /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/solr/common-build.xml:374: > The following error occurred while executing this line: > /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/module-build.xml:573: > The following error occurred while executing this line: > /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/module-build.xml:507: > The following error occurred while executing this line: > /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/common-build.xml:477: > The following error occurred while executing this line: > /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/common-build.xml:1625: > Compile failed; see the compiler error output for details. > > Total time: 1 minute 14 seconds > Build step 'Invoke Ant' marked build as failure > Archiving artifacts > Publishing Javadoc > Email was triggered for: Failure > Sending email for trigger: Failure > > > > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
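Long.compare(long, long) was only added in Java 7, which is why the 4.x (Java 6) build breaks. The committed fix is not shown here, but it presumably spells the comparison out, along these lines:

{code}
public class Java6Compat {
  /** Java 6 replacement for Java 7's Long.compare(long, long). */
  static int compareLongs(long a, long b) {
    return a < b ? -1 : (a > b ? 1 : 0);
  }
  // e.g. in the HighFreqTerms comparators:
  //   int res = compareLongs(a.docFreq, b.docFreq);
  //   int res = compareLongs(a.totalTermFreq, b.totalTermFreq);
}
{code}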
[jira] [Commented] (LUCENE-5200) HighFreqTerms has confusing behavior with -t option
[ https://issues.apache.org/jira/browse/LUCENE-5200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760296#comment-13760296 ] ASF subversion and git services commented on LUCENE-5200: - Commit 1520616 from [~rcmuir] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1520616 ] LUCENE-5200: HighFreqTerms has confusing behavior with -t option > HighFreqTerms has confusing behavior with -t option > --- > > Key: LUCENE-5200 > URL: https://issues.apache.org/jira/browse/LUCENE-5200 > Project: Lucene - Core > Issue Type: Bug > Components: modules/other >Reporter: Robert Muir > Attachments: LUCENE-5200.patch > > > {code} > * HighFreqTerms class extracts the top n most frequent terms > * (by document frequency) from an existing Lucene index and reports their > * document frequency. > * > * If the -t flag is given, both document frequency and total tf (total > * number of occurrences) are reported, ordered by descending total tf. > {code} > Problem #1: > Its tricky what happens with -t: if you ask for the top-100 terms, it > requests the top-100 terms (by docFreq), then resorts the top-N by > totalTermFreq. > So its not really the top 100 most frequently occurring terms. > Problem #2: > Using the -t option can be confusing and slow: the reported docFreq includes > deletions, but totalTermFreq does not (it actually walks postings lists if > there is even one deletion). > I think this is a relic from 3.x days when lucene did not support this > statistic. I think we should just always output both TermsEnum.docFreq() and > TermsEnum.totalTermFreq(), and -t just determines the comparator of the PQ. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-5200) HighFreqTerms has confusing behavior with -t option
[ https://issues.apache.org/jira/browse/LUCENE-5200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-5200. - Resolution: Fixed Fix Version/s: 4.5 5.0 > HighFreqTerms has confusing behavior with -t option > --- > > Key: LUCENE-5200 > URL: https://issues.apache.org/jira/browse/LUCENE-5200 > Project: Lucene - Core > Issue Type: Bug > Components: modules/other >Reporter: Robert Muir > Fix For: 5.0, 4.5 > > Attachments: LUCENE-5200.patch > > > {code} > * HighFreqTerms class extracts the top n most frequent terms > * (by document frequency) from an existing Lucene index and reports their > * document frequency. > * > * If the -t flag is given, both document frequency and total tf (total > * number of occurrences) are reported, ordered by descending total tf. > {code} > Problem #1: > Its tricky what happens with -t: if you ask for the top-100 terms, it > requests the top-100 terms (by docFreq), then resorts the top-N by > totalTermFreq. > So its not really the top 100 most frequently occurring terms. > Problem #2: > Using the -t option can be confusing and slow: the reported docFreq includes > deletions, but totalTermFreq does not (it actually walks postings lists if > there is even one deletion). > I think this is a relic from 3.x days when lucene did not support this > statistic. I think we should just always output both TermsEnum.docFreq() and > TermsEnum.totalTermFreq(), and -t just determines the comparator of the PQ. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3069) Lucene should have an entirely memory resident term dictionary
[ https://issues.apache.org/jira/browse/LUCENE-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760304#comment-13760304 ] ASF subversion and git services commented on LUCENE-3069: - Commit 1520618 from [~billy] in branch 'dev/branches/lucene3069' [ https://svn.apache.org/r1520618 ] LUCENE-3069: reuse customized TermState in PBF > Lucene should have an entirely memory resident term dictionary > -- > > Key: LUCENE-3069 > URL: https://issues.apache.org/jira/browse/LUCENE-3069 > Project: Lucene - Core > Issue Type: Improvement > Components: core/index, core/search >Affects Versions: 4.0-ALPHA >Reporter: Simon Willnauer >Assignee: Han Jiang > Labels: gsoc2013 > Fix For: 5.0, 4.5 > > Attachments: df-ttf-estimate.txt, example.png, LUCENE-3069.patch, > LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, > LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, > LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, > LUCENE-3069.patch > > > FST based TermDictionary has been a great improvement yet it still uses a > delta codec file for scanning to terms. Some environments have enough memory > available to keep the entire FST based term dict in memory. We should add a > TermDictionary implementation that encodes all needed information for each > term into the FST (custom fst.Output) and builds a FST from the entire term > not just the delta. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5217) CachedSqlEntity fails with stored procedure
[ https://issues.apache.org/jira/browse/SOLR-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760303#comment-13760303 ] Shalin Shekhar Mangar commented on SOLR-5217: - I don't think this is a bug. CachedSqlEntityProcessor will execute the query only once and that is its USP. If you don't want the caching, then just use SqlEntityProcessor. > CachedSqlEntity fails with stored procedure > --- > > Key: SOLR-5217 > URL: https://issues.apache.org/jira/browse/SOLR-5217 > Project: Solr > Issue Type: Bug > Components: contrib - DataImportHandler >Reporter: Hardik Upadhyay > Attachments: db-data-config.xml > > > When using DIH with CachedSqlEntityProcessor and importing data from MS-sql > using stored procedures, it imports data for nested entities only once, and > then every call with different arguments for nested entities is served only > from cache. My db-data-config is attached. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
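To make the distinction in the comment above concrete, here is a toy model (plain Java, not actual DataImportHandler code) of why CachedSqlEntityProcessor runs the child query once while SqlEntityProcessor runs one query per parent row; runQueryAndGroupByKey() and the map layout are invented for illustration.

{code}
// Toy model, not DataImportHandler source: the child entity's rows as seen by
// the two processors. runQueryAndGroupByKey() is a hypothetical helper.
Map<Object, List<Map<String, Object>>> cache = null; // keyed by the cacheKey column

List<Map<String, Object>> childRows(Object parentKey) throws SQLException {
  if (cache == null) {
    // CachedSqlEntityProcessor path: the query runs exactly once,
    // and its rows are grouped by the cache key ...
    cache = runQueryAndGroupByKey();
  }
  // ... so every later parent row is answered from memory. A stored procedure
  // whose output depends on its input arguments breaks this assumption;
  // SqlEntityProcessor would instead execute the procedure here per parent row.
  return cache.get(parentKey);
}
{code}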
[jira] [Commented] (LUCENE-3069) Lucene should have an entirely memory resident term dictionary
[ https://issues.apache.org/jira/browse/LUCENE-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760328#comment-13760328 ] Michael McCandless commented on LUCENE-3069: Thanks Han. I think we can just leave the .smy as is for now, and keep passing "boolean absolute" down. We can later improve these ... I think we should first land this on trunk and let jenkins chew on it for a while ... and if all seems good, then back port. > Lucene should have an entirely memory resident term dictionary > -- > > Key: LUCENE-3069 > URL: https://issues.apache.org/jira/browse/LUCENE-3069 > Project: Lucene - Core > Issue Type: Improvement > Components: core/index, core/search >Affects Versions: 4.0-ALPHA >Reporter: Simon Willnauer >Assignee: Han Jiang > Labels: gsoc2013 > Fix For: 5.0, 4.5 > > Attachments: df-ttf-estimate.txt, example.png, LUCENE-3069.patch, > LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, > LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, > LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, > LUCENE-3069.patch > > > FST based TermDictionary has been a great improvement yet it still uses a > delta codec file for scanning to terms. Some environments have enough memory > available to keep the entire FST based term dict in memory. We should add a > TermDictionary implementation that encodes all needed information for each > term into the FST (custom fst.Output) and builds a FST from the entire term > not just the delta. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3069) Lucene should have an entirely memory resident term dictionary
[ https://issues.apache.org/jira/browse/LUCENE-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760325#comment-13760325 ] Han Jiang commented on LUCENE-3069: --- I think this is ready to commit to trunk now, and I'll wait for a day or two before committing it. :) > Lucene should have an entirely memory resident term dictionary > -- > > Key: LUCENE-3069 > URL: https://issues.apache.org/jira/browse/LUCENE-3069 > Project: Lucene - Core > Issue Type: Improvement > Components: core/index, core/search >Affects Versions: 4.0-ALPHA >Reporter: Simon Willnauer >Assignee: Han Jiang > Labels: gsoc2013 > Fix For: 5.0, 4.5 > > Attachments: df-ttf-estimate.txt, example.png, LUCENE-3069.patch, > LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, > LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, > LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, > LUCENE-3069.patch > > > FST based TermDictionary has been a great improvement yet it still uses a > delta codec file for scanning to terms. Some environments have enough memory > available to keep the entire FST based term dict in memory. We should add a > TermDictionary implementation that encodes all needed information for each > term into the FST (custom fst.Output) and builds a FST from the entire term > not just the delta. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3069) Lucene should have an entirely memory resident term dictionary
[ https://issues.apache.org/jira/browse/LUCENE-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760259#comment-13760259 ] ASF subversion and git services commented on LUCENE-3069: - Commit 1520592 from [~billy] in branch 'dev/branches/lucene3069' [ https://svn.apache.org/r1520592 ] LUCENE-3069: remove impersonate codes, fix typo > Lucene should have an entirely memory resident term dictionary > -- > > Key: LUCENE-3069 > URL: https://issues.apache.org/jira/browse/LUCENE-3069 > Project: Lucene - Core > Issue Type: Improvement > Components: core/index, core/search >Affects Versions: 4.0-ALPHA >Reporter: Simon Willnauer >Assignee: Han Jiang > Labels: gsoc2013 > Fix For: 5.0, 4.5 > > Attachments: df-ttf-estimate.txt, example.png, LUCENE-3069.patch, > LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, > LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, > LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, > LUCENE-3069.patch > > > FST based TermDictionary has been a great improvement yet it still uses a > delta codec file for scanning to terms. Some environments have enough memory > available to keep the entire FST based term dict in memory. We should add a > TermDictionary implementation that encodes all needed information for each > term into the FST (custom fst.Output) and builds a FST from the entire term > not just the delta. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5216) Document updates to SolrCloud can cause a distributed deadlock.
[ https://issues.apache.org/jira/browse/SOLR-5216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-5216: -- Priority: Critical (was: Major) > Document updates to SolrCloud can cause a distributed deadlock. > --- > > Key: SOLR-5216 > URL: https://issues.apache.org/jira/browse/SOLR-5216 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Reporter: Mark Miller >Assignee: Mark Miller >Priority: Critical > Fix For: 4.5, 5.0 > > Attachments: SOLR-5216.patch > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5024) java client(solrj 4.1.0) can not get the ngroup number.
[ https://issues.apache.org/jira/browse/SOLR-5024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760239#comment-13760239 ] Sandro Mario Zbinden commented on SOLR-5024: This error also exists in Solr 4.2. > java client(solrj 4.1.0) can not get the ngroup number. > --- > > Key: SOLR-5024 > URL: https://issues.apache.org/jira/browse/SOLR-5024 > Project: Solr > Issue Type: Bug > Components: clients - java >Affects Versions: 4.1 >Reporter: sun >Priority: Minor > Labels: none > Original Estimate: 10m > Remaining Estimate: 10m > > When adding these > parameters (group=true&group.field=topicid&group.ngroups=true&group.format=simple) > to SolrJ, I cannot get the group number. > It's easy to fix: at line 221 of QueryResponse.java, an if-else should be > added, just like the ones on lines 203 to 208. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
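A hedged sketch of the kind of guard the reporter asks for; the "ngroups" key matches the grouped-response layout, but groupInfo and the surrounding code are assumptions, not the actual QueryResponse.java source.

{code}
// Hypothetical sketch of the suggested if-else, mirroring the null checks a few
// lines above in the same method; groupInfo stands in for the NamedList holding
// one group command's data.
Object ngroupsVal = groupInfo.get("ngroups"); // only present with group.ngroups=true
if (ngroupsVal != null) {
  ngroups = ((Number) ngroupsVal).intValue();
}
{code}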
[jira] [Commented] (SOLR-4817) Solr should not fall back to the back compat built in solr.xml in SolrCloud mode.
[ https://issues.apache.org/jira/browse/SOLR-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760221#comment-13760221 ] Mark Miller commented on SOLR-4817: --- I think it's all a bit of a mess right now (the test configs situation) - we should clean this up more. I intend to take a crack at it at some point. It's still too haphazard what is done in what tests and too difficult to understand and follow when writing new tests or debugging old ones. > Solr should not fall back to the back compat built in solr.xml in SolrCloud > mode. > - > > Key: SOLR-4817 > URL: https://issues.apache.org/jira/browse/SOLR-4817 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Reporter: Mark Miller >Assignee: Erick Erickson >Priority: Minor > Fix For: 4.5, 5.0 > > Attachments: SOLR-4817.patch, SOLR-4817.patch, SOLR-4817.patch, > SOLR-4817.patch, SOLR-4817.patch > > > A hard error is much more useful, and this built in solr.xml is not very good > for solrcloud - with the old style solr.xml with cores in it, you won't have > persistence and with the new style, it's not really ideal either. > I think it makes it easier to debug solr.home to fail on this instead - but > just in solrcloud mode for now due to back compat. We might want to pull the > whole internal solr.xml for 5.0. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4817) Solr should not fall back to the back compat built in solr.xml in SolrCloud mode.
[ https://issues.apache.org/jira/browse/SOLR-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760226#comment-13760226 ] Erick Erickson commented on SOLR-4817: -- bq: I think it's all a bit of a mess right now Yeah, it certainly is but I haven't had the energy to try to straighten it out either. Maybe we can share some of the work > Solr should not fall back to the back compat built in solr.xml in SolrCloud > mode. > - > > Key: SOLR-4817 > URL: https://issues.apache.org/jira/browse/SOLR-4817 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Reporter: Mark Miller >Assignee: Erick Erickson >Priority: Minor > Fix For: 4.5, 5.0 > > Attachments: SOLR-4817.patch, SOLR-4817.patch, SOLR-4817.patch, > SOLR-4817.patch, SOLR-4817.patch > > > A hard error is much more useful, and this built in solr.xml is not very good > for solrcloud - with the old style solr.xml with cores in it, you won't have > persistence and with the new style, it's not really ideal either. > I think it makes it easier to debug solr.home to fail on this instead - but > just in solrcloud mode for now due to back compat. We might want to pull the > whole internal solr.xml for 5.0. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-2548) Multithreaded faceting
[ https://issues.apache.org/jira/browse/SOLR-2548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson updated SOLR-2548: - Attachment: SOLR-2548.patch Final patch, including CHANGES.txt entry. > Multithreaded faceting > -- > > Key: SOLR-2548 > URL: https://issues.apache.org/jira/browse/SOLR-2548 > Project: Solr > Issue Type: Improvement > Components: search >Affects Versions: 3.1 >Reporter: Janne Majaranta >Assignee: Erick Erickson >Priority: Minor > Labels: facet > Attachments: SOLR-2548_4.2.1.patch, SOLR-2548_for_31x.patch, > SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, > SOLR-2548.patch, SOLR-2548.patch > > > Add multithreading support for faceting. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
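For reference, the feature is driven by a request parameter; a minimal SolrJ sketch, assuming the facet.threads parameter introduced by this patch (the URL and field names are invented):

{code}
// Hedged SolrJ sketch: field faceting computed with multiple threads.
SolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
SolrQuery q = new SolrQuery("*:*");
q.setFacet(true);
q.addFacetField("cat", "manu");
q.set("facet.threads", 4); // compute the per-field counts in up to 4 threads
QueryResponse rsp = server.query(q);
{code}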
[jira] [Commented] (LUCENE-3069) Lucene should have an entirely memory resident term dictionary
[ https://issues.apache.org/jira/browse/LUCENE-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760160#comment-13760160 ] Han Jiang commented on LUCENE-3069: --- Mike, thanks for the review! bq. In general, couldn't the writer re-use the reader's TermState? I'm afraid this somewhat makes the code longer? I'll make a patch and see. {quote} Have you run "first do no harm" perf tests? Ie, compare current trunk w/ default Codec to branch w/ default Codec? Just to make sure there are no surprises... {quote} Yes, no surprises yet. bq. Why does Lucene41PostingsWriter have "impersonation" code? Yeah, these should be removed. {quote} I forget: why does the postings reader/writer need to handle delta coding again (take an absolute boolean argument)? Was it because of pulsing or sep? It's fine for now (progress not perfection) ... but not clean, since "delta coding" is really an encoding detail so in theory the terms dict should "own" that ... {quote} Ah, yes, because of pulsing. This is because PulsingPostingsBase is more than a PostingsBaseFormat. It somewhat acts like a term dict, e.g. it needs to understand how terms are structured in one block (term No.1 uses an absolute value, term No.x uses a delta value) and then judge how to restructure the inlined and wrapped blocks (No.1 still uses an absolute value, but the first non-pulsed term will need absolute encoding as well). Without the 'absolute' argument, the real term dictionary would do the delta encoding itself, PulsingPostingsBase would be confused, and all wrapped PostingsBase implementations would have to encode metadata values without delta encoding. {quote} The new .smy file for Pulsing is sort of strange ... but necessary since it always uses 0 longs, so we have to store this somewhere ... you could put it into FieldInfo attributes instead? {quote} Yeah, it is another hairy thing... the reason is, we don't have a 'PostingsTrailer' for PostingsBaseFormat. Pulsing will not know the longs size for each field until all the fields are consumed... and it should not write those longsSize to termsOut in close() since the term dictionary will use the DirTrailer hack here. (Maybe every term dictionary should close postingsWriter first, then write the field summary and close itself? I'm not sure though.) bq. Should we backport this to 4.x? Yeah, OK! > Lucene should have an entirely memory resident term dictionary > -- > > Key: LUCENE-3069 > URL: https://issues.apache.org/jira/browse/LUCENE-3069 > Project: Lucene - Core > Issue Type: Improvement > Components: core/index, core/search >Affects Versions: 4.0-ALPHA >Reporter: Simon Willnauer >Assignee: Han Jiang > Labels: gsoc2013 > Fix For: 5.0, 4.5 > > Attachments: df-ttf-estimate.txt, example.png, LUCENE-3069.patch, > LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, > LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, > LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, > LUCENE-3069.patch > > > FST based TermDictionary has been a great improvement yet it still uses a > delta codec file for scanning to terms. Some environments have enough memory > available to keep the entire FST based term dict in memory. We should add a > TermDictionary implementation that encodes all needed information for each > term into the FST (custom fst.Output) and builds a FST from the entire term > not just the delta. -- This message is automatically generated by JIRA. 
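For readers following the absolute/delta exchange above, a toy illustration of the contract being discussed. The names are invented and the real signature on the branch may differ; the point is only that the first term of a block (and, with pulsing, the first non-pulsed term) has no previous value to delta against:

{code}
// Toy illustration of the "boolean absolute" contract, invented names only.
// "longs" are the monotonic metadata values (e.g. file pointers) for one term.
private long[] lastLongs; // state carried across the terms of a block

void encodeTerm(long[] longs, DataOutput out, boolean absolute) throws IOException {
  for (int i = 0; i < longs.length; i++) {
    // the first term of a block -- or the first non-pulsed term -- is written
    // with absolute == true because there is no previous value to delta against
    long v = absolute ? longs[i] : longs[i] - lastLongs[i];
    out.writeVLong(v); // deltas are non-negative since the values are monotonic
  }
  lastLongs = longs.clone();
}
{code}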
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-5202) Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench.
[ https://issues.apache.org/jira/browse/SOLR-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760145#comment-13760145 ] Dawid Weiss edited comment on SOLR-5202 at 9/6/13 11:36 AM: Todo: the example should come with the default Carrot2 algorithms preconfigured (by name) and with sensible default attribute XMLs. The benefit is twofold - better out-of-the-box source for copy-pasting and a clear indication where overridden resources must be located. We should provide the defaults for Lingo, STC and kmeans perhaps. Another thing is that LEXICAL_RESOURCES_DIR no longer reflects the true purpose of that folder... perhaps it should be aliased to something more sensible. was (Author: dweiss): Todo: the example should come with the default Carrot2 algorithms preconfigured (by name) and with sensible default attribute XMLs. The benefit is twofold - better out-of-the-box source for copy-pasting and a clear indication where overridden resources must be located. We should provide the defaults for Lingo, STC and kmeans perhaps. > Support easier overrides of Carrot2 clustering attributes via XML data sets > exported from the Workbench. > > > Key: SOLR-5202 > URL: https://issues.apache.org/jira/browse/SOLR-5202 > Project: Solr > Issue Type: New Feature >Reporter: Dawid Weiss >Assignee: Dawid Weiss > Fix For: 4.5, 5.0 > > Attachments: SOLR-5202.patch > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5202) Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench.
[ https://issues.apache.org/jira/browse/SOLR-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760145#comment-13760145 ] Dawid Weiss commented on SOLR-5202: --- Todo: the example should come with the default Carrot2 algorithms preconfigured (by name) and with sensible default attribute XMLs. The benefit is twofold - better out-of-the-box source for copy-pasting and a clear indication where overridden resources must be located. We should provide the defaults for Lingo, STC and kmeans perhaps. > Support easier overrides of Carrot2 clustering attributes via XML data sets > exported from the Workbench. > > > Key: SOLR-5202 > URL: https://issues.apache.org/jira/browse/SOLR-5202 > Project: Solr > Issue Type: New Feature >Reporter: Dawid Weiss >Assignee: Dawid Weiss > Fix For: 4.5, 5.0 > > Attachments: SOLR-5202.patch > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5202) Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench.
[ https://issues.apache.org/jira/browse/SOLR-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss updated SOLR-5202: -- Attachment: SOLR-5202.patch > Support easier overrides of Carrot2 clustering attributes via XML data sets > exported from the Workbench. > > > Key: SOLR-5202 > URL: https://issues.apache.org/jira/browse/SOLR-5202 > Project: Solr > Issue Type: New Feature >Reporter: Dawid Weiss >Assignee: Dawid Weiss > Fix For: 4.5, 5.0 > > Attachments: SOLR-5202.patch > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4734) FastVectorHighlighter Overlapping Proximity Queries Do Not Highlight
[ https://issues.apache.org/jira/browse/LUCENE-4734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760138#comment-13760138 ] ASF subversion and git services commented on LUCENE-4734: - Commit 1520544 from [~jpountz] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1520544 ] Revert LUCENE-4734. > FastVectorHighlighter Overlapping Proximity Queries Do Not Highlight > > > Key: LUCENE-4734 > URL: https://issues.apache.org/jira/browse/LUCENE-4734 > Project: Lucene - Core > Issue Type: Bug > Components: modules/highlighter >Affects Versions: 4.0, 4.1, 5.0 >Reporter: Ryan Lauck >Assignee: Adrien Grand > Labels: fastvectorhighlighter, highlighter > Fix For: 5.0, 4.5 > > Attachments: LUCENE-4734-2.patch, lucene-4734.patch, LUCENE-4734.patch > > > If a proximity phrase query overlaps with any other query term it will not be > highlighted. > Example Text: A B C D E F G > Example Queries: > "B E"~10 D > (D will be highlighted instead of "B C D E") > "B E"~10 "C F"~10 > (nothing will be highlighted) > This can be traced to the FieldPhraseList constructor's inner while loop. > From the first example query, the first TermInfo popped off the stack will be > "B". The second TermInfo will be "D" which will not be found in the submap > for "B E"~10 and will trigger a failed match. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4734) FastVectorHighlighter Overlapping Proximity Queries Do Not Highlight
[ https://issues.apache.org/jira/browse/LUCENE-4734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760118#comment-13760118 ] ASF subversion and git services commented on LUCENE-4734: - Commit 1520536 from [~jpountz] in branch 'dev/trunk' [ https://svn.apache.org/r1520536 ] Revert LUCENE-4734. > FastVectorHighlighter Overlapping Proximity Queries Do Not Highlight > > > Key: LUCENE-4734 > URL: https://issues.apache.org/jira/browse/LUCENE-4734 > Project: Lucene - Core > Issue Type: Bug > Components: modules/highlighter >Affects Versions: 4.0, 4.1, 5.0 >Reporter: Ryan Lauck >Assignee: Adrien Grand > Labels: fastvectorhighlighter, highlighter > Fix For: 5.0, 4.5 > > Attachments: LUCENE-4734-2.patch, lucene-4734.patch, LUCENE-4734.patch > > > If a proximity phrase query overlaps with any other query term it will not be > highlighted. > Example Text: A B C D E F G > Example Queries: > "B E"~10 D > (D will be highlighted instead of "B C D E") > "B E"~10 "C F"~10 > (nothing will be highlighted) > This can be traced to the FieldPhraseList constructor's inner while loop. > From the first example query, the first TermInfo popped off the stack will be > "B". The second TermInfo will be "D" which will not be found in the submap > for "B E"~10 and will trigger a failed match. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-5101) make it easier to plugin different bitset implementations to CachingWrapperFilter
[ https://issues.apache.org/jira/browse/LUCENE-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand resolved LUCENE-5101. -- Resolution: Fixed Fix Version/s: 4.5 5.0 Committed, thanks Robert! > make it easier to plugin different bitset implementations to > CachingWrapperFilter > - > > Key: LUCENE-5101 > URL: https://issues.apache.org/jira/browse/LUCENE-5101 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Robert Muir > Fix For: 5.0, 4.5 > > Attachments: DocIdSetBenchmark.java, LUCENE-5101.patch, > LUCENE-5101.patch, LUCENE-5101.patch, LUCENE-5101.patch > > > Currently this is possible, but it's not so friendly: > {code} > protected DocIdSet docIdSetToCache(DocIdSet docIdSet, AtomicReader reader) > throws IOException { > if (docIdSet == null) { > // this is better than returning null, as the nonnull result can be > cached > return EMPTY_DOCIDSET; > } else if (docIdSet.isCacheable()) { > return docIdSet; > } else { > final DocIdSetIterator it = docIdSet.iterator(); > // null is allowed to be returned by iterator(), > // in this case we wrap with the sentinel set, > // which is cacheable. > if (it == null) { > return EMPTY_DOCIDSET; > } else { > /* INTERESTING PART */ > final FixedBitSet bits = new FixedBitSet(reader.maxDoc()); > bits.or(it); > return bits; > /* END INTERESTING PART */ > } > } > } > {code} > Is there any value to having all this other logic in the protected API? It > seems like something that's not useful for a subclass... Maybe this stuff can > become final, and "INTERESTING PART" calls a simpler method, something like: > {code} > protected DocIdSet cacheImpl(DocIdSetIterator iterator, AtomicReader reader) { > final FixedBitSet bits = new FixedBitSet(reader.maxDoc()); > bits.or(iterator); > return bits; > } > {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
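With the change in place, a subclass only needs to override the small hook from the issue description. A hedged example, assuming the WAH8DocIdSet utility class available in the same 4.5 timeframe; innerFilter is a placeholder:

{code}
// Hedged example: plug a compressed bitset into CachingWrapperFilter through
// the new hook. Assumes org.apache.lucene.util.WAH8DocIdSet from the same era.
CachingWrapperFilter filter = new CachingWrapperFilter(innerFilter) {
  @Override
  protected DocIdSet cacheImpl(DocIdSetIterator iterator, AtomicReader reader)
      throws IOException {
    WAH8DocIdSet.Builder builder = new WAH8DocIdSet.Builder();
    builder.add(iterator);  // consume the iterator into the compressed set
    return builder.build(); // cacheable, often much smaller than FixedBitSet
  }
};
{code}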
[jira] [Commented] (LUCENE-5101) make it easier to plugin different bitset implementations to CachingWrapperFilter
[ https://issues.apache.org/jira/browse/LUCENE-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760104#comment-13760104 ] ASF subversion and git services commented on LUCENE-5101: - Commit 1520527 from [~jpountz] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1520527 ] LUCENE-5101: Make it easier to plugin different bitset implementations to CachingWrapperFilter. > make it easier to plugin different bitset implementations to > CachingWrapperFilter > - > > Key: LUCENE-5101 > URL: https://issues.apache.org/jira/browse/LUCENE-5101 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Robert Muir > Attachments: DocIdSetBenchmark.java, LUCENE-5101.patch, > LUCENE-5101.patch, LUCENE-5101.patch, LUCENE-5101.patch > > > Currently this is possible, but it's not so friendly: > {code} > protected DocIdSet docIdSetToCache(DocIdSet docIdSet, AtomicReader reader) > throws IOException { > if (docIdSet == null) { > // this is better than returning null, as the nonnull result can be > cached > return EMPTY_DOCIDSET; > } else if (docIdSet.isCacheable()) { > return docIdSet; > } else { > final DocIdSetIterator it = docIdSet.iterator(); > // null is allowed to be returned by iterator(), > // in this case we wrap with the sentinel set, > // which is cacheable. > if (it == null) { > return EMPTY_DOCIDSET; > } else { > /* INTERESTING PART */ > final FixedBitSet bits = new FixedBitSet(reader.maxDoc()); > bits.or(it); > return bits; > /* END INTERESTING PART */ > } > } > } > {code} > Is there any value to having all this other logic in the protected API? It > seems like something that's not useful for a subclass... Maybe this stuff can > become final, and "INTERESTING PART" calls a simpler method, something like: > {code} > protected DocIdSet cacheImpl(DocIdSetIterator iterator, AtomicReader reader) { > final FixedBitSet bits = new FixedBitSet(reader.maxDoc()); > bits.or(iterator); > return bits; > } > {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-NightlyTests-trunk - Build # 372 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-trunk/372/ 1 tests failed. REGRESSION: org.apache.lucene.index.TestRollingUpdates.testRollingUpdates Error Message: Java heap space Stack Trace: java.lang.OutOfMemoryError: Java heap space at __randomizedtesting.SeedInfo.seed([5535725D2A0C4F09:DBCE73249992EAC2]:0) at org.apache.lucene.util.fst.BytesStore.<init>(BytesStore.java:62) at org.apache.lucene.util.fst.FST.<init>(FST.java:366) at org.apache.lucene.util.fst.FST.<init>(FST.java:301) at org.apache.lucene.codecs.memory.MemoryPostingsFormat$TermsReader.<init>(MemoryPostingsFormat.java:799) at org.apache.lucene.codecs.memory.MemoryPostingsFormat.fieldsProducer(MemoryPostingsFormat.java:861) at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.<init>(PerFieldPostingsFormat.java:194) at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat.fieldsProducer(PerFieldPostingsFormat.java:233) at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:128) at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:56) at org.apache.lucene.index.ReadersAndLiveDocs.getReader(ReadersAndLiveDocs.java:111) at org.apache.lucene.index.ReadersAndLiveDocs.getReadOnlyClone(ReadersAndLiveDocs.java:166) at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:97) at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:377) at org.apache.lucene.index.TestRollingUpdates.testRollingUpdates(TestRollingUpdates.java:113) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358) Build Log: [...truncated 282 lines...] [junit4] Suite: org.apache.lucene.index.TestRollingUpdates [junit4] 2> NOTE: download the large Jenkins line-docs file by running 'ant get-jenkins-line-docs' in the lucene directory. 
[junit4] 2> NOTE: reproduce with: ant test -Dtestcase=TestRollingUpdates -Dtests.method=testRollingUpdates -Dtests.seed=5535725D2A0C4F09 -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true -Dtests.linedocsfile=/home/hudson/lucene-data/enwiki.random.lines.txt -Dtests.locale=cs -Dtests.timezone=Etc/GMT -Dtests.file.encoding=US-ASCII [junit4] ERROR 21.9s J0 | TestRollingUpdates.testRollingUpdates <<< [junit4]> Throwable #1: java.lang.OutOfMemoryError: Java heap space [junit4]>at __randomizedtesting.SeedInfo.seed([5535725D2A0C4F09:DBCE73249992EAC2]:0) [junit4]>at org.apache.lucene.util.fst.BytesStore.<init>(BytesStore.java:62) [junit4]>at org.apache.lucene.util.fst.FST.<init>(FST.java:366) [junit4]>at org.apache.lucene.util.fst.FST.<init>(FST.java:301) [junit4]>at org.apache.lucene.codecs.memory.MemoryPostingsFormat$TermsReader.<init>(MemoryPostingsFormat.java:799) [junit4]>at org.apache.lucene.codecs.memory.MemoryPostingsFormat.fieldsProducer(MemoryPostingsFormat.java:861) [junit4]>at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.<init>(PerFieldPostingsFormat.java:194) [junit4]>at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat.fieldsProducer(PerFieldPostingsForma
[jira] [Updated] (SOLR-5217) CachedSqlEntity fails with stored procedure
[ https://issues.apache.org/jira/browse/SOLR-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hardik Upadhyay updated SOLR-5217: -- Attachment: db-data-config.xml > CachedSqlEntity fails with stored procedure > --- > > Key: SOLR-5217 > URL: https://issues.apache.org/jira/browse/SOLR-5217 > Project: Solr > Issue Type: Bug > Components: contrib - DataImportHandler >Reporter: Hardik Upadhyay > Attachments: db-data-config.xml > > > When using DIH with CachedSqlEntityProcessor and importing data from MS-sql > using stored procedures, it imports data for nested entities only once, and > then every call with different arguments for nested entities is served only > from cache. My db-data-config is attached. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5217) CachedSqlEntity fails with stored procedure
Hardik Upadhyay created SOLR-5217: - Summary: CachedSqlEntity fails with stored procedure Key: SOLR-5217 URL: https://issues.apache.org/jira/browse/SOLR-5217 Project: Solr Issue Type: Bug Components: contrib - DataImportHandler Reporter: Hardik Upadhyay When using DIH with CachedSqlEntityProcessor and importing data from MS-sql using stored procedures, it imports data for nested entities only once, and then every call with different arguments for nested entities is served only from cache. My db-data-config is attached. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5101) make it easier to plugin different bitset implementations to CachingWrapperFilter
[ https://issues.apache.org/jira/browse/LUCENE-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760102#comment-13760102 ] ASF subversion and git services commented on LUCENE-5101: - Commit 1520525 from [~jpountz] in branch 'dev/trunk' [ https://svn.apache.org/r1520525 ] LUCENE-5101: Make it easier to plugin different bitset implementations to CachingWrapperFilter. > make it easier to plugin different bitset implementations to > CachingWrapperFilter > - > > Key: LUCENE-5101 > URL: https://issues.apache.org/jira/browse/LUCENE-5101 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Robert Muir > Attachments: DocIdSetBenchmark.java, LUCENE-5101.patch, > LUCENE-5101.patch, LUCENE-5101.patch, LUCENE-5101.patch > > > Currently this is possible, but it's not so friendly: > {code} > protected DocIdSet docIdSetToCache(DocIdSet docIdSet, AtomicReader reader) > throws IOException { > if (docIdSet == null) { > // this is better than returning null, as the nonnull result can be > cached > return EMPTY_DOCIDSET; > } else if (docIdSet.isCacheable()) { > return docIdSet; > } else { > final DocIdSetIterator it = docIdSet.iterator(); > // null is allowed to be returned by iterator(), > // in this case we wrap with the sentinel set, > // which is cacheable. > if (it == null) { > return EMPTY_DOCIDSET; > } else { > /* INTERESTING PART */ > final FixedBitSet bits = new FixedBitSet(reader.maxDoc()); > bits.or(it); > return bits; > /* END INTERESTING PART */ > } > } > } > {code} > Is there any value to having all this other logic in the protected API? It > seems like something that's not useful for a subclass... Maybe this stuff can > become final, and "INTERESTING PART" calls a simpler method, something like: > {code} > protected DocIdSet cacheImpl(DocIdSetIterator iterator, AtomicReader reader) { > final FixedBitSet bits = new FixedBitSet(reader.maxDoc()); > bits.or(iterator); > return bits; > } > {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4734) FastVectorHighlighter Overlapping Proximity Queries Do Not Highlight
[ https://issues.apache.org/jira/browse/LUCENE-4734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760081#comment-13760081 ] Simon Willnauer commented on LUCENE-4734: - bq. The real question is: does it make more sense to invest time in LUCENE-2878 rather than further complicating FVH? FVH works great for simple phrase and single term queries but it has so many corner cases.. +1 lets do it +1 to revert the change! > FastVectorHighlighter Overlapping Proximity Queries Do Not Highlight > > > Key: LUCENE-4734 > URL: https://issues.apache.org/jira/browse/LUCENE-4734 > Project: Lucene - Core > Issue Type: Bug > Components: modules/highlighter >Affects Versions: 4.0, 4.1, 5.0 >Reporter: Ryan Lauck >Assignee: Adrien Grand > Labels: fastvectorhighlighter, highlighter > Fix For: 5.0, 4.5 > > Attachments: LUCENE-4734-2.patch, lucene-4734.patch, LUCENE-4734.patch > > > If a proximity phrase query overlaps with any other query term it will not be > highlighted. > Example Text: A B C D E F G > Example Queries: > "B E"~10 D > (D will be highlighted instead of "B C D E") > "B E"~10 "C F"~10 > (nothing will be highlighted) > This can be traced to the FieldPhraseList constructor's inner while loop. > From the first example query, the first TermInfo popped off the stack will be > "B". The second TermInfo will be "D" which will not be found in the submap > for "B E"~10 and will trigger a failed match. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5201) UIMAUpdateRequestProcessor should reuse the AnalysisEngine
[ https://issues.apache.org/jira/browse/SOLR-5201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760038#comment-13760038 ] Jun Ohtani commented on SOLR-5201: -- Thanks Tommaso. Sorry, I misunderstood the relationship between UIMAUpdateRequestProcessorFactory and AnalysisEngine. My co-worker used this patch, and it works without problems. Will you commit the above patch to branch_4x? > UIMAUpdateRequestProcessor should reuse the AnalysisEngine > -- > > Key: SOLR-5201 > URL: https://issues.apache.org/jira/browse/SOLR-5201 > Project: Solr > Issue Type: Improvement > Components: contrib - UIMA >Affects Versions: 4.4 >Reporter: Tommaso Teofili >Assignee: Tommaso Teofili > Fix For: 4.5, 5.0 > > Attachments: SOLR-5201-ae-cache-every-request_branch_4x.patch, > SOLR-5201-ae-cache-only-single-request_branch_4x.patch > > > As reported in http://markmail.org/thread/2psiyl4ukaejl4fx > UIMAUpdateRequestProcessor instantiates an AnalysisEngine for each request, > which is bad for performance; therefore it'd be nice if such AEs could be > reused whenever that's possible. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
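A minimal sketch of the reuse idea under discussion, assuming a descriptor field on the factory and the standard UIMA entry point; note that a real implementation would also have to deal with AnalysisEngine.process() not being thread-safe, e.g. by pooling engines:

{code}
// Hedged sketch: create the AnalysisEngine once per factory instead of once per
// request. Assumes org.apache.uima.UIMAFramework and an aeDescription field
// holding the parsed ResourceSpecifier; names are illustrative only.
private volatile AnalysisEngine cachedAE;

AnalysisEngine getAnalysisEngine() throws ResourceInitializationException {
  AnalysisEngine ae = cachedAE;
  if (ae == null) {
    synchronized (this) {
      if (cachedAE == null) {
        cachedAE = UIMAFramework.produceAnalysisEngine(aeDescription);
      }
      ae = cachedAE;
    }
  }
  return ae; // callers must still serialize process() calls or pool engines
}
{code}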
[jira] [Commented] (LUCENE-5057) Hunspell stemmer generates multiple tokens
[ https://issues.apache.org/jira/browse/LUCENE-5057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760034#comment-13760034 ] Lukas Vlcek commented on LUCENE-5057: - Agreed, Chris. Thanks. > Hunspell stemmer generates multiple tokens > -- > > Key: LUCENE-5057 > URL: https://issues.apache.org/jira/browse/LUCENE-5057 > Project: Lucene - Core > Issue Type: Improvement >Affects Versions: 4.3 >Reporter: Luca Cavanna >Assignee: Adrien Grand > > The hunspell stemmer seems to be generating multiple tokens: the original > token plus the available stems. > It might be a good thing in some cases but it seems to be a different > behaviour compared to the other stemmers and causes problems as well. I would > rather have an option to decide whether it should output only the available > stems, or the stems plus the original token. I'm not sure though if it's > possible to have only a single stem indexed, which would be even better in my > opinion. When I look at how snowball works only one token is indexed, the > stem, and that works great. Probably there's something I'm missing in how > hunspell works. > Here is my issue: I have a query composed of multiple terms, which is > analyzed using stemming and a boolean query is generated out of it. All fine > when adding all clauses as should (OR operator), but if I add all clauses as > must (AND operator), then I can get back only the documents that contain the > stem originating from exactly the same original word. > Example from the Dutch language I'm working with: fiets (means bicycle in > Dutch), its plural is fietsen. > If I index "fietsen" I get both "fietsen" and "fiets" indexed, but if I index > "fiets" I get only "fiets" indexed. > When I query for "fietsen whatever" I get the following boolean query: > field:fiets field:fietsen field:whatever. > If I apply the AND operator and use must clauses for each subquery, then I > can only find the documents that originally contained "fietsen", not the ones > that originally contained "fiets", which is not really what stemming is about. > Any thoughts on this? I also wonder if it can be a dictionary issue since I > see that different words that have the word "fiets" as root don't get the > same stems, and using the AND operator at query time is a big issue. > I would love to contribute on this and looking forward to your feedback. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
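The query-side effect described above can be sketched independently of Hunspell (Lucene 4.x query API; the field name and terms come from the example in the issue):

{code}
// Sketch of the reported problem. Analysis of the query "fietsen whatever"
// yields three terms: fiets, fietsen, whatever.
BooleanQuery bq = new BooleanQuery();
bq.add(new TermQuery(new Term("field", "fiets")), BooleanClause.Occur.MUST);
bq.add(new TermQuery(new Term("field", "fietsen")), BooleanClause.Occur.MUST);
bq.add(new TermQuery(new Term("field", "whatever")), BooleanClause.Occur.MUST);
// A document that originally contained only "fiets" never indexed the token
// "fietsen", so the MUST clause excludes it; with Occur.SHOULD it would match.
{code}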
[jira] [Commented] (SOLR-3765) Wrong handling of documents with same id in cross collection searches
[ https://issues.apache.org/jira/browse/SOLR-3765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760023#comment-13760023 ] Furkan KAMACI commented on SOLR-3765: - Has anything been done on this issue? > Wrong handling of documents with same id in cross collection searches > - > > Key: SOLR-3765 > URL: https://issues.apache.org/jira/browse/SOLR-3765 > Project: Solr > Issue Type: Bug > Components: search, SolrCloud >Affects Versions: 4.0 > Environment: Self-built version of Solr from 4.x branch (revision ) >Reporter: Per Steffensen > Labels: collections, inconsistency, numFound, search > > Dialog with myself from solr-users mailing list: > Per Steffensen wrote: > {quote} > Hi > Due to what we have seen in recent tests I got in doubt how Solr search is > actually supposed to behave > * Searching with "distrib=true&q=*:*&rows=10&collection=x,y,z&sort=timestamp > asc" > ** Is Solr supposed to return the 10 documents with the lowest timestamp > across all documents in all slices of collection x, y and z, or is it > supposed to just pick 10 random documents from those slices and just sort > those 10 randomly selected documents? > ** Put another way - is this search supposed to be consistent, returning > exactly the same set of documents when performed several times (no documents > are updated between consecutive searches)? > {quote} > Fortunately I believe the answer is that it ought to "return the 10 > documents with the lowest timestamp across all documents in all slices of > collection x, y and Z". The reason I asked was because I got different > responses for consecutive similar requests. Now I believe it can be explained > by the bug described below. I guess when you do cross-collection/shard > searches, the "request-handling" Solr node forwards the query to all involved > shards simultaneously and merges sub-results into the final result as they are > returned from the shards. Because of the "consider documents with same id as > the same document even though they come from different collections" bug it is > kinda random (depending on which shards respond first/last), for a given id, > which collection the document with that specific id is taken from. And if > documents with the same id from different collections have different timestamps, > it is random where that document ends up in the final sorted result. > So I believe this inconsistency can be explained by the bug described below. > {quote} > * A search returns a "numFound" field telling how many documents all in all > match the search criteria, even though not all those documents are returned > by the search. It is a crazy question to ask, but I will do it anyway because > we actually see a problem with this. Isn't it correct that two searches which > differ only in the "rows" number (documents to be returned) should always > return the same value for "numFound"? > {quote} > Well I found out myself what the problem is (or seems to be) - see: > http://lucene.472066.n3.nabble.com/Changing-value-of-start-parameter-affects-numFound-td2460645.html > http://lucene.472066.n3.nabble.com/numFound-inconsistent-for-different-rows-param-td3997269.html > http://lucene.472066.n3.nabble.com/Solr-v3-5-0-numFound-changes-when-paging-through-results-on-8-shard-cluster-td3990400.html > Until 4.0 this "bug" could be "ignored" because it was ok for a cross-shard > search to consider documents with identical ids as duplicates and therefore > return/count only one of them. 
It is still, in 4.0, ok within the same > collection, but across collections identical ids should not be considered > duplicates and should not reduce documents returned/counted. So I believe > this "feature" has now become a bug in 4.0 when it comes to cross-collection > searches. > {quote} > Thanks! > Regards, Steff > {quote} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
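A toy model (invented, not Solr's actual merge code) of the duplicate handling the reporter describes, which is correct within one collection but wrong across collections:

{code}
// Toy model of the merge step, invented for illustration (Hit is a stand-in
// for a per-shard search result carrying the uniqueKey value as "id").
Set<String> seenIds = new HashSet<String>();
List<Hit> merged = new ArrayList<Hit>();
for (Hit hit : hitsInSortOrder) {  // assumed already interleaved in sort order
  if (seenIds.add(hit.id)) {       // same id => treated as the same document,
    merged.add(hit);               // even when it came from another collection,
  }                                // so a duplicate vanishes and numFound shifts
}
{code}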
[jira] [Commented] (LUCENE-2562) Make Luke a Lucene/Solr Module
[ https://issues.apache.org/jira/browse/LUCENE-2562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760019#comment-13760019 ] Ajay Bhat commented on LUCENE-2562: --- The TokenStream reset call was needed to display the tokens generated by the Analyzer. I think that's the only change that was required. The main problem for me is that the analyzers above are not giving the result, which I've been looking into. I had figured that since PatternAnalyzer is deprecated it would not give the result, and so it might be a good idea to remove it from the list of analyzers. But there are also some analyzers that aren't deprecated, like the Snowball Analyzer and QueryAutoStopWordAnalyzer. Also, as per the schedule of my proposal, I've done some work on the themes of the application. I'll contribute another patch for that soon. > Make Luke a Lucene/Solr Module > -- > > Key: LUCENE-2562 > URL: https://issues.apache.org/jira/browse/LUCENE-2562 > Project: Lucene - Core > Issue Type: Task >Reporter: Mark Miller > Labels: gsoc2013 > Attachments: LUCENE-2562.patch, luke1.jpg, luke2.jpg, luke3.jpg, > Luke-ALE-1.png, Luke-ALE-2.png, Luke-ALE-3.png, Luke-ALE-4.png, Luke-ALE-5.png > > > see > "RE: Luke - in need of maintainer": > http://markmail.org/message/m4gsto7giltvrpuf > "Web-based Luke": http://markmail.org/message/4xwps7p7ifltme5q > I think it would be great if there was a version of Luke that always worked > with trunk - and it would also be great if it was easier to match Luke jars > with Lucene versions. > While I'd like to get GWT Luke into the mix as well, I think the easiest > starting point is to straight port Luke to another UI toolkit before > abstracting out DTO objects that both GWT Luke and Pivot Luke could share. > I've started slowly converting Luke's use of thinlet to Apache Pivot. I > haven't/don't have a lot of time for this at the moment, but I've plugged > away here and there over the past week or two. There is still a *lot* to do. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
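For context, the standard Lucene 4.x consumption loop that makes the reset() call mandatory (the analyzer, field name and text are placeholders):

{code}
// Standard TokenStream consumption pattern: reset() is required before
// incrementToken() since Lucene 4.x, which is why the change was needed
// to display tokens at all.
TokenStream ts = analyzer.tokenStream("field", new StringReader(text));
CharTermAttribute termAtt = ts.addAttribute(CharTermAttribute.class);
try {
  ts.reset();
  while (ts.incrementToken()) {
    System.out.println(termAtt.toString());
  }
  ts.end();
} finally {
  ts.close();
}
{code}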
[jira] [Commented] (SOLR-4817) Solr should not fall back to the back compat built in solr.xml in SolrCloud mode.
[ https://issues.apache.org/jira/browse/SOLR-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760008#comment-13760008 ] Shalin Shekhar Mangar commented on SOLR-4817: - Just fyi, the copyMinConf, copyMinFullSetup and copySolrHomeToTemp methods throw the following exception with Solrj tests: {quote} junit4] ERROR 0.69s | MultiCoreExampleJettyTest.testDeleteInstanceDir <<< [junit4]> Throwable #1: java.lang.RuntimeException: Cannot find resource: /Users/shalinmangar/work/oss/solr-trunk/solr/build/solr-solrj/test/J0/solr/collection1 [junit4]>at __randomizedtesting.SeedInfo.seed([2AFBC83FDA207BB2:4160F4A68E96AEF0]:0) [junit4]>at org.apache.solr.SolrTestCaseJ4.getFile(SolrTestCaseJ4.java:1571) [junit4]>at org.apache.solr.SolrTestCaseJ4.TEST_HOME(SolrTestCaseJ4.java:1576) [junit4]>at org.apache.solr.SolrTestCaseJ4.copyMinConf(SolrTestCaseJ4.java:1618) [junit4]>at org.apache.solr.SolrTestCaseJ4.copyMinConf(SolrTestCaseJ4.java:1603) [junit4]>at org.apache.solr.client.solrj.embedded.MultiCoreExampleJettyTest.testDeleteInstanceDir(MultiCoreExampleJettyTest.java:117) {quote} You can reproduce the error above with the patch in SOLR-5023 > Solr should not fall back to the back compat built in solr.xml in SolrCloud > mode. > - > > Key: SOLR-4817 > URL: https://issues.apache.org/jira/browse/SOLR-4817 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Reporter: Mark Miller >Assignee: Erick Erickson >Priority: Minor > Fix For: 4.5, 5.0 > > Attachments: SOLR-4817.patch, SOLR-4817.patch, SOLR-4817.patch, > SOLR-4817.patch, SOLR-4817.patch > > > A hard error is much more useful, and this built in solr.xml is not very good > for solrcloud - with the old style solr.xml with cores in it, you won't have > persistence and with the new style, it's not really ideal either. > I think it makes it easier to debug solr.home to fail on this instead - but > just in solrcloud mode for now due to back compat. We might want to pull the > whole internal solr.xml for 5.0. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org