[jira] [Commented] (LUCENE-5375) ToChildBlockJoinQuery becomes crazy on wrong subquery
[ https://issues.apache.org/jira/browse/LUCENE-5375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855493#comment-13855493 ]
Dr Oleg Savrasov commented on LUCENE-5375:

Hi Michael,

Many thanks for reviewing the patch. I agree that it's rather a Lucene issue and should be covered by appropriate tests. I see your point about adding cost for correct usage. If I enable assertions without adding validateParents, testAdvanceValidationForToChildBjq always fails, which means that there could be another way to validate the query. Let me investigate it.

Thank you,
Dr Oleg Savrasov, Community Coordinator, Grid Dynamics Search team

ToChildBlockJoinQuery becomes crazy on wrong subquery
Key: LUCENE-5375
URL: https://issues.apache.org/jira/browse/LUCENE-5375
Project: Lucene - Core
Issue Type: Bug
Components: modules/join
Affects Versions: 4.6
Reporter: Dr Oleg Savrasov
Labels: patch
Attachments: SOLR-5553.patch
Original Estimate: 24h
Remaining Estimate: 24h

If a user supplies a wrong subquery to ToParentBlockJoinQuery, it reasonably throws IllegalStateException (http://lucene.apache.org/core/4_0_0/join/org/apache/lucene/search/join/ToParentBlockJoinQuery.html 'The child documents must be orthogonal to the parent documents: the wrapped child query must never return a parent document.'). However, ToChildBlockJoinQuery just goes crazy silently. I want to provide a simple patch for ToChildBlockJoinQuery with an if-throw clause and a test. See http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201311.mbox/%3cf415ce3a-ebe5-4d15-adf1-c5ead32a1...@sheffield.ac.uk%3E

--
This message was sent by Atlassian JIRA (v6.1.5#6160)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
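The "if-throw clause" mentioned in the issue could look roughly like the sketch below. This is not the attached patch and does not use Lucene's classes: ParentDocValidator, validateParentDoc and the use of java.util.BitSet to stand in for the parents filter are all assumptions made for illustration. The idea is only that a doc ID returned by the wrapped query is checked against the set of known parent documents, and a mismatch fails loudly instead of silently producing wrong results.

```java
import java.util.BitSet;

// Hypothetical helper, not part of Lucene: illustrates an if-throw check
// that a doc matched by the wrapped query really is a parent document.
final class ParentDocValidator {
    private ParentDocValidator() {}

    /** Throws IllegalStateException if docId is not flagged as a parent in parentBits. */
    static void validateParentDoc(BitSet parentBits, int docId) {
        // Guard negative IDs first: BitSet.get() rejects them with a different exception.
        if (docId < 0 || !parentBits.get(docId)) {
            throw new IllegalStateException(
                "wrapped query matched doc " + docId + ", which is not a parent document");
        }
    }

    public static void main(String[] args) {
        BitSet parents = new BitSet();
        parents.set(3); // in a block index, the parent is the last doc of each block
        parents.set(7);

        validateParentDoc(parents, 7); // a real parent: passes silently
        try {
            validateParentDoc(parents, 5); // a child doc: rejected
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

As the comment above notes, a per-document check like this adds cost even for correct usage, which is why running it only under assertions is being discussed.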
[JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.7.0) - Build # 1146 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/1146/ Java: 64bit/jdk1.7.0 -XX:-UseCompressedOops -XX:+UseSerialGC All tests passed Build Log: [...truncated 9986 lines...] [junit4] JVM J0: stderr was not empty, see: /Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/build/solr-core/test/temp/junit4-J0-20131223_11_163.syserr [junit4] JVM J0: stderr (verbatim) [junit4] java(213,0x13cf9) malloc: *** error for object 0x13cf7ee12: pointer being freed was not allocated [junit4] *** set a breakpoint in malloc_error_break to debug [junit4] JVM J0: EOF [...truncated 1 lines...] [junit4] ERROR: JVM J0 ended with an exception, command line: /Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Home/jre/bin/java -XX:-UseCompressedOops -XX:+UseSerialGC -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/heapdumps -Dtests.prefix=tests -Dtests.seed=4A403B75D3778235 -Xmx512M -Dtests.iters= -Dtests.verbose=false -Dtests.infostream=false -Dtests.codec=random -Dtests.postingsformat=random -Dtests.docvaluesformat=random -Dtests.locale=random -Dtests.timezone=random -Dtests.directory=random -Dtests.linedocsfile=europarl.lines.txt.gz -Dtests.luceneMatchVersion=5.0 -Dtests.cleanthreads=perClass -Djava.util.logging.config.file=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/tools/junit4/logging.properties -Dtests.nightly=false -Dtests.weekly=false -Dtests.slow=true -Dtests.asserts.gracious=false -Dtests.multiplier=1 -DtempDir=. -Djava.io.tmpdir=. 
-Djunit4.tempDir=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/build/solr-core/test/temp -Dclover.db.dir=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/clover/db -Djava.security.manager=org.apache.lucene.util.TestSecurityManager -Djava.security.policy=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/tools/junit4/tests.policy -Dlucene.version=5.0-SNAPSHOT -Djetty.testMode=1 -Djetty.insecurerandom=1 -Dsolr.directoryFactory=org.apache.solr.core.MockDirectoryFactory -Djava.awt.headless=true -Djdk.map.althashing.threshold=0 -Dtests.disableHdfs=true -Dfile.encoding=UTF-8 -classpath
[jira] [Commented] (SOLR-5525) deprecate ClusterState#getCollectionStates()
[ https://issues.apache.org/jira/browse/SOLR-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855569#comment-13855569 ]
ASF subversion and git services commented on SOLR-5525:
Commit 1553095 from [~noble.paul] in branch 'dev/trunk' [ https://svn.apache.org/r1553095 ]
SOLR-5525 use hasCollection()

deprecate ClusterState#getCollectionStates()
Key: SOLR-5525
URL: https://issues.apache.org/jira/browse/SOLR-5525
Project: Solr
Issue Type: Sub-task
Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
Attachments: SOLR-5525.patch, SOLR-5525.patch

This is a very expensive call if there are a large number of collections. Mostly, it is used to check whether a collection exists.
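To make the cost argument concrete, here is a minimal sketch of the difference between asking for every collection's state and asking whether one collection exists. ClusterStateSketch is a hypothetical stand-in, not Solr's ClusterState; it borrows only the two method names from the issue, and it assumes that building the full map is the expensive part.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a cluster state holder; not Solr's ClusterState.
final class ClusterStateSketch {
    private final Map<String, String> collectionStates;

    ClusterStateSketch(Map<String, String> states) {
        this.collectionStates = new LinkedHashMap<>(states);
    }

    /** Expensive style: materializes every collection's state just to hand back a view. */
    Map<String, String> getCollectionStates() {
        return Collections.unmodifiableMap(new LinkedHashMap<>(collectionStates));
    }

    /** Cheap style: a single key lookup, no per-collection work. */
    boolean hasCollection(String name) {
        return collectionStates.containsKey(name);
    }

    public static void main(String[] args) {
        Map<String, String> states = new LinkedHashMap<>();
        states.put("collection1", "active");
        ClusterStateSketch cs = new ClusterStateSketch(states);

        // Old pattern: build the whole map, then test membership.
        System.out.println(cs.getCollectionStates().containsKey("collection1"));
        // Preferred pattern for existence checks.
        System.out.println(cs.hasCollection("collection1"));
    }
}
```

Both calls give the same answer; the second simply avoids touching state for collections the caller never asked about.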
[jira] [Commented] (SOLR-5525) deprecate ClusterState#getCollectionStates()
[ https://issues.apache.org/jira/browse/SOLR-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855570#comment-13855570 ]
ASF subversion and git services commented on SOLR-5525:
Commit 1553096 from [~noble.paul] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1553096 ]
SOLR-5525 use hasCollection()
[jira] [Updated] (SOLR-5473) Make one state.json per collection
[ https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Noble Paul updated SOLR-5473:
Attachment: SOLR-5473.patch

updated to latest trunk

Make one state.json per collection
Key: SOLR-5473
URL: https://issues.apache.org/jira/browse/SOLR-5473
Project: Solr
Issue Type: Sub-task
Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
Attachments: SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch

As defined in the parent issue, store the states of each collection under /collections/collectionname/state.json node
[jira] [Updated] (SOLR-5476) Roles per node
[ https://issues.apache.org/jira/browse/SOLR-5476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Noble Paul updated SOLR-5476:
Description:
In a very large cluster the Overseer is likely to be overloaded. If the same node is also serving a few other shards, the Overseer can be slowed down by GC pauses or simply by too much work. If the cluster is really large, it is possible to dedicate high-end hardware to Overseers.
It works as a new collection admin command:
command=assignRole&whitelist=overseer&node=node1_name&node=node2_name
If a node is whitelisted for overseer, it gets preference over others when the overseer election takes place. If no whitelisted servers are available, another random node will be picked.

was:
In a very large cluster the Overseer is likely to be overloaded. If the same node is also serving a few other shards, the Overseer can be slowed down by GC pauses or simply by too much work. If the cluster is really large, it is possible to dedicate high-end hardware to Overseers.
It works as a new collection admin command:
command=assignRole&whitelist=overseer&blacklist=leader&blacklist=replica&node=node_name
If a node is whitelisted for overseer, it gets preference over others when the overseer election takes place. If no whitelisted servers are available, another random node will be picked.
If the node is blacklisted for leader/replica, it won't be assigned any new shards.

Roles per node
Key: SOLR-5476
URL: https://issues.apache.org/jira/browse/SOLR-5476
Project: Solr
Issue Type: Sub-task
Components: SolrCloud
Reporter: Noble Paul
[jira] [Updated] (SOLR-5476) Overseer Role for nodes
[ https://issues.apache.org/jira/browse/SOLR-5476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Noble Paul updated SOLR-5476:
Summary: Overseer Role for nodes (was: Roles per node)
[jira] [Updated] (SOLR-5476) Overseer Role for nodes
[ https://issues.apache.org/jira/browse/SOLR-5476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Noble Paul updated SOLR-5476:
Description:
In a very large cluster the Overseer is likely to be overloaded. If the same node is also serving a few other shards, the Overseer can be slowed down by GC pauses or simply by too much work. If the cluster is really large, it is possible to dedicate high-end hardware to Overseers.
It works as a new collection admin command:
command=assignRole&whitelist=overseer&node=node1_name&node=node2_name
This results in the creation of an entry in /roles.json in ZK, which would look like the following:
{ overseer : { } }
If a node is whitelisted for overseer, it gets preference over others when the overseer election takes place. If no whitelisted servers are available, another random node will be picked.

was:
In a very large cluster the Overseer is likely to be overloaded. If the same node is also serving a few other shards, the Overseer can be slowed down by GC pauses or simply by too much work. If the cluster is really large, it is possible to dedicate high-end hardware to Overseers.
It works as a new collection admin command:
command=assignRole&whitelist=overseer&node=node1_name&node=node2_name
If a node is whitelisted for overseer, it gets preference over others when the overseer election takes place. If no whitelisted servers are available, another random node will be picked.
[jira] [Updated] (LUCENE-5376) Add a demo search server
[ https://issues.apache.org/jira/browse/LUCENE-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless updated LUCENE-5376:
Attachment: lucene-demo-server.tgz

I'm attaching the current sources (tgz archive). They are standalone now, but to add them to Lucene I think we should put them under lucene/demo or lucene/server or something. It uses custom (Python) build scripts, because I became frustrated with ant; after extracting, {{python3 build.py test}} should run the tests.

These are just the sources for the server side of the http://jirasearch.mikemccandless.com app.

There are many issues to fix, e.g. cut back to ant (there are some old ant build scripts there) and use only one JSON parser (it uses two now), but it does support a number of basic indexing/search APIs: add/update document/s, bulk add/update documents, suggest, search/After, block joins, highlighting, live field values, snapshots, basic index statistics (for diagnostics).

It has limited support for plugins, but I'm tempted to remove that before committing. The only plugin it has now is Tika, to crack binary documents into text for indexing.

Add a demo search server
Key: LUCENE-5376
URL: https://issues.apache.org/jira/browse/LUCENE-5376
Project: Lucene - Core
Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
Attachments: lucene-demo-server.tgz

I think it'd be useful to have a demo search server for Lucene. Rather than being fully featured, like Solr, it would be minimal, just wrapping the existing Lucene modules to show how you can make use of these features in a server setting. The purpose is to demonstrate how one can build a minimal search server on top of APIs like SearcherManager, SearcherLifetimeManager, etc. This is also useful for finding rough edges / issues in Lucene's APIs that make building a server unnecessarily hard.

I don't think it should have back compatibility promises (except Lucene's index back compatibility), so it's free to improve as Lucene's APIs change. As a starting point, I'll post what I built for the "eating your own dog food" search app for Lucene's and Solr's jira issues http://jirasearch.mikemccandless.com (blog: http://blog.mikemccandless.com/2013/05/eating-dog-food-with-lucene.html ). It uses Netty to expose basic indexing and searching APIs via JSON, but it's very rough (lots of nocommits).
[jira] [Assigned] (SOLR-5476) Overseer Role for nodes
[ https://issues.apache.org/jira/browse/SOLR-5476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Noble Paul reassigned SOLR-5476:
Assignee: Noble Paul

Overseer Role for nodes
Key: SOLR-5476
URL: https://issues.apache.org/jira/browse/SOLR-5476
Project: Solr
Issue Type: Sub-task
Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul

In a very large cluster the Overseer is likely to be overloaded. If the same node is also serving a few other shards, the Overseer can be slowed down by GC pauses or simply by too much work. If the cluster is really large, it is possible to dedicate high-end hardware to Overseers.
It works as a new collection admin command:
command=assignRole&whitelist=overseer&node=192.168.1.5:8983_solr&node=192.168.1.6:8983_solr
This results in the creation of an entry in /roles.json in ZK, which would look like the following:
{ overseer : { whitelist:[192.168.1.5:8983_solr, 192.168.1.6:8983_solr] } }
If a node is whitelisted for overseer, it gets preference over others when the overseer election takes place. If no whitelisted servers are available, another random node would become the Overseer. Later on, if one of the whitelisted nodes is brought up, it would take over the Overseer role from the current Overseer to become the Overseer of the system.
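The election preference described in this issue can be sketched as below. This is an illustration of the stated rule only, not Solr's election code: OverseerPickSketch and pickOverseer are hypothetical names, and taking the first live whitelisted node in whitelist order is an assumption, since the description only says whitelisted nodes "get preference".

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of the preference rule; not Solr's actual election logic.
final class OverseerPickSketch {
    private OverseerPickSketch() {}

    /**
     * Prefers a live node from the whitelist; if none of the whitelisted
     * nodes is live, falls back to a random live node.
     */
    static String pickOverseer(List<String> liveNodes, List<String> whitelist, Random random) {
        if (liveNodes.isEmpty()) {
            throw new IllegalArgumentException("no live nodes to elect from");
        }
        for (String node : whitelist) {
            if (liveNodes.contains(node)) {
                return node; // assumption: first live whitelisted node wins
            }
        }
        return liveNodes.get(random.nextInt(liveNodes.size()));
    }

    public static void main(String[] args) {
        List<String> whitelist = Arrays.asList("192.168.1.5:8983_solr", "192.168.1.6:8983_solr");
        List<String> live = Arrays.asList("192.168.1.9:8983_solr", "192.168.1.6:8983_solr");
        System.out.println(pickOverseer(live, whitelist, new Random()));
    }
}
```

The takeover behaviour in the last sentence of the description would amount to re-running this choice whenever a whitelisted node joins the live set while a non-whitelisted node holds the role.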
[jira] [Updated] (SOLR-5476) Overseer Role for nodes
[ https://issues.apache.org/jira/browse/SOLR-5476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Noble Paul updated SOLR-5476:
Description:
In a very large cluster the Overseer is likely to be overloaded. If the same node is also serving a few other shards, the Overseer can be slowed down by GC pauses or simply by too much work. If the cluster is really large, it is possible to dedicate high-end hardware to Overseers.
It works as a new collection admin command:
command=assignRole&whitelist=overseer&node=192.168.1.5:8983_solr&node=192.168.1.6:8983_solr
This results in the creation of an entry in /roles.json in ZK, which would look like the following:
{ overseer : { whitelist:[192.168.1.5:8983_solr, 192.168.1.6:8983_solr] } }
If a node is whitelisted for overseer, it gets preference over others when the overseer election takes place. If no whitelisted servers are available, another random node would become the Overseer. Later on, if one of the whitelisted nodes is brought up, it would take over the Overseer role from the current Overseer to become the Overseer of the system.

was:
In a very large cluster the Overseer is likely to be overloaded. If the same node is also serving a few other shards, the Overseer can be slowed down by GC pauses or simply by too much work. If the cluster is really large, it is possible to dedicate high-end hardware to Overseers.
It works as a new collection admin command:
command=assignRole&whitelist=overseer&node=node1_name&node=node2_name
This results in the creation of an entry in /roles.json in ZK, which would look like the following:
{ overseer : { } }
If a node is whitelisted for overseer, it gets preference over others when the overseer election takes place. If no whitelisted servers are available, another random node will be picked.
[jira] [Commented] (LUCENE-5376) Add a demo search server
[ https://issues.apache.org/jira/browse/LUCENE-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855634#comment-13855634 ]
Han Jiang commented on LUCENE-5376:

+1, it will be great to have an 'active' demo to show the features :)

I think we should remove those hardcoded classpaths, e.g. in post.py:30?

And will this demo be expected to be the same as jirasearch? Will we need further configuration to get the demo website working? For example, I cannot find search.py in the source code.
[jira] [Commented] (LUCENE-5376) Add a demo search server
[ https://issues.apache.org/jira/browse/LUCENE-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855664#comment-13855664 ]
Michael McCandless commented on LUCENE-5376:

Thanks Han.

bq. I think we should remove those hardcoded classpaths, e.g. in post.py:30?

Good catch, I'll fix that ... that's a minimal example of how to issue commands to the server to create an index and register a few fields, from a Python client.

bq. And will this demo be expected to be the same as jirasearch? Will we need further configuration to get the demo website working? For example, I cannot find search.py in the source code.

These sources are just for the server side; I didn't include the jirasearch UI/indexing sources. But I agree it would be useful to have that too, i.e. an example search app/UI that runs against this server. I'll think about how to fold it in ...
[jira] [Commented] (LUCENE-5376) Add a demo search server
[ https://issues.apache.org/jira/browse/LUCENE-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855686#comment-13855686 ]
Yonik Seeley commented on LUCENE-5376:

I think there are plenty of lucene-based search servers already in existence... We shouldn't bloat lucene/solr even further by adding yet another. Something like this belongs as a separate project (collaborate on github with whoever else wants to build/maintain this).
[jira] [Updated] (SOLR-5567) ZkController getHostAddress duplicates url prefix
[ https://issues.apache.org/jira/browse/SOLR-5567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Shalin Shekhar Mangar updated SOLR-5567:
Attachment: SOLR-5567.patch

Trivial patch

ZkController getHostAddress duplicates url prefix
Key: SOLR-5567
URL: https://issues.apache.org/jira/browse/SOLR-5567
Project: Solr
Issue Type: Bug
Components: SolrCloud
Affects Versions: 4.6
Reporter: Kyle Halliday
Priority: Minor
Attachments: SOLR-5567.patch
Original Estimate: 5m
Remaining Estimate: 5m

The ZkController getHostAddress method will return a URL with a duplicated url prefix if given an input string that already includes a url prefix, e.g. given the input "http://127.0.0.1", it will return "http://http://127.0.0.1".
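One way to avoid the duplicated prefix is to prepend the scheme only when the input does not already carry one. The sketch below shows that idea in isolation; it is not the attached patch and not ZkController's code, and HostAddressSketch and withUrlPrefix are hypothetical names. Treating "https://" as an already-prefixed input is also an assumption.

```java
// Hypothetical sketch of prefix handling; not the actual ZkController code.
final class HostAddressSketch {
    private static final String HTTP = "http://";
    private static final String HTTPS = "https://";

    private HostAddressSketch() {}

    /** Returns host unchanged if it already has a URL prefix, otherwise prepends "http://". */
    static String withUrlPrefix(String host) {
        if (host.startsWith(HTTP) || host.startsWith(HTTPS)) {
            return host;
        }
        return HTTP + host;
    }

    public static void main(String[] args) {
        System.out.println(withUrlPrefix("127.0.0.1"));        // bare host gets the prefix
        System.out.println(withUrlPrefix("http://127.0.0.1")); // prefixed host is left alone
    }
}
```

A small unit test over both inputs would cover the regression described in the issue.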
[jira] [Commented] (SOLR-5567) ZkController getHostAddress duplicates url prefix
[ https://issues.apache.org/jira/browse/SOLR-5567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855735#comment-13855735 ]
Mark Miller commented on SOLR-5567:

We should add a little test too.
[jira] [Commented] (SOLR-5574) CoreContainer shutdown publishes all nodes as down and waits to see that and then again publishes all nodes as down.
[ https://issues.apache.org/jira/browse/SOLR-5574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855786#comment-13855786 ]
ASF subversion and git services commented on SOLR-5574:
Commit 1553157 from [~markrmil...@gmail.com] in branch 'dev/trunk' [ https://svn.apache.org/r1553157 ]
SOLR-5574: CoreContainer shutdown publishes all nodes as down and waits to see that and then again publishes all nodes as down.

CoreContainer shutdown publishes all nodes as down and waits to see that and then again publishes all nodes as down.
Key: SOLR-5574
URL: https://issues.apache.org/jira/browse/SOLR-5574
Project: Solr
Issue Type: Bug
Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
Fix For: 5.0, 4.7, 4.6.1
Attachments: SOLR-5574.patch

The first publish and wait doesn't really serve any purpose.
[jira] [Commented] (SOLR-5574) CoreContainer shutdown publishes all nodes as down and waits to see that and then again publishes all nodes as down.
[ https://issues.apache.org/jira/browse/SOLR-5574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855788#comment-13855788 ]
ASF subversion and git services commented on SOLR-5574:
Commit 1553158 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1553158 ]
SOLR-5574: CoreContainer shutdown publishes all nodes as down and waits to see that and then again publishes all nodes as down.
[jira] [Commented] (SOLR-5573) ChaosMonkey should randomly turn off Solr's commit on shutdown option.
[ https://issues.apache.org/jira/browse/SOLR-5573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855790#comment-13855790 ]
ASF subversion and git services commented on SOLR-5573:
Commit 1553159 from [~markrmil...@gmail.com] in branch 'dev/trunk' [ https://svn.apache.org/r1553159 ]
SOLR-5573: ChaosMonkey should randomly turn off Solr's commit on shutdown option.

ChaosMonkey should randomly turn off Solr's commit on shutdown option.
Key: SOLR-5573
URL: https://issues.apache.org/jira/browse/SOLR-5573
Project: Solr
Issue Type: Test
Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller
Fix For: 5.0, 4.7

Because we don't have a great way to kill (everything is in the same JVM), this is very important for testing tlog replays on startup.
[jira] [Commented] (SOLR-5573) ChaosMonkey should randomly turn off Solr's commit on shutdown option.
[ https://issues.apache.org/jira/browse/SOLR-5573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855792#comment-13855792 ] ASF subversion and git services commented on SOLR-5573: --- Commit 1553161 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1553161 ] SOLR-5573: ChaosMonkey should randomly turn off Solr's commit on shutdown option. ChaosMonkey should randomly turn off Solr's commit on shutdown option. -- Key: SOLR-5573 URL: https://issues.apache.org/jira/browse/SOLR-5573 Project: Solr Issue Type: Test Components: SolrCloud Reporter: Mark Miller Assignee: Mark Miller Fix For: 5.0, 4.7 Because we don't have a great way to kill (everything in the same JVM), this is very important for testing tlog replays on startup.
Re: Lucene / Solr 4.6.1
Some 4.6.1 bugs were resolved over the weekend, so I pushed this off. Now it's the holidays and whatnot, so this is probably another week out, I'd guess. I'm going to back port a bunch of stuff over the next few days. - Mark On Fri, Dec 20, 2013 at 9:19 AM, Mark Miller markrmil...@gmail.com wrote: Hey, yeah, sorry about the lack of activity on this. Been kind of sick this week. Hope to jump on this soon though. - Mark On Dec 20, 2013, at 9:17 AM, Jan Høydahl jan@cominvent.com wrote: I added a new Version 4.6.1 to the Solr and Lucene JIRA projects. https://issues.apache.org/jira/browse/SOLR-5564 is another low-risk fix candidate for 4.6.1 -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 16 Dec 2013 at 20:02, Joel Bernstein joels...@gmail.com wrote: Sounds great On Mon, Dec 16, 2013 at 1:47 PM, Mark Miller markrmil...@gmail.com wrote: Cool - let’s back port this week and I’ll put up an RC on Saturday? - Mark On Dec 16, 2013, at 1:25 PM, Joel Bernstein joels...@gmail.com wrote: +1 I would like to get out some safe bug fixes to the CollapsingQParserPlugin. On Tue, Dec 3, 2013 at 11:04 AM, Mark Miller markrmil...@gmail.com wrote: I’d be willing to push a 4.6.1 in a couple weeks - I’d like to get a bunch of bug fixes out in a low risk upgrade. - Mark -- Joel Bernstein Search Engineer at Heliosearch
Iterating BinaryDocValues
Hi, I'm looking for a faster way to perform large-scale docId -> BytesRef lookups for BinaryDocValues. I'm finding that I can't get the performance that I need from the random access seek in the BinaryDocValues interface. I'm wondering if sequentially scanning the docValues would be a faster approach. I have a BitSet of matching docs, so if I sequentially moved through the docValues I could test each one against that bitset. I'm wondering whether that approach would be faster for bulk extracts, and how tricky it would be to add an iterator to the BinaryDocValues interface. Thanks, Joel
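The access pattern described above can be sketched as follows. This is only an illustration under stated assumptions: `BinaryLookup` is a hypothetical stand-in for a per-document binary value source (it is not the Lucene BinaryDocValues API), and `DocValuesScan` and `collect` are invented names. The sketch walks the set bits of the BitSet in ascending docId order, so lookups are issued in increasing order rather than at random; whether that is actually faster depends on the underlying codec and is not shown here.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

final class DocValuesScan {
    /** Hypothetical stand-in for a per-document binary value lookup. */
    interface BinaryLookup {
        byte[] get(int docId);
    }

    /** Visits matching docs in ascending docId order and collects their values. */
    static List<byte[]> collect(BitSet matches, BinaryLookup values) {
        List<byte[]> out = new ArrayList<>(matches.cardinality());
        // nextSetBit returns -1 once no further bit is set.
        for (int doc = matches.nextSetBit(0); doc >= 0; doc = matches.nextSetBit(doc + 1)) {
            out.add(values.get(doc));
        }
        return out;
    }
}
```

Driving the loop from the BitSet rather than from the values means only matching documents are touched; a true iterator on the value source would instead advance through every document and test each one against the bitset.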
[jira] [Updated] (SOLR-2960) XPathEntityProcessor does not clear nulls from empty multi-valued fields
[ https://issues.apache.org/jira/browse/SOLR-2960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Dyer updated SOLR-2960: - Attachment: SOLR-2960.patch Here is an update of Michael Watts's patch for current Trunk and also a unit test. I plan to commit this soon. XPathEntityProcessor does not clear nulls from empty multi-valued fields Key: SOLR-2960 URL: https://issues.apache.org/jira/browse/SOLR-2960 Project: Solr Issue Type: Bug Components: contrib - DataImportHandler Reporter: Michael Watts Assignee: James Dyer Priority: Minor Attachments: SOLR-2960.patch, SOLR-2960.patch I can't confidently say I completely understand all that these classes so boldly tackle (that is, XPathEntityProcessor and XPathRecordReader), but there may be someone who does. Nonetheless, I think I've got some or most of this right, and more likely there are more someones like that. So, I won't qualify everything I say with a maybe -- let this be the refactoring of those. Whenever mapping an XML file into a Solr Index, within the XPathRecordReader (used by the XPathEntityProcessor within the DataImportHandler), if (A) a field is perceived to be null and is multivalued, it is pushed a value of null (on top of any other values it previously had). Otherwise (B) for multivalued fields, any found value is pushed onto its existing list of values, and the field is marked as found within the frame (a.k.a. record). In general, when the end-tag of a record is seen, (C) the XPathRecordReader clears all of the values of the fields which have been marked as found, as tidiness is a value and they are supposedly no longer useful. However, suppose that for a given record and multivalued field, a value is never found (though it may have been found for other fields in the record): only (A) will have occurred, never will (B) have occurred, the field will never have been marked as found, and thus (C) never will have occurred for the field. So, the field will remain, with its list of nulls. 
This list of nulls will grow until either the last record or a non-null value is seen. And so, (1) an out-of-memory error may occur, given sufficiently many records and a mortal computer. Moreover, (2), a transformer cannot reliably depend on the number of nulls in the field (and this information cannot be guaranteed to be determined by some other value). I will try to provide more information, if this seems an issue and if there doesn't seem to be an answer. At this point, if I understand the problem correctly, it seems the answer is to 'mark' those null fields, considering 'null' an added value.
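The accumulation described in (A) to (C) can be reproduced with a small model. This is a simplified sketch, not the XPathRecordReader code: `RecordFrame`, `push`, `endRecord` and the `markNulls` switch are hypothetical names introduced only to show the behaviour. With `markNulls` set to false a null push does not mark the field, so the end-of-record cleanup skips it and the nulls pile up; with `markNulls` set to true a null counts as an added value, which is the fix suggested in the report.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class RecordFrame {
    private final Map<String, List<Object>> values = new HashMap<>();
    private final Set<String> touched = new HashSet<>();
    private final boolean markNulls; // hypothetical switch: treat null as an added value

    RecordFrame(boolean markNulls) {
        this.markNulls = markNulls;
    }

    /** Steps (A) and (B): push a value; mark the field unless it is an unmarked null. */
    void push(String field, Object value) {
        values.computeIfAbsent(field, k -> new ArrayList<>()).add(value);
        if (value != null || markNulls) {
            touched.add(field);
        }
    }

    /** Step (C): at the end of a record, clear only the fields that were marked. */
    void endRecord() {
        for (String field : touched) {
            values.get(field).clear();
        }
        touched.clear();
    }

    int size(String field) {
        List<Object> list = values.get(field);
        return list == null ? 0 : list.size();
    }
}
```

Pushing a null for the same field across several records and calling `endRecord()` after each one leaves one stale null per record when `markNulls` is false, and an empty list when it is true.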
[jira] [Created] (SOLR-5576) ZkController.java registerAllCoresAsDown multiple cores logic
Christine Poerschke created SOLR-5576: - Summary: ZkController.java registerAllCoresAsDown multiple cores logic Key: SOLR-5576 URL: https://issues.apache.org/jira/browse/SOLR-5576 Project: Solr Issue Type: Bug Reporter: Christine Poerschke The behaviour we saw was that considerable time elapsed between different cores within the same solr instance publishing themselves as down. Separately it appears from the code that some cores would not be published as down if another core returns from the function early because it will be its shard leader (return vs. continue in for loop).
[jira] [Updated] (SOLR-5576) ZkController.java registerAllCoresAsDown multiple cores logic
[ https://issues.apache.org/jira/browse/SOLR-5576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christine Poerschke updated SOLR-5576: -- Attachment: SOLR-5576.patch Attaching patch to separate publish-as-down and waitForLeaderToSeeDownState into separate for loops. Also replacing return with continue when waitForLeaderToSeeDownState call can be skipped. ZkController.java registerAllCoresAsDown multiple cores logic - Key: SOLR-5576 URL: https://issues.apache.org/jira/browse/SOLR-5576 Project: Solr Issue Type: Bug Reporter: Christine Poerschke Attachments: SOLR-5576.patch The behaviour we saw was that considerable time elapsed between different cores within the same solr instance publishing themselves as down. Separately it appears from the code that some cores would not be published as down if another core returns from the function early because it will be its shard leader (return vs. continue in for loop).
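The shape of the change described in the patch comment can be sketched like this. It is a simplified model, not the ZkController code or the attached patch: `DownPublisher`, `Core` and the logged strings are hypothetical, and the real publish and wait calls are replaced by log entries. The first loop publishes every core as down before any waiting starts; the second loop uses `continue` for a core that can skip the wait, where a `return` would have left the remaining cores unprocessed.

```java
import java.util.ArrayList;
import java.util.List;

final class DownPublisher {
    /** Hypothetical stand-in for a core and whether it will be its shard leader. */
    static final class Core {
        final String name;
        final boolean willBeLeader;

        Core(String name, boolean willBeLeader) {
            this.name = name;
            this.willBeLeader = willBeLeader;
        }
    }

    /** Returns the ordered list of actions that would be taken. */
    static List<String> registerAllAsDown(List<Core> cores) {
        List<String> log = new ArrayList<>();
        // Loop 1: publish all cores as down first, so none is delayed by another core's wait.
        for (Core core : cores) {
            log.add("publish-down:" + core.name);
        }
        // Loop 2: wait only where needed; continue (not return) keeps later cores in play.
        for (Core core : cores) {
            if (core.willBeLeader) {
                continue;
            }
            log.add("wait-for-leader:" + core.name);
        }
        return log;
    }
}
```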
[jira] [Created] (SOLR-5577) indexing delay due to zookeeper election
Christine Poerschke created SOLR-5577: - Summary: indexing delay due to zookeeper election Key: SOLR-5577 URL: https://issues.apache.org/jira/browse/SOLR-5577 Project: Solr Issue Type: Bug Reporter: Christine Poerschke The behaviour we observed was that a zookeeper election took about 2s plus 1.5s for reading the zoo_data snapshot. During this time solr tried to establish connections to any zookeeper in the ensemble but only succeeded once a leader was elected *and* that leader was done reading the snapshot. Solr document updates were slowed down during this time window. Is this expected to happen during and shortly after elections, that is zookeeper closing existing connections, rejecting new connections and thus stalling solr updates? Other than avoiding zookeeper elections, are there ways of reducing their impact on solr?
[jira] [Updated] (SOLR-5577) indexing delay due to zookeeper election
[ https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christine Poerschke updated SOLR-5577: -- Description: The behaviour we observed was that a zookeeper election took about 2s plus 1.5s for reading the zoo_data snapshot. During this time solr tried to establish connections to any zookeeper in the ensemble but only succeeded once a leader was elected *and* that leader was done reading the snapshot. Solr document updates were slowed down during this time window. Is this expected to happen during and shortly after elections, that is zookeeper closing existing connections, rejecting new connections and thus stalling solr updates? Other than avoiding zookeeper elections, are there ways of reducing their impact on solr? +zookeeper log extract+ {noformat} 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ... 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941 08:18:56,918 [FileSnap.java:83] Reading snapshot ... ... 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from ... 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client ... (no session established for client) ... 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ... {noformat} +solr log extract+ {noformat} 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from server sessionid ... likely server has closed socket, closing socket connection and attempting reconnect 08:18:55,068 [ConnectionManager.java:72] Watcher org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent state:Disconnected type:None path:null path:null type:None 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected ... 
08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ... 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ... 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from server sessionid ... likely server has closed socket, closing socket connection and attempting reconnect ... 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ... 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ... 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ... ... 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ... 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ... 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ... ... 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ... 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ... 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ... ... 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ... 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ... 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server ... sessionid = ..., negotiated timeout = ... 08:18:58,751 ... [ConnectionManager.java:72] Watcher org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent state:SyncConnected type:None path:null path:null type:None {noformat} was: The behaviour we observed was that a zookeeper election took about 2s plus 1.5s for reading the zoo_data snapshot. During this time solr tried to establish connections to any zookeeper in the ensemble but only succeeded once a leader was elected *and* that leader was done reading the snapshot. Solr document updates were slowed down during this time window. 
Is this expected to happen during and shortly after elections, that is zookeeper closing existing connections, rejecting new connections and thus stalling solr updates? Other than avoiding zookeeper elections, are there ways of reducing their impact on solr? indexing delay due to zookeeper election Key: SOLR-5577 URL: https://issues.apache.org/jira/browse/SOLR-5577 Project: Solr Issue Type: Bug Reporter: Christine Poerschke The behaviour we observed was that a zookeeper election took about 2s plus 1.5s for reading the zoo_data snapshot. During this time solr tried to establish connections to any zookeeper in the ensemble but only succeeded once a leader was elected *and* that leader was done reading the snapshot. Solr document updates were slowed down during this time window. Is this expected to happen during and shortly after elections, that is zookeeper closing existing connections,
[jira] [Assigned] (SOLR-5576) ZkController.java registerAllCoresAsDown multiple cores logic
[ https://issues.apache.org/jira/browse/SOLR-5576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller reassigned SOLR-5576: - Assignee: Mark Miller ZkController.java registerAllCoresAsDown multiple cores logic - Key: SOLR-5576 URL: https://issues.apache.org/jira/browse/SOLR-5576 Project: Solr Issue Type: Bug Reporter: Christine Poerschke Assignee: Mark Miller Fix For: 5.0, 4.7, 4.6.1 Attachments: SOLR-5576.patch The behaviour we saw was that considerable time elapsed between different cores within the same solr instance publishing themselves as down. Separately it appears from the code that some cores would not be published as down if another core returns from the function early because it will be its shard leader (return vs. continue in for loop).
[jira] [Updated] (SOLR-5576) ZkController.java registerAllCoresAsDown multiple cores logic
[ https://issues.apache.org/jira/browse/SOLR-5576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-5576: -- Fix Version/s: 4.6.1 4.7 5.0 Issue Type: Improvement (was: Bug) ZkController.java registerAllCoresAsDown multiple cores logic - Key: SOLR-5576 URL: https://issues.apache.org/jira/browse/SOLR-5576 Project: Solr Issue Type: Improvement Reporter: Christine Poerschke Assignee: Mark Miller Fix For: 5.0, 4.7, 4.6.1 Attachments: SOLR-5576.patch The behaviour we saw was that considerable time elapsed between different cores within the same solr instance publishing themselves as down. Separately it appears from the code that some cores would not be published as down if another core returns from the function early because it will be its shard leader (return vs. continue in for loop).
[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election
[ https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855835#comment-13855835 ] Mark Miller commented on SOLR-5577: --- Our model should be able to handle this better. Some off-the-cuff remarks: * Our model should be fine with turning off updates only after the connection with zk is lost for a while, rather than the moment it's noticed. * Even if we didn't want to relax the above, we should be able to handle this case better - if the issue is that ZooKeeper is actually unavailable, we won't get new leaders or anything anyway, so no reason to be too concerned about turning off updates. Not sure how easy that is to detect though. indexing delay due to zookeeper election Key: SOLR-5577 URL: https://issues.apache.org/jira/browse/SOLR-5577 Project: Solr Issue Type: Bug Reporter: Christine Poerschke The behaviour we observed was that a zookeeper election took about 2s plus 1.5s for reading the zoo_data snapshot. During this time solr tried to establish connections to any zookeeper in the ensemble but only succeeded once a leader was elected *and* that leader was done reading the snapshot. Solr document updates were slowed down during this time window. Is this expected to happen during and shortly after elections, that is zookeeper closing existing connections, rejecting new connections and thus stalling solr updates? Other than avoiding zookeeper elections, are there ways of reducing their impact on solr?
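The first remark in Mark Miller's comment on SOLR-5577 (turn off updates only after the ZooKeeper connection has been lost for a while) could be modelled roughly as below. This is a hypothetical sketch, not Solr code: `UpdateGate`, its methods and the grace period are all assumptions, timestamps are passed in explicitly so the behaviour is easy to follow, and it says nothing about whether the relaxation is safe for consistency.

```java
final class UpdateGate {
    private final long graceMillis;   // hypothetical tolerance for a lost connection
    private long disconnectedAt = -1; // -1 means currently connected

    UpdateGate(long graceMillis) {
        this.graceMillis = graceMillis;
    }

    /** Records the first moment the connection was seen as lost. */
    void onDisconnected(long nowMillis) {
        if (disconnectedAt < 0) {
            disconnectedAt = nowMillis;
        }
    }

    void onReconnected() {
        disconnectedAt = -1;
    }

    /** Updates stay enabled while connected, or while the outage is shorter than the grace period. */
    boolean updatesAllowed(long nowMillis) {
        return disconnectedAt < 0 || nowMillis - disconnectedAt < graceMillis;
    }
}
```

With a grace period of a few seconds, an outage of roughly 3.5 s like the one reported (about 2 s election plus 1.5 s snapshot read) would not disable updates in this model.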
[jira] [Commented] (SOLR-5576) ZkController.java registerAllCoresAsDown multiple cores logic
[ https://issues.apache.org/jira/browse/SOLR-5576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13855852#comment-13855852 ] ASF subversion and git services commented on SOLR-5576: --- Commit 1553178 from [~markrmil...@gmail.com] in branch 'dev/trunk' [ https://svn.apache.org/r1553178 ] SOLR-5576: Improve concurrency when registering and waiting for all SolrCore's to register a DOWN state. ZkController.java registerAllCoresAsDown multiple cores logic - Key: SOLR-5576 URL: https://issues.apache.org/jira/browse/SOLR-5576 Project: Solr Issue Type: Improvement Reporter: Christine Poerschke Assignee: Mark Miller Fix For: 5.0, 4.7, 4.6.1 Attachments: SOLR-5576.patch The behaviour we saw was that considerable time elapsed between different cores within the same solr instance publishing themselves as down. Separately it appears from the code that some cores would not be published as down if another core returns from the function early because it will be its shard leader (return vs. continue in for loop). -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5576) ZkController.java registerAllCoresAsDown multiple cores logic
[ https://issues.apache.org/jira/browse/SOLR-5576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13855853#comment-13855853 ] ASF subversion and git services commented on SOLR-5576: --- Commit 1553179 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1553179 ] SOLR-5576: Improve concurrency when registering and waiting for all SolrCore's to register a DOWN state. ZkController.java registerAllCoresAsDown multiple cores logic - Key: SOLR-5576 URL: https://issues.apache.org/jira/browse/SOLR-5576 Project: Solr Issue Type: Improvement Reporter: Christine Poerschke Assignee: Mark Miller Fix For: 5.0, 4.7, 4.6.1 Attachments: SOLR-5576.patch The behaviour we saw was that considerable time elapsed between different cores within the same solr instance publishing themselves as down. Separately it appears from the code that some cores would not be published as down if another core returns from the function early because it will be its shard leader (return vs. continue in for loop). -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13855859#comment-13855859 ] ASF subversion and git services commented on SOLR-1301: --- Commit 1553184 from [~markrmil...@gmail.com] in branch 'dev/trunk' [ https://svn.apache.org/r1553184 ] SOLR-1301: Ignore this test on Windows - there is a problem with Windows paths and Morphlines. Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce. - Key: SOLR-1301 URL: https://issues.apache.org/jira/browse/SOLR-1301 Project: Solr Issue Type: New Feature Reporter: Andrzej Bialecki Assignee: Mark Miller Fix For: 5.0, 4.7 Attachments: README.txt, SOLR-1301-hadoop-0-20.patch, SOLR-1301-hadoop-0-20.patch, SOLR-1301-maven-intellij.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SolrRecordWriter.java, commons-logging-1.0.4.jar, commons-logging-api-1.0.4.jar, hadoop-0.19.1-core.jar, hadoop-0.20.1-core.jar, hadoop-core-0.20.2-cdh3u3.jar, hadoop.patch, log4j-1.2.15.jar This patch contains a contrib module that provides distributed indexing (using Hadoop) to Solr EmbeddedSolrServer. The idea behind this module is twofold: * provide an API that is familiar to Hadoop developers, i.e. that of OutputFormat * avoid unnecessary export and (de)serialization of data maintained on HDFS. SolrOutputFormat consumes data produced by reduce tasks directly, without storing it in intermediate files. Furthermore, by using an EmbeddedSolrServer, the indexing task is split into as many parts as there are reducers, and the data to be indexed is not sent over the network. 
Design -- Key/value pairs produced by reduce tasks are passed to SolrOutputFormat, which in turn uses SolrRecordWriter to write this data. SolrRecordWriter instantiates an EmbeddedSolrServer, and it also instantiates an implementation of SolrDocumentConverter, which is responsible for turning Hadoop (key, value) into a SolrInputDocument. This data is then added to a batch, which is periodically submitted to EmbeddedSolrServer. When reduce task completes, and the OutputFormat is closed, SolrRecordWriter calls commit() and optimize() on the EmbeddedSolrServer. The API provides facilities to specify an arbitrary existing solr.home directory, from which the conf/ and lib/ files will be taken. This process results in the creation of as many partial Solr home directories as there were reduce tasks. The output shards are placed in the output directory on the default filesystem (e.g. HDFS). Such part-N directories can be used to run N shard servers. Additionally, users can specify the number of reduce tasks, in particular 1 reduce task, in which case the output will consist of a single shard. An example application is provided that processes large CSV files and uses this API. It uses a custom CSV processing to avoid (de)serialization overhead. This patch relies on hadoop-core-0.19.1.jar - I attached the jar to this issue, you should put it in contrib/hadoop/lib. Note: the development of this patch was sponsored by an anonymous contributor and approved for release under Apache License. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
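The write path described in the Design section (convert, add to a batch, submit the batch, then commit and optimize on close) can be sketched as follows. This is an illustration only, not the SolrRecordWriter from the patch: `BatchingWriter` and `Sink` are hypothetical names, `Sink` stands in for the embedded server, and flushing on a fixed batch size is an assumption, since the description only says batches are submitted periodically.

```java
import java.util.ArrayList;
import java.util.List;

final class BatchingWriter<D> {
    /** Hypothetical stand-in for the embedded server that receives batches. */
    interface Sink<D> {
        void add(List<D> batch);
        void commit();
        void optimize();
    }

    private final Sink<D> sink;
    private final int batchSize; // assumed flush trigger
    private final List<D> batch = new ArrayList<>();

    BatchingWriter(Sink<D> sink, int batchSize) {
        this.sink = sink;
        this.batchSize = batchSize;
    }

    /** Buffers one document and submits the batch once it is full. */
    void write(D doc) {
        batch.add(doc);
        if (batch.size() >= batchSize) {
            flush();
        }
    }

    /** Submits any remaining documents, then commits and optimizes once. */
    void close() {
        flush();
        sink.commit();
        sink.optimize();
    }

    private void flush() {
        if (batch.isEmpty()) {
            return;
        }
        sink.add(new ArrayList<>(batch)); // hand over a copy so the buffer can be reused
        batch.clear();
    }
}
```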
[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election
[ https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855874#comment-13855874 ] Anshum Gupta commented on SOLR-5577: A shard might update its state while zk is away (and no one else knows about it), and perhaps we're trying to avoid any such cases by rejecting the updates for as long as zk is unavailable. There might be concerns about consistency if we get any less strict, or so I think. indexing delay due to zookeeper election Key: SOLR-5577 URL: https://issues.apache.org/jira/browse/SOLR-5577 Project: Solr Issue Type: Bug Reporter: Christine Poerschke The behaviour we observed was that a zookeeper election took about 2s plus 1.5s for reading the zoo_data snapshot. During this time solr tried to establish connections to any zookeeper in the ensemble but only succeeded once a leader was elected *and* that leader was done reading the snapshot. Solr document updates were slowed down during this time window. Is this expected to happen during and shortly after elections, that is zookeeper closing existing connections, rejecting new connections and thus stalling solr updates? Other than avoiding zookeeper elections, are there ways of reducing their impact on solr?
[JENKINS-MAVEN] Lucene-Solr-Maven-4.x #541: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-4.x/541/ 1 tests failed. FAILED: org.apache.solr.cloud.BasicDistributedZk2Test.testDistribSearch Error Message: No live SolrServers available to handle this request:[http://127.0.0.1:53250/collection1, http://127.0.0.1:40770/collection1] Stack Trace: org.apache.solr.client.solrj.SolrServerException: No live SolrServers available to handle this request:[http://127.0.0.1:53250/collection1, http://127.0.0.1:40770/collection1] at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:495) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:199) at org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(LBHttpSolrServer.java:283) at org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:640) at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:90) at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:301) at org.apache.solr.cloud.AbstractFullDistribZkTestBase.queryServer(AbstractFullDistribZkTestBase.java:1325) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:542) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:521) at org.apache.solr.cloud.BasicDistributedZk2Test.brindDownShardIndexSomeDocsAndRecover(BasicDistributedZk2Test.java:305) at org.apache.solr.cloud.BasicDistributedZk2Test.doTest(BasicDistributedZk2Test.java:117) Build Log: [...truncated 38321 lines...] 
BUILD FAILED /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-4.x/build.xml:482: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-4.x/build.xml:176: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-4.x/extra-targets.xml:77: Java returned: 1 Total time: 99 minutes 57 seconds Build step 'Invoke Ant' marked build as failure Recording test results Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election
[ https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13855877#comment-13855877 ] Ramkumar Aiyengar commented on SOLR-5577: - The proposal here is to only relax updating the collection for a while when ZK connection is lost. If a shard updates its state, wouldn't the Overseer have to process the state update? That would still continue to fail till ZK comes back up. indexing delay due to zookeeper election Key: SOLR-5577 URL: https://issues.apache.org/jira/browse/SOLR-5577 Project: Solr Issue Type: Bug Reporter: Christine Poerschke The behaviour we observed was that a zookeeper election took about 2s plus 1.5s for reading the zoo_data snapshot. During this time solr tried to establish connections to any zookeeper in the ensemble but only succeeded once a leader was elected *and* that leader was done reading the snapshot. Solr document updates were slowed down during this time window. Is this expected to happen during and shortly after elections, that is zookeeper closing existing connections, rejecting new connections and thus stalling solr updates? Other than avoiding zookeeper elections, are there ways of reducing their impact on solr? +zookeeper log extract+ {noformat} 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ... 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941 08:18:56,918 [FileSnap.java:83] Reading snapshot ... ... 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from ... 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client ... (no session established for client) ... 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ... {noformat} +solr log extract+ {noformat} 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from server sessionid ... 
likely server has closed socket, closing socket connection and attempting reconnect 08:18:55,068 [ConnectionManager.java:72] Watcher org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent state:Disconnected type:None path:null path:null type:None 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected ... 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ... 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ... 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from server sessionid ... likely server has closed socket, closing socket connection and attempting reconnect ... 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ... 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ... 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ... ... 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ... 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ... 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ... ... 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ... 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ... 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ... ... 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ... 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ... 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server ... sessionid = ..., negotiated timeout = ... 08:18:58,751 ... [ConnectionManager.java:72] Watcher org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection Watcher:host1:port1,host2:port2,host3:port3,... 
got event WatchedEvent state:SyncConnected type:None path:null path:null type:None {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
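The timings quoted in the report can be cross-checked against the log timestamps above. The following is a minimal sketch, not part of the original report: the class name `ElectionTimeline` and the helper `millisBetween` are made up for illustration, and the timestamps are copied from the log extracts as given.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, for illustration only: measures the gaps between
// timestamps copied from the zookeeper and solr log extracts.
final class ElectionTimeline {
    // Log timestamps look like 08:18:54,968 (comma before the milliseconds).
    private static final DateTimeFormatter LOG_TIME =
            DateTimeFormatter.ofPattern("HH:mm:ss,SSS");

    private ElectionTimeline() {}

    /** Milliseconds elapsed between two same-day log timestamps. */
    static long millisBetween(String from, String to) {
        LocalTime start = LocalTime.parse(from, LOG_TIME);
        LocalTime end = LocalTime.parse(to, LOG_TIME);
        return Duration.between(start, end).toMillis();
    }

    public static void main(String[] args) {
        // "Connection broken" to "LEADING": roughly the election itself.
        System.out.println("election ms:      " + millisBetween("08:18:54,968", "08:18:56,916"));
        // "Reading snapshot" to "Snapshotting": roughly the snapshot read.
        System.out.println("snapshot read ms: " + millisBetween("08:18:56,918", "08:18:58,496"));
        // Solr side: first read failure to SyncConnected.
        System.out.println("solr outage ms:   " + millisBetween("08:18:54,968", "08:18:58,751"));
    }
}
```

Under these assumptions the three gaps come out at 1948 ms, 1578 ms and 3783 ms, which is consistent with the reported "about 2s plus 1.5s" and with the 1941 ms election time that ZooKeeper logs itself.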
[jira] [Commented] (SOLR-5567) ZkController getHostAddress duplicates url prefix
[ https://issues.apache.org/jira/browse/SOLR-5567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855898#comment-13855898 ] Mark Miller commented on SOLR-5567: --- Note: Alexey spotted this and provided a fix in SOLR-3854 as well. It was never addressed. ZkController getHostAddress duplicates url prefix - Key: SOLR-5567 URL: https://issues.apache.org/jira/browse/SOLR-5567 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.6 Reporter: Kyle Halliday Priority: Minor Attachments: SOLR-5567.patch Original Estimate: 5m Remaining Estimate: 5m The ZkController getHostAddress method will return a URL with duplicated url prefix if given an input string already including a url prefix. e.g. given the input "http://127.0.0.1", it will return "http://http://127.0.0.1" -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
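The behaviour described in SOLR-5567 can be illustrated with a small guard that adds a scheme only when the input has none. This is a sketch under stated assumptions, not the attached SOLR-5567.patch and not the actual ZkController code: the class `HostUrls` and the method `withScheme` are hypothetical names, and only the `http://` and `https://` prefixes are considered.

```java
import java.util.Locale;

// Hypothetical illustration of a prefix-safe host normalizer; this is NOT
// the code from ZkController or from the SOLR-5567 patch.
final class HostUrls {
    private HostUrls() {}

    /** Returns the host with a scheme, prepending "http://" only if none is present. */
    static String withScheme(String host) {
        String trimmed = host.trim();
        String lower = trimmed.toLowerCase(Locale.ROOT);
        if (lower.startsWith("http://") || lower.startsWith("https://")) {
            return trimmed; // already prefixed, do not duplicate the scheme
        }
        return "http://" + trimmed;
    }

    public static void main(String[] args) {
        System.out.println(withScheme("127.0.0.1"));
        System.out.println(withScheme("http://127.0.0.1"));
    }
}
```

With this guard both calls in `main` yield the same `http://127.0.0.1` address, instead of the doubled prefix reported in the issue.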
[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election
[ https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13855908#comment-13855908 ] Mark Miller commented on SOLR-5577: --- bq. Solr document updates were slowed down during this time window. That's interesting - slowed down? Not rejected? That is more surprising to me...need to do some code review.
[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election
[ https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13855910#comment-13855910 ] Mark Miller commented on SOLR-5577: --- bq. If a shard updates its state, wouldn't the Overseer have to process the state update? Correct. And there is already some window as well - I'm not sure it matters that the window is a little larger due to the possibly very small change in probability of it being an issue. We have to think about it carefully, but for the most part, this is just a preventative measure to make sure some node is not going rogue for a long period of time with a cached, stale cluster state and no connection to ZooKeeper for some reason, but perhaps a connection to other nodes. I think we always intended to think about ways to relax it, but when putting things together initially, it was faster/easier to just lock this down as much as possible to start.
[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election
[ https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13855912#comment-13855912 ] Mark Miller commented on SOLR-5577: --- P.S. I don't know that it's the right solution for the issue yet either - just spit balling at this point.
[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election
[ https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13855916#comment-13855916 ] Mark Miller commented on SOLR-5577: --- bq. That's interesting - slowed down? Not rejected? That is more surprising to me...need to do some code review. Okay, bad memory. Looking at the code, this makes sense. Those updates wait up to the session expiration before they would end up erroring...a slowdown makes sense. We have to be able to improve this.
[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election
[ https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13855919#comment-13855919 ] Mark Miller commented on SOLR-5577: --- So I think we already kind of have this window setup for this case - up to the session timeout, which I think makes sense. The problem is in how it's implemented. It shouldn't hold up updates for that long, it should simply only accept them for that long.
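The distinction drawn in the comment above, between holding updates up until the session timeout and simply accepting them until the session timeout, can be sketched as a non-blocking check. This is only an illustration of the idea as stated in the comment, not Solr's implementation: the class `DisconnectWindow`, its methods, and the explicit `nowMs` parameters are all assumptions made for the example.

```java
// Hypothetical sketch of "accept updates for up to the session timeout
// after a ZooKeeper disconnect" as a non-blocking check. Not Solr code.
final class DisconnectWindow {
    private final long sessionTimeoutMs;
    private long disconnectedAtMs = -1; // -1 means currently connected

    DisconnectWindow(long sessionTimeoutMs) {
        this.sessionTimeoutMs = sessionTimeoutMs;
    }

    /** Record the first moment the ZooKeeper connection was seen as lost. */
    synchronized void onDisconnected(long nowMs) {
        if (disconnectedAtMs < 0) {
            disconnectedAtMs = nowMs;
        }
    }

    /** Clear the window once the connection is re-established. */
    synchronized void onConnected() {
        disconnectedAtMs = -1;
    }

    /**
     * Answers immediately instead of waiting: true while connected, or while
     * the disconnect is younger than the session timeout.
     */
    synchronized boolean acceptsUpdates(long nowMs) {
        return disconnectedAtMs < 0 || nowMs - disconnectedAtMs < sessionTimeoutMs;
    }
}
```

A caller would consult `acceptsUpdates` per update request and reject right away once the window has elapsed, so a short election like the 3.8 s one in the logs would not stall indexing under this scheme.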
[jira] [Updated] (SOLR-5577) indexing delay due to zookeeper election
[ https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-5577: -- Component/s: SolrCloud Fix Version/s: 4.7 5.0 Assignee: Mark Miller indexing delay due to zookeeper election Key: SOLR-5577 URL: https://issues.apache.org/jira/browse/SOLR-5577 Project: Solr Issue Type: Bug Components: SolrCloud Reporter: Christine Poerschke Assignee: Mark Miller Fix For: 5.0, 4.7
[jira] [Updated] (LUCENE-4072) CharFilter that Unicode-normalizes input
[ https://issues.apache.org/jira/browse/LUCENE-4072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Goldfarb updated LUCENE-4072: --- Attachment: LUCENE-4072.patch This patch dodges the use of hasBoundaryAfter, and the tests pass. Note in doTestMode there's a clause that checks if the normalized string has length zero. It seems the nfkc_cf-normalized output of some strings is empty. Examples I found: '\uDB40\uDCD9' '\uDB43\uDF86' '\uFE04' CharFilter that Unicode-normalizes input Key: LUCENE-4072 URL: https://issues.apache.org/jira/browse/LUCENE-4072 Project: Lucene - Core Issue Type: New Feature Components: modules/analysis Reporter: Ippei UKAI Attachments: DebugCode.txt, LUCENE-4072.patch, LUCENE-4072.patch, LUCENE-4072.patch, LUCENE-4072.patch, LUCENE-4072.patch, ippeiukai-ICUNormalizer2CharFilter-4752cad.zip I'd like to contribute a CharFilter that Unicode-normalizes input with ICU4J. The benefit of having this process as CharFilter is that tokenizer can work on normalised text while offset-correction ensuring fast vector highlighter and other offset-dependent features do not break. The implementation is available at following repository: https://github.com/ippeiukai/ICUNormalizer2CharFilter Unfortunately this is my unpaid side-project and cannot spend much time to merge my work to Lucene to make appropriate patch. I'd appreciate it if anyone could give it a go. I'm happy to relicense it to whatever that meets your needs. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4260) Inconsistent numDocs between leader and replica
[ https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13856002#comment-13856002 ]

Timothy Potter commented on SOLR-4260:
--

Found another interesting case that may or may not be valid, depending on whether we think HTTP requests between a leader and replica can fail even if the ZooKeeper session on the replica does not drop. Specifically, what I'm seeing is that if an update request between the leader and replica fails, but the replica doesn't lose its session with ZK, then the replica can get out of sync with the leader. In a real network partition, the ZK connection would also likely be lost and the replica would get marked as down. So as long as the HTTP connection timeout between the leader and replica exceeds the ZK client timeout, the replica would probably recover correctly, rendering this test case invalid. So maybe the main question here is whether we think it's possible for HTTP requests between a leader and replica to fail even though the ZooKeeper connection stays alive?

Here are the steps I used to reproduce this case (all using revision 1553150 in branch_4x):

*STEP 1: Set up a collection named “cloud” containing 1 shard and 2 replicas on hosts: cloud84 (127.0.0.1:8984) and cloud85 (127.0.0.1:8985)*

SOLR_TOP=/home/ec2-user/branch_4x/solr
$SOLR_TOP/cloud84/cloud-scripts/zkcli.sh -zkhost $ZK_HOST -cmd upconfig -confdir $SOLR_TOP/cloud84/solr/cloud/conf -confname cloud
API=http://localhost:8984/solr/admin/collections
curl -v "$API?action=CREATE&name=cloud&replicationFactor=2&numShards=1&collection.configName=cloud"

Replica on cloud84 is elected as the initial leader.
/clusterstate.json looks like:

{"cloud":{
    "shards":{"shard1":{
        "range":"8000-7fff",
        "state":"active",
        "replicas":{
          "core_node1":{
            "state":"active",
            "base_url":"http://cloud84:8984/solr",
            "core":"cloud_shard1_replica1",
            "node_name":"cloud84:8984_solr",
            "leader":"true"},
          "core_node2":{
            "state":"active",
            "base_url":"http://cloud85:8985/solr",
            "core":"cloud_shard1_replica2",
            "node_name":"cloud85:8985_solr"}}}},
    "maxShardsPerNode":"1",
    "router":{"name":"compositeId"},
    "replicationFactor":"2"}}

*STEP 2: Simulate network partition*

sudo iptables -I INPUT 1 -i lo -p tcp --sport 8985 -j DROP; sudo iptables -I INPUT 2 -i lo -p tcp --dport 8985 -j DROP

Various ways to do this, but to keep it simple, I'm just dropping inbound traffic on localhost to port 8985.

*STEP 3: Send document with ID “doc1” to leader on cloud84*

curl "http://localhost:8984/solr/cloud/update" -H 'Content-type:application/xml' \
  --data-binary '<add><doc><field name="id">doc1</field><field name="foo_s">bar</field></doc></add>'

The update request takes some time because the replica is down but ultimately succeeds on the leader. In the logs on the leader, we have (some stack trace lines removed for clarity):

2013-12-23 10:59:33,688 [updateExecutor-1-thread-1] ERROR solr.update.StreamingSolrServers - error
org.apache.http.conn.HttpHostConnectException: Connection to http://cloud85:8985 refused
    at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:190)
    ...
    at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:232)
    ...
Caused by: java.net.ConnectException: Connection timed out
    ...
2013-12-23 10:59:33,695 [qtp1073932139-16] INFO update.processor.LogUpdateProcessor - [cloud_shard1_replica1] webapp=/solr path=/update params={} {add=[doc1 (1455228778490888192)]} 0 63256 2013-12-23 10:59:33,702 [updateExecutor-1-thread-2] INFO update.processor.DistributedUpdateProcessor - try and ask http://cloud85:8985/solr to recover 2013-12-23 10:59:48,718 [updateExecutor-1-thread-2] ERROR update.processor.DistributedUpdateProcessor - http://cloud85:8985/solr: Could not tell a replica to recover:org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: http://cloud85:8985/solr at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:507) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:199) at org.apache.solr.update.processor.DistributedUpdateProcessor$1.run(DistributedUpdateProcessor.java:657) ... Caused by: org.apache.http.conn.ConnectTimeoutException: Connect to cloud85:8985 timed out ... Of course these log messages are expected. The key is that the leader accepted the update and now has one doc with ID doc1 STEP 4: Heal the network partition sudo service iptables restart (undoes the DROP rules we added above) * STEP 5: Send document with ID “doc2” to leader on cloud84* curl http://localhost:8984/solr/cloud/update; -H
[jira] [Commented] (SOLR-4260) Inconsistent numDocs between leader and replica
[ https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13856009#comment-13856009 ] Mark Miller commented on SOLR-4260: --- Yeah, that's currently expected. We don't expect the case where you can talk to ZooKeeper but not your replicas to be common, so we kind of punted on this scenario for the first phase. Some related JIRA issues: SOLR-5482 SOLR-5450 SOLR-5495 I think we should do all that, but the key is really, in this case, we need to pass the order to recover through ZooKeeper to the partitioned off replica. With an eventually consistent model, it can be off for a short time, but it needs to recover in a timely manner. I think this is the right solution because the replica is sure to either get the information to recover from ZooKeeper or lose it's connection to ZooKeeper in which case it will have to recover anyway. Inconsistent numDocs between leader and replica --- Key: SOLR-4260 URL: https://issues.apache.org/jira/browse/SOLR-4260 Project: Solr Issue Type: Bug Components: SolrCloud Environment: 5.0.0.2013.01.04.15.31.51 Reporter: Markus Jelsma Assignee: Mark Miller Priority: Critical Fix For: 5.0, 4.7 Attachments: 192.168.20.102-replica1.png, 192.168.20.104-replica2.png, clusterstate.png After wiping all cores and reindexing some 3.3 million docs from Nutch using CloudSolrServer we see inconsistencies between the leader and replica for some shards. Each core hold about 3.3k documents. For some reason 5 out of 10 shards have a small deviation in then number of documents. The leader and slave deviate for roughly 10-20 documents, not more. Results hopping ranks in the result set for identical queries got my attention, there were small IDF differences for exactly the same record causing a record to shift positions in the result set. During those tests no records were indexed. Consecutive catch all queries also return different number of numDocs. 
We're running a 10 node test cluster with 10 shards and a replication factor of two and frequently reindex using a fresh build from trunk. I've not seen this issue for quite some time until a few days ago. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4260) Inconsistent numDocs between leader and replica
[ https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13856013#comment-13856013 ] Mark Miller commented on SOLR-4260: --- bq. so we kind of punted The other thing to note is that if you restart the shard or that node or the cluster, you should be able to do it without losing any data. It will recover from the leader when everything else is working correctly. Inconsistent numDocs between leader and replica --- Key: SOLR-4260 URL: https://issues.apache.org/jira/browse/SOLR-4260 Project: Solr Issue Type: Bug Components: SolrCloud Environment: 5.0.0.2013.01.04.15.31.51 Reporter: Markus Jelsma Assignee: Mark Miller Priority: Critical Fix For: 5.0, 4.7 Attachments: 192.168.20.102-replica1.png, 192.168.20.104-replica2.png, clusterstate.png After wiping all cores and reindexing some 3.3 million docs from Nutch using CloudSolrServer we see inconsistencies between the leader and replica for some shards. Each core hold about 3.3k documents. For some reason 5 out of 10 shards have a small deviation in then number of documents. The leader and slave deviate for roughly 10-20 documents, not more. Results hopping ranks in the result set for identical queries got my attention, there were small IDF differences for exactly the same record causing a record to shift positions in the result set. During those tests no records were indexed. Consecutive catch all queries also return different number of numDocs. We're running a 10 node test cluster with 10 shards and a replication factor of two and frequently reindex using a fresh build from trunk. I've not seen this issue for quite some time until a few days ago. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5564) hl.maxAlternateFieldLength should apply to original field when fallback field does not exist
[ https://issues.apache.org/jira/browse/SOLR-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Høydahl updated SOLR-5564: -- Summary: hl.maxAlternateFieldLength should apply to original field when fallback field does not exist (was: hl.maxAlternateFieldLength should apply to original field when fallback is attempted) hl.maxAlternateFieldLength should apply to original field when fallback field does not exist Key: SOLR-5564 URL: https://issues.apache.org/jira/browse/SOLR-5564 Project: Solr Issue Type: Bug Components: highlighter Reporter: Jan Høydahl Fix For: 5.0, 4.7, 4.6.1 Attachments: SOLR-5564.patch For a customer we use {{f.body.hl.alternateField=teaserhl.maxAlternateFieldLength=100}} But some articles do not have the teaser field filled at all, so for queries that do not match the body, we get the full huge body returned in the frontend. If the highlighter has tried to fallback to the alternateField, then hl.maxAlternateFieldLength should always apply, even to text from the original field if alternateFIeld does not exist. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
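The behaviour requested in SOLR-5564 can be sketched roughly as below. This is only an illustration under stated assumptions, not Solr's highlighter code and not the attached patch: the class `AlternateFieldFallback`, the method `fallbackSnippet`, and the plain `Map` standing in for a document are hypothetical, and treating a non-positive length as "no limit" is an assumption of the sketch. Only the parameter names `hl.alternateField` and `hl.maxAlternateFieldLength` come from the issue.

```java
import java.util.HashMap;
import java.util.Map;

final class AlternateFieldFallback {
    /**
     * Hypothetical helper: chooses the text to return when highlighting
     * produced no snippet for originalField.
     *
     * @param maxLength stands in for hl.maxAlternateFieldLength;
     *                  values <= 0 mean "no limit" in this sketch
     */
    static String fallbackSnippet(Map<String, String> doc, String originalField,
                                  String alternateField, int maxLength) {
        String text = doc.get(alternateField);
        if (text == null || text.isEmpty()) {
            // Alternate field is absent: use the original field instead,
            // but keep going so the length limit below still applies.
            text = doc.get(originalField);
        }
        if (text == null) {
            return null;
        }
        if (maxLength > 0 && text.length() > maxLength) {
            // Plain char-based cut; a real implementation may want to avoid
            // splitting surrogate pairs or words.
            return text.substring(0, maxLength);
        }
        return text;
    }

    public static void main(String[] args) {
        Map<String, String> doc = new HashMap<>();
        doc.put("body", "a very long article body that should not be returned in full");
        // No "teaser" field on this document, so the body is used and truncated.
        System.out.println(fallbackSnippet(doc, "body", "teaser", 20));
    }
}
```

The design point is simply that the truncation step sits after the choice of field, so it applies no matter which field supplied the text.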
[jira] [Commented] (LUCENE-5369) Add an UpperCaseFilter
[ https://issues.apache.org/jira/browse/LUCENE-5369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13856049#comment-13856049 ]

Ryan McKinley commented on LUCENE-5369:
---

Unless I hear objections, I would like to commit this in the next few weeks.

Thanks,
ryan

Add an UpperCaseFilter
--
Key: LUCENE-5369 URL: https://issues.apache.org/jira/browse/LUCENE-5369 Project: Lucene - Core Issue Type: New Feature Reporter: Ryan McKinley Assignee: Ryan McKinley Priority: Minor Attachments: LUCENE-5369-uppercase-filter.patch

We should offer a standard way to force upper-case tokens. I understand that lowercase is safer for general search quality because some uppercase characters can represent multiple lowercase ones. However, having upper-case tokens is often nice for faceting (consider normalizing to standard acronyms).
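The remark that one uppercase character can stand for several lowercase ones can be checked with the JDK alone. The sketch below is unrelated to the attached patch; the class `UpperCaseCollisions` and the method `collide` are names made up for the example. It uses the two Greek lowercase sigmas, which share a single uppercase form.

```java
final class UpperCaseCollisions {
    /** True if two different chars map to the same upper-case char. */
    static boolean collide(char a, char b) {
        return a != b && Character.toUpperCase(a) == Character.toUpperCase(b);
    }

    public static void main(String[] args) {
        char finalSigma = '\u03C2'; // GREEK SMALL LETTER FINAL SIGMA
        char sigma = '\u03C3';      // GREEK SMALL LETTER SIGMA

        // Both lowercase letters upper-case to GREEK CAPITAL LETTER SIGMA.
        System.out.println(collide(finalSigma, sigma));

        // Lower-casing the shared capital gives back only one of the two,
        // so the final-sigma distinction is lost after upper-casing.
        System.out.println(Character.toLowerCase(Character.toUpperCase(finalSigma)) == sigma);
    }
}
```

Both lines print `true`: once tokens are upper-cased, the two original spellings can no longer be told apart.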
[jira] [Commented] (SOLR-4260) Inconsistent numDocs between leader and replica
[ https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13856052#comment-13856052 ] Timothy Potter commented on SOLR-4260: -- Thanks Mark, I suspected my test case was a little cherry picked ... something interesting happened when I also severed the connection between the replica and ZK (ie. same test as above but I also dropped the ZK connection on the replica). 2013-12-23 15:39:57,170 [main-EventThread] INFO common.cloud.ConnectionManager - Watcher org.apache.solr.common.cloud.ConnectionManager@4f857c62 name:ZooKeeperConnection Watcher:ec2-54-197-0-103.compute-1.amazonaws.com:2181 got event WatchedEvent state:Disconnected type:None path:null path:null type:None 2013-12-23 15:39:57,170 [main-EventThread] INFO common.cloud.ConnectionManager - zkClient has disconnected fixed the connection between replica and ZK here 2013-12-23 15:40:45,579 [main-EventThread] INFO common.cloud.ConnectionManager - Watcher org.apache.solr.common.cloud.ConnectionManager@4f857c62 name:ZooKeeperConnection Watcher:ec2-54-197-0-103.compute-1.amazonaws.com:2181 got event WatchedEvent state:Expired type:None path:null path:null type:None 2013-12-23 15:40:45,579 [main-EventThread] INFO common.cloud.ConnectionManager - Our previous ZooKeeper session was expired. Attempting to reconnect to recover relationship with ZooKeeper... 2013-12-23 15:40:45,580 [main-EventThread] INFO common.cloud.DefaultConnectionStrategy - Connection expired - starting a new one... 
2013-12-23 15:40:45,586 [main-EventThread] INFO common.cloud.ConnectionManager - Waiting for client to connect to ZooKeeper 2013-12-23 15:40:45,595 [main-EventThread] INFO common.cloud.ConnectionManager - Watcher org.apache.solr.common.cloud.ConnectionManager@4f857c62 name:ZooKeeperConnection Watcher:ec2-54-197-0-103.compute-1.amazonaws.com:2181 got event WatchedEvent state:SyncConnected type:None path:null path:null type:None 2013-12-23 15:40:45,595 [main-EventThread] INFO common.cloud.ConnectionManager - Client is connected to ZooKeeper 2013-12-23 15:40:45,595 [main-EventThread] INFO common.cloud.ConnectionManager - Connection with ZooKeeper reestablished. 2013-12-23 15:40:45,596 [main-EventThread] WARN solr.cloud.RecoveryStrategy - Stopping recovery for zkNodeName=core_node3core=cloud_shard1_replica3 2013-12-23 15:40:45,597 [main-EventThread] INFO solr.cloud.ZkController - publishing core=cloud_shard1_replica3 state=down 2013-12-23 15:40:45,597 [main-EventThread] INFO solr.cloud.ZkController - numShards not found on descriptor - reading it from system property 2013-12-23 15:40:45,905 [qtp2124890785-14] INFO handler.admin.CoreAdminHandler - It has been requested that we recover 2013-12-23 15:40:45,906 [qtp2124890785-14] INFO solr.servlet.SolrDispatchFilter - [admin] webapp=null path=/admin/cores params={action=REQUESTRECOVERYcore=cloud_shard1_replica3wt=javabinversion=2} status=0 QTime=2 2013-12-23 15:40:45,909 [Thread-17] INFO solr.cloud.ZkController - publishing core=cloud_shard1_replica3 state=recovering 2013-12-23 15:40:45,909 [Thread-17] INFO solr.cloud.ZkController - numShards not found on descriptor - reading it from system property 2013-12-23 15:40:45,920 [Thread-17] INFO solr.update.DefaultSolrCoreState - Running recovery - first canceling any ongoing recovery 2013-12-23 15:40:45,921 [RecoveryThread] INFO solr.cloud.RecoveryStrategy - Starting recovery process. 
core=cloud_shard1_replica3 recoveringAfterStartup=false 2013-12-23 15:40:45,924 [RecoveryThread] INFO solr.cloud.ZkController - publishing core=cloud_shard1_replica3 state=recovering 2013-12-23 15:40:45,924 [RecoveryThread] INFO solr.cloud.ZkController - numShards not found on descriptor - reading it from system property 2013-12-23 15:40:48,613 [qtp2124890785-15] INFO solr.core.SolrCore - [cloud_shard1_replica3] webapp=/solr path=/select params={q=foo_s:bardistrib=falsewt=jsonrows=0} hits=0 status=0 QTime=1 2013-12-23 15:42:42,770 [qtp2124890785-13] INFO solr.core.SolrCore - [cloud_shard1_replica3] webapp=/solr path=/select params={q=foo_s:bardistrib=falsewt=jsonrows=0} hits=0 status=0 QTime=1 2013-12-23 15:42:45,650 [main-EventThread] ERROR solr.cloud.ZkController - There was a problem making a request to the leader:org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: I was asked to wait on state down for cloud86:8986_solr but I still do not see the requested state. I see state: recovering live:false at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:495) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:199) at org.apache.solr.cloud.ZkController.waitForLeaderToSeeDownState(ZkController.java:1434) at org.apache.solr.cloud.ZkController.registerAllCoresAsDown(ZkController.java:347)
[jira] [Commented] (SOLR-5552) Leader recovery process can select the wrong leader if all replicas for a shard are down and trying to recover as well as lose updates that should have been recovered.
[ https://issues.apache.org/jira/browse/SOLR-5552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13856062#comment-13856062 ] Timothy Potter commented on SOLR-5552: -- Glad it was helpful even though my patch was crap ;-) I'll test against trunk in my env as well. Thanks. Leader recovery process can select the wrong leader if all replicas for a shard are down and trying to recover as well as lose updates that should have been recovered. --- Key: SOLR-5552 URL: https://issues.apache.org/jira/browse/SOLR-5552 Project: Solr Issue Type: Bug Components: SolrCloud Reporter: Timothy Potter Assignee: Mark Miller Priority: Critical Labels: leader, recovery Fix For: 5.0, 4.7, 4.6.1 Attachments: SOLR-5552.patch, SOLR-5552.patch One particular issue that leads to out-of-sync shards, related to SOLR-4260 Here's what I know so far, which admittedly isn't much: As cloud85 (replica before it crashed) is initializing, it enters the wait process in ShardLeaderElectionContext#waitForReplicasToComeUp; this is expected and a good thing. Some short amount of time in the future, cloud84 (leader before it crashed) begins initializing and gets to a point where it adds itself as a possible leader for the shard (by creating a znode under /collections/cloud/leaders_elect/shard1/election), which leads to cloud85 being able to return from waitForReplicasToComeUp and try to determine who should be the leader. cloud85 then tries to run the SyncStrategy, which can never work because in this scenario the Jetty HTTP listener is not active yet on either node, so all replication work that uses HTTP requests fails on both nodes ... PeerSync treats these failures as indicators that the other replicas in the shard are unavailable (or whatever) and assumes success. 
Here's the log message: 2013-12-11 11:43:25,936 [coreLoadExecutor-3-thread-1] WARN solr.update.PeerSync - PeerSync: core=cloud_shard1_replica1 url=http://cloud85:8985/solr couldn't connect to http://cloud84:8984/solr/cloud_shard1_replica2/, counting as success The Jetty HTTP listener doesn't start accepting connections until long after this process has completed and already selected the wrong leader. From what I can see, we seem to have a leader recovery process that is based partly on HTTP requests to the other nodes, but the HTTP listener on those nodes isn't active yet. We need a leader recovery process that doesn't rely on HTTP requests. Perhaps, leader recovery for a shard w/o a current leader may need to work differently than leader election in a shard that has replicas that can respond to HTTP requests? All of what I'm seeing makes perfect sense for leader election when there are active replicas and the current leader fails. All this aside, I'm not asserting that this is the only cause for the out-of-sync issues reported in this ticket, but it definitely seems like it could happen in a real cluster. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4260) Inconsistent numDocs between leader and replica
[ https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13856078#comment-13856078 ]

Mark Miller commented on SOLR-4260:
---

That's interesting. The logging makes it look like it's not creating its new ephemeral live node for some reason...or the leader is not getting an updated view of the live node...

Inconsistent numDocs between leader and replica
---
Key: SOLR-4260 URL: https://issues.apache.org/jira/browse/SOLR-4260 Project: Solr Issue Type: Bug Components: SolrCloud Environment: 5.0.0.2013.01.04.15.31.51 Reporter: Markus Jelsma Assignee: Mark Miller Priority: Critical Fix For: 5.0, 4.7 Attachments: 192.168.20.102-replica1.png, 192.168.20.104-replica2.png, clusterstate.png

After wiping all cores and reindexing some 3.3 million docs from Nutch using CloudSolrServer we see inconsistencies between the leader and replica for some shards. Each core holds about 3.3k documents. For some reason 5 out of 10 shards have a small deviation in the number of documents. The leader and slave deviate by roughly 10-20 documents, not more. Results hopping ranks in the result set for identical queries got my attention: there were small IDF differences for exactly the same record, causing a record to shift positions in the result set. During those tests no records were indexed. Consecutive catch-all queries also return different numDocs. We're running a 10 node test cluster with 10 shards and a replication factor of two and frequently reindex using a fresh build from trunk. I've not seen this issue for quite some time until a few days ago.
[jira] [Updated] (LUCENE-5372) IntArray toString has O(n^2) performance
[ https://issues.apache.org/jira/browse/LUCENE-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joshua Hartman updated LUCENE-5372:
---
Attachment: 5372-lucene5339.patch 5372-v2.patch

Uploaded versions of the patch for both the trunk and lucene5339 branches that replace StringBuffer with StringBuilder. Due to API constraints it is not possible to do so in all cases. Mike was also correct: the code related to the specific O(n^2) issue this JIRA was started for no longer exists on lucene5339.

IntArray toString has O(n^2) performance

Key: LUCENE-5372 URL: https://issues.apache.org/jira/browse/LUCENE-5372 Project: Lucene - Core Issue Type: Bug Components: core/index Reporter: Joshua Hartman Assignee: Dawid Weiss Priority: Minor Fix For: 5.0, 4.7 Attachments: 5372-lucene5339.patch, 5372-v2.patch, 5372.patch, LUCENE-5372-forbidden.patch

This is pretty minor, but I found a few issues with the toString implementations while looking through the facet data structures. The most egregious is the use of string concatenation in the IntArray class. I have fixed that using StringBuilders. I also noticed that other classes were using StringBuffer instead of StringBuilder. According to the StringBuilder javadoc, "This class is designed for use as a drop-in replacement for StringBuffer in places where the string buffer was being used by a single thread (as is generally the case). Where possible, it is recommended that this class be used in preference to StringBuffer as it will be faster under most implementations."
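For readers unfamiliar with why concatenation in a loop is quadratic, here is a minimal sketch. It is not the Lucene IntArray class or the attached patch: the class `IntListToString`, both method names, and the "[1, 2, 3]" output format are assumptions made for the illustration.

```java
final class IntListToString {
    /** Quadratic: every += allocates a new String and copies everything built so far. */
    static String slow(int[] data, int size) {
        String s = "[";
        for (int i = 0; i < size; i++) {
            if (i > 0) {
                s += ", ";
            }
            s += data[i];
        }
        return s + "]";
    }

    /** Linear (amortized): the builder appends into one growing buffer. */
    static String fast(int[] data, int size) {
        StringBuilder sb = new StringBuilder(size * 4 + 2);
        sb.append('[');
        for (int i = 0; i < size; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(data[i]);
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 0, 0}; // only the first 3 slots are in use
        System.out.println(slow(data, 3));
        System.out.println(fast(data, 3));
    }
}
```

Both methods produce the same string; only the amount of copying differs. StringBuilder is used rather than StringBuffer because the buffer never leaves the method, so no synchronization is needed.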
[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #1059: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/1059/ 1 tests failed. FAILED: org.apache.solr.cloud.BasicDistributedZk2Test.testDistribSearch Error Message: No live SolrServers available to handle this request:[http://127.0.0.1:15369/ky_kg/collection1, http://127.0.0.1:12475/ky_kg/collection1] Stack Trace: org.apache.solr.client.solrj.SolrServerException: No live SolrServers available to handle this request:[http://127.0.0.1:15369/ky_kg/collection1, http://127.0.0.1:12475/ky_kg/collection1] at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:495) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:199) at org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(LBHttpSolrServer.java:283) at org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:640) at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:90) at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:301) at org.apache.solr.cloud.AbstractFullDistribZkTestBase.queryServer(AbstractFullDistribZkTestBase.java:1325) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:542) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:521) at org.apache.solr.cloud.BasicDistributedZk2Test.brindDownShardIndexSomeDocsAndRecover(BasicDistributedZk2Test.java:305) at org.apache.solr.cloud.BasicDistributedZk2Test.doTest(BasicDistributedZk2Test.java:117) Build Log: [...truncated 52608 lines...] [mvn] [INFO] - [mvn] [INFO] - [mvn] [ERROR] COMPILATION ERROR : [mvn] [INFO] - [...truncated 279 lines...] 
BUILD FAILED /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/build.xml:476: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/build.xml:176: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/extra-targets.xml:77: Java returned: 1 Total time: 111 minutes 23 seconds Build step 'Invoke Ant' marked build as failure Recording test results Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-5377) Lucene mixed index segments cause segment info file(.si) unversioned
Littlestar created LUCENE-5377:
--
Summary: Lucene mixed index segments cause segment info file(.si) unversioned Key: LUCENE-5377 URL: https://issues.apache.org/jira/browse/LUCENE-5377 Project: Lucene - Core Issue Type: Bug Components: core/index Affects Versions: 4.6 Environment: windows/linux Reporter: Littlestar

My old facet index, created with Lucene 4.2, passes indexChecker. After upgrading to Lucene 4.6 and adding some new records, I reopened the index and some .si files were missing from the index directory. Debugging into it, the new-format segments.gen (segments_N) appears to record bad segment info.
[jira] [Commented] (LUCENE-5377) Lucene mixed index segments cause segment info file(.si) unversioned
[ https://issues.apache.org/jira/browse/LUCENE-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13856110#comment-13856110 ] Littlestar commented on LUCENE-5377: Lucene 4.5/4.5.1 is ok. Lucene mixed index segments cause segment info file(.si) unversioned Key: LUCENE-5377 URL: https://issues.apache.org/jira/browse/LUCENE-5377 Project: Lucene - Core Issue Type: Bug Components: core/index Affects Versions: 4.6 Environment: windows/linux Reporter: Littlestar my old facet index create by Lucene version=4.2 use indexChecker ok. now I upgrade to Lucene 4.6 and put some new records to index. then reopen index, some files in indexdir missing no .si files. I debug into it, new version format of segments.gen(segments_N) record bad segments info. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (LUCENE-5377) Lucene mixed index segments cause segment info file(.si) unversioned
[ https://issues.apache.org/jira/browse/LUCENE-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13856110#comment-13856110 ] Littlestar edited comment on LUCENE-5377 at 12/24/13 4:07 AM: -- Lucene 4.5/4.5.1 is ok. but failed in 4.6.0 was (Author: cnstar9988): Lucene 4.5/4.5.1 is ok. Lucene mixed index segments cause segment info file(.si) unversioned Key: LUCENE-5377 URL: https://issues.apache.org/jira/browse/LUCENE-5377 Project: Lucene - Core Issue Type: Bug Components: core/index Affects Versions: 4.6 Environment: windows/linux Reporter: Littlestar my old facet index create by Lucene version=4.2 use indexChecker ok. now I upgrade to Lucene 4.6 and put some new records to index. then reopen index, some files in indexdir missing no .si files. I debug into it, new version format of segments.gen(segments_N) record bad segments info. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5377) Lucene mixed version segments cause segment info file(.si) wrong
[ https://issues.apache.org/jira/browse/LUCENE-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Littlestar updated LUCENE-5377: --- Summary: Lucene mixed version segments cause segment info file(.si) wrong (was: Lucene mixed index segments cause segment info file(.si) unversioned) Lucene mixed version segments cause segment info file(.si) wrong Key: LUCENE-5377 URL: https://issues.apache.org/jira/browse/LUCENE-5377 Project: Lucene - Core Issue Type: Bug Components: core/index Affects Versions: 4.6 Environment: windows/linux Reporter: Littlestar my old facet index create by Lucene version=4.2 use indexChecker ok. now I upgrade to Lucene 4.6 and put some new records to index. then reopen index, some files in indexdir missing no .si files. I debug into it, new version format of segments.gen(segments_N) record bad segments info. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-4072) CharFilter that Unicode-normalizes input
[ https://issues.apache.org/jira/browse/LUCENE-4072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-4072:

Attachment: LUCENE-4072.patch

Thanks so much for attacking this, David. I think that 0-length, all-default-ignorables case makes sense (where it creates an empty string), because in that case there won't be a single token at all (MockTokenizer is not a perfect emulator of KeywordTokenizer here).

I think this patch is close, but when running the test a few hundred times I hit a failure (see my added testCuriousString, which fails). I think this one is a bug in the logic. Motivated by this failure, I tried to beef up tests in general:
* fixed my typo where testNFD wasn't actually testing NFD
* test strings > 20 characters, since this filter has an internal 128-char buffer.

The latter seems to expose a lot of bugs, I assume due to the internal buffering. I haven't yet looked into this. But it seems there are correctness issues for documents > 128 chars (as well as what I believe is a separate bug seen by testCuriousString, which I think is some bug in the logic related to ignorables).

CharFilter that Unicode-normalizes input

Key: LUCENE-4072 URL: https://issues.apache.org/jira/browse/LUCENE-4072 Project: Lucene - Core Issue Type: New Feature Components: modules/analysis Reporter: Ippei UKAI Attachments: DebugCode.txt, LUCENE-4072.patch, LUCENE-4072.patch, LUCENE-4072.patch, LUCENE-4072.patch, LUCENE-4072.patch, LUCENE-4072.patch, ippeiukai-ICUNormalizer2CharFilter-4752cad.zip

I'd like to contribute a CharFilter that Unicode-normalizes input with ICU4J. The benefit of having this process as CharFilter is that tokenizer can work on normalised text while offset-correction ensuring fast vector highlighter and other offset-dependent features do not break.
The implementation is available at following repository: https://github.com/ippeiukai/ICUNormalizer2CharFilter Unfortunately this is my unpaid side-project and cannot spend much time to merge my work to Lucene to make appropriate patch. I'd appreciate it if anyone could give it a go. I'm happy to relicense it to whatever that meets your needs. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4072) CharFilter that Unicode-normalizes input
[ https://issues.apache.org/jira/browse/LUCENE-4072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13856140#comment-13856140 ] Robert Muir commented on LUCENE-4072: OK, as for the testCuriousString bug: I enabled verbose output (ant test -Dtestcase=TestICUNormalizer2CharFilter -Dtestmethod=testCuriousString -Dtests.verbose=true) and it seems to always fail when given a spoon-fed Reader. So I'll dig into this one; I think it involves how this CharFilter consumes the Reader API.
[jira] [Commented] (LUCENE-4072) CharFilter that Unicode-normalizes input
[ https://issues.apache.org/jira/browse/LUCENE-4072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13856149#comment-13856149 ] Robert Muir commented on LUCENE-4072: One thing that certainly looks like a bug is this. The input-processing side looks like this in pseudocode:
{code}
while (read() some char[]s) {
  normalize(char[]s) // (quick check/hasBoundary/etc)
}
{code}
But read() works at the char level, and these normalization APIs want ints (code points). So I think readInputToBuffer() needs to keep reading, if possible, to ensure it fully consumes whole code points before returning. I added a little hack locally, but it didn't seem to clean up the test failures, so I think there are other bugs too, or I'm missing something?
{code}
private int readInputToBuffer() throws IOException {
  final int len = input.read(tmpBuffer);
  if (len == -1) {
    inputFinished = true;
    return 0;
  }
  inputBuffer.append(tmpBuffer, 0, len);
  // nocommit: just a hack
  // if buffer ends on high surrogate, keep reading before processing
  if (len > 0 && Character.isHighSurrogate(tmpBuffer[len-1])) {
    return len + readInputToBuffer();
  }
  // end hack
  return len;
}
{code}
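The "keep reading until the buffer no longer ends on a high surrogate" idea can be tried outside Lucene with a spoon-feeding Reader, which is the condition under which testCuriousString was reported to fail. What follows is a minimal standalone sketch, not the patch itself: SpoonFeedReader and readWholeCodePoints are hypothetical names, the loop replaces the recursion of the original hack, and it returns -1 at end of input where the original sets an inputFinished flag.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class SurrogateSafeRead {

    /** Hypothetical test helper: hands out at most one char per read() call. */
    static final class SpoonFeedReader extends Reader {
        private final Reader in;

        SpoonFeedReader(Reader in) {
            this.in = in;
        }

        @Override
        public int read(char[] cbuf, int off, int len) throws IOException {
            return in.read(cbuf, off, Math.min(len, 1));
        }

        @Override
        public void close() throws IOException {
            in.close();
        }
    }

    /**
     * Appends chars from input to inputBuffer and keeps reading while the last
     * char read is a high surrogate, so the caller never sees half a pair
     * (unless the input itself ends on an unpaired high surrogate).
     * Returns the number of chars appended, or -1 if input was already at EOF.
     */
    static int readWholeCodePoints(Reader input, StringBuilder inputBuffer, char[] tmpBuffer)
            throws IOException {
        int total = 0;
        while (true) {
            int len = input.read(tmpBuffer);
            if (len == -1) {
                return total == 0 ? -1 : total;
            }
            inputBuffer.append(tmpBuffer, 0, len);
            total += len;
            if (len == 0 || !Character.isHighSurrogate(tmpBuffer[len - 1])) {
                return total;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // 'a', then U+1F600 encoded as a surrogate pair, then 'b'.
        String text = "a\uD83D\uDE00b";
        Reader reader = new SpoonFeedReader(new StringReader(text));
        StringBuilder buffer = new StringBuilder();
        char[] tmp = new char[128];
        int n;
        while ((n = readWholeCodePoints(reader, buffer, tmp)) != -1) {
            System.out.println("read " + n + " char(s), buffer length " + buffer.length());
        }
    }
}
```

As the comment above notes, this alone did not make the tests pass, so it should be read as covering only the surrogate-splitting case. Combining sequences that straddle two reads are a separate concern.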
[jira] [Commented] (SOLR-5552) Leader recovery process can select the wrong leader if all replicas for a shard are down and trying to recover as well as lose updates that should have been recovered.
[ https://issues.apache.org/jira/browse/SOLR-5552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13856159#comment-13856159 ] Timothy Potter commented on SOLR-5552: -- Ran my manual test process on trunk and could not reproduce the out-of-sync issue! From the logs, the recovery process definitely starts after the HTTP listener is up. Looking good on trunk. Leader recovery process can select the wrong leader if all replicas for a shard are down and trying to recover as well as lose updates that should have been recovered. --- Key: SOLR-5552 URL: https://issues.apache.org/jira/browse/SOLR-5552 Project: Solr Issue Type: Bug Components: SolrCloud Reporter: Timothy Potter Assignee: Mark Miller Priority: Critical Labels: leader, recovery Fix For: 5.0, 4.7, 4.6.1 Attachments: SOLR-5552.patch, SOLR-5552.patch One particular issue that leads to out-of-sync shards, related to SOLR-4260 Here's what I know so far, which admittedly isn't much: As cloud85 (replica before it crashed) is initializing, it enters the wait process in ShardLeaderElectionContext#waitForReplicasToComeUp; this is expected and a good thing. Some short amount of time in the future, cloud84 (leader before it crashed) begins initializing and gets to a point where it adds itself as a possible leader for the shard (by creating a znode under /collections/cloud/leaders_elect/shard1/election), which leads to cloud85 being able to return from waitForReplicasToComeUp and try to determine who should be the leader. cloud85 then tries to run the SyncStrategy, which can never work because in this scenario the Jetty HTTP listener is not active yet on either node, so all replication work that uses HTTP requests fails on both nodes ... PeerSync treats these failures as indicators that the other replicas in the shard are unavailable (or whatever) and assumes success. 
Here's the log message: 2013-12-11 11:43:25,936 [coreLoadExecutor-3-thread-1] WARN solr.update.PeerSync - PeerSync: core=cloud_shard1_replica1 url=http://cloud85:8985/solr couldn't connect to http://cloud84:8984/solr/cloud_shard1_replica2/, counting as success The Jetty HTTP listener doesn't start accepting connections until long after this process has completed and already selected the wrong leader. From what I can see, we seem to have a leader recovery process that is based partly on HTTP requests to the other nodes, but the HTTP listener on those nodes isn't active yet. We need a leader recovery process that doesn't rely on HTTP requests. Perhaps, leader recovery for a shard w/o a current leader may need to work differently than leader election in a shard that has replicas that can respond to HTTP requests? All of what I'm seeing makes perfect sense for leader election when there are active replicas and the current leader fails. All this aside, I'm not asserting that this is the only cause for the out-of-sync issues reported in this ticket, but it definitely seems like it could happen in a real cluster.
[jira] [Commented] (LUCENE-5367) NoSuchElementException occurs when org.apache.lucene.facet.index.FacetFields is used.
[ https://issues.apache.org/jira/browse/LUCENE-5367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13856187#comment-13856187 ] Shai Erera commented on LUCENE-5367: Lucien, do you have a testcase which reproduces the error? If not, I'll close the issue. NoSuchElementException occurs when org.apache.lucene.facet.index.FacetFields is used. Key: LUCENE-5367 URL: https://issues.apache.org/jira/browse/LUCENE-5367 Project: Lucene - Core Issue Type: Bug Components: modules/facet Affects Versions: 4.2.1, 4.6 Reporter: Lucien Pereira Hi, When I use the API as below:
{code}
List<CategoryPath> categories = Collections.<CategoryPath>singletonList(new CategoryPath(path.toArray(new String[path.size()])));
FacetFields facetFields = new FacetFields(taxonomyWriter);
facetFields.addFields(document, categories);
taxonomyWriter.commit();
{code}
An exception occurs:
{quote}
java.util.NoSuchElementException
at java.util.Collections$1.next(Collections.java:3302)
at org.apache.lucene.facet.index.DrillDownStream.reset(DrillDownStream.java:78)
at org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:97)
at org.apache.lucene.index.DocFieldProcessor.processDocument(DocFieldProcessor.java:248)
at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:253)
at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:453)
at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1520)
at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1190)
at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1171)
{quote}
It seems this is due to multiple calls to org.apache.lucene.facet.index.DrillDownStream#reset, which invokes #next() on an already 'used' iterator.
Regards, Lucien
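The reporter's hypothesis (a second reset() calling next() on an iterator that was already consumed) is easy to reproduce with plain JDK collections. The sketch below is not Lucene code: OneShotStream is a hypothetical stand-in that, like the stack trace suggests for DrillDownStream, pulls one element from a stored iterator each time reset() is called.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class UsedIteratorDemo {

    /** Hypothetical stand-in: keeps one iterator and advances it in reset(). */
    static final class OneShotStream {
        private final Iterator<String> it;

        OneShotStream(Iterable<String> values) {
            this.it = values.iterator();
        }

        /** Consumes the next element; throws if the iterator is exhausted. */
        String reset() {
            return it.next();
        }
    }

    /** Returns true if calling reset() twice on a one-element source throws. */
    static boolean secondResetThrows() {
        List<String> categories = Collections.singletonList("a/b");
        OneShotStream stream = new OneShotStream(categories);
        stream.reset(); // first call succeeds and exhausts the iterator
        try {
            stream.reset(); // second call hits the 'used' iterator
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("second reset() throws: " + secondResetThrows());
    }
}
```

This only shows that the failure mode is plausible for a singleton list. Whether and why reset() runs twice in the indexing chain would still need the reproducing test case requested above.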
[jira] [Commented] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module
[ https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13856194#comment-13856194 ] rashi gandhi commented on LUCENE-2899: Hi, I have successfully applied LUCENE-2899.patch to Solr 4.5.1 and it's working properly. Now my requirement is to combine OpenNLP with jwnl. Is it possible to combine OpenNLP with jwnl, and what changes are required in the Solr schema.xml for this? Kindly provide some pointers to move ahead. Thanks in advance. Add OpenNLP Analysis capabilities as a module Key: LUCENE-2899 URL: https://issues.apache.org/jira/browse/LUCENE-2899 Project: Lucene - Core Issue Type: New Feature Components: modules/analysis Reporter: Grant Ingersoll Assignee: Grant Ingersoll Priority: Minor Fix For: 4.7 Attachments: LUCENE-2899-RJN.patch, LUCENE-2899.patch, OpenNLPFilter.java, OpenNLPTokenizer.java Now that OpenNLP is an ASF project and has a nice license, it would be nice to have a submodule (under analysis) that exposed capabilities for it. Drew Farris, Tom Morton and I have code that does: * Sentence Detection as a Tokenizer (could also be a TokenFilter, although it would have to change slightly to buffer tokens) * NamedEntity recognition as a TokenFilter We are also planning a Tokenizer/TokenFilter that can put parts of speech as either payloads (PartOfSpeechAttribute?) on a token or at the same position. I'd propose it go under: modules/analysis/opennlp