[jira] [Commented] (LUCENE-5375) ToChildBlockJoinQuery becomes crazy on wrong subquery

2013-12-23 Thread Dr Oleg Savrasov (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855493#comment-13855493
 ] 

Dr Oleg Savrasov commented on LUCENE-5375:
--

Hi Michael,

Many thanks for reviewing the patch. I agree that it's really a Lucene issue and 
should be covered by appropriate tests.

I see your point about adding cost for correct usage. If I enable assertions 
without adding validateParents, testAdvanceValidationForToChildBjq always 
fails, which suggests there may be another way to validate the query. Let me 
investigate it.

Thank you,

Dr Oleg Savrasov,
Community Coordinator,
Grid Dynamics Search team

 ToChildBlockJoinQuery becomes crazy on wrong subquery
 -

 Key: LUCENE-5375
 URL: https://issues.apache.org/jira/browse/LUCENE-5375
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/join
Affects Versions: 4.6
Reporter: Dr Oleg Savrasov
  Labels: patch
 Attachments: SOLR-5553.patch

   Original Estimate: 24h
  Remaining Estimate: 24h

 If a user supplies a wrong subquery to ToParentBlockJoinQuery, it reasonably 
 throws IllegalStateException 
 (http://lucene.apache.org/core/4_0_0/join/org/apache/lucene/search/join/ToParentBlockJoinQuery.html
  'The child documents must be orthogonal to the parent documents: the wrapped 
 child query must never return a parent document.'). However, 
 ToChildBlockJoinQuery just silently misbehaves. I want to provide a simple 
 patch for ToChildBlockJoinQuery with an if-throw clause and a test.
 See 
 http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201311.mbox/%3cf415ce3a-ebe5-4d15-adf1-c5ead32a1...@sheffield.ac.uk%3E
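
For illustration, the if-throw check proposed in the issue could look roughly like the sketch below. It is plain Java, not Lucene code: the class name, the childrenOf helper, and the use of java.util.BitSet to mark parent documents are all assumptions made for this example. It assumes the usual block layout, where each parent document directly follows its children.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Illustrative stand-in for the block-join layout; not the actual Lucene patch.
final class ChildJoinValidationSketch {

    /**
     * Expands each document matched by the wrapped (parent) query into the
     * doc IDs of its children. parentBits marks every parent document; the
     * children of a parent are the documents between the previous parent
     * and that parent, both exclusive.
     */
    static List<Integer> childrenOf(BitSet parentBits, int[] matchedByParentQuery) {
        List<Integer> children = new ArrayList<>();
        for (int doc : matchedByParentQuery) {
            // The if-throw check: the wrapped query must only match parents.
            if (!parentBits.get(doc)) {
                throw new IllegalStateException(
                    "wrapped query matched non-parent document " + doc);
            }
            // previousSetBit(-1) returns -1, so doc 0 is handled as well.
            int firstChild = parentBits.previousSetBit(doc - 1) + 1;
            for (int child = firstChild; child < doc; child++) {
                children.add(child);
            }
        }
        return children;
    }

    public static void main(String[] args) {
        BitSet parents = new BitSet();
        parents.set(2); // block 1: children 0 and 1, parent 2
        parents.set(5); // block 2: children 3 and 4, parent 5
        System.out.println(childrenOf(parents, new int[] {5}));
        try {
            childrenOf(parents, new int[] {3}); // 3 is a child, not a parent
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The check costs one bit lookup per matched document, which is the kind of extra cost for correct usage discussed in the comment above.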



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.7.0) - Build # 1146 - Failure!

2013-12-23 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/1146/
Java: 64bit/jdk1.7.0 -XX:-UseCompressedOops -XX:+UseSerialGC

All tests passed

Build Log:
[...truncated 9986 lines...]
   [junit4] JVM J0: stderr was not empty, see: 
/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/build/solr-core/test/temp/junit4-J0-20131223_11_163.syserr
   [junit4]  JVM J0: stderr (verbatim) 
   [junit4] java(213,0x13cf9) malloc: *** error for object 0x13cf7ee12: 
pointer being freed was not allocated
   [junit4] *** set a breakpoint in malloc_error_break to debug
   [junit4]  JVM J0: EOF 

[...truncated 1 lines...]
   [junit4] ERROR: JVM J0 ended with an exception, command line: 
/Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Home/jre/bin/java 
-XX:-UseCompressedOops -XX:+UseSerialGC -XX:+HeapDumpOnOutOfMemoryError 
-XX:HeapDumpPath=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/heapdumps 
-Dtests.prefix=tests -Dtests.seed=4A403B75D3778235 -Xmx512M -Dtests.iters= 
-Dtests.verbose=false -Dtests.infostream=false -Dtests.codec=random 
-Dtests.postingsformat=random -Dtests.docvaluesformat=random 
-Dtests.locale=random -Dtests.timezone=random -Dtests.directory=random 
-Dtests.linedocsfile=europarl.lines.txt.gz -Dtests.luceneMatchVersion=5.0 
-Dtests.cleanthreads=perClass 
-Djava.util.logging.config.file=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/tools/junit4/logging.properties
 -Dtests.nightly=false -Dtests.weekly=false -Dtests.slow=true 
-Dtests.asserts.gracious=false -Dtests.multiplier=1 -DtempDir=. 
-Djava.io.tmpdir=. 
-Djunit4.tempDir=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/build/solr-core/test/temp
 
-Dclover.db.dir=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/clover/db
 -Djava.security.manager=org.apache.lucene.util.TestSecurityManager 
-Djava.security.policy=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/tools/junit4/tests.policy
 -Dlucene.version=5.0-SNAPSHOT -Djetty.testMode=1 -Djetty.insecurerandom=1 
-Dsolr.directoryFactory=org.apache.solr.core.MockDirectoryFactory 
-Djava.awt.headless=true -Djdk.map.althashing.threshold=0 
-Dtests.disableHdfs=true -Dfile.encoding=UTF-8 -classpath 

[jira] [Commented] (SOLR-5525) deprecate ClusterState#getCollectionStates()

2013-12-23 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855569#comment-13855569
 ] 

ASF subversion and git services commented on SOLR-5525:
---

Commit 1553095 from [~noble.paul] in branch 'dev/trunk'
[ https://svn.apache.org/r1553095 ]

SOLR-5525 use hasCollection()

 deprecate ClusterState#getCollectionStates() 
 -

 Key: SOLR-5525
 URL: https://issues.apache.org/jira/browse/SOLR-5525
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
 Attachments: SOLR-5525.patch, SOLR-5525.patch


 This is a very expensive call if there are a large number of collections. 
 Mostly, it is used to check whether a collection exists.
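
The difference between the two calls can be sketched as follows. This is a hypothetical stand-in, not Solr's ClusterState: the class, its map of collection names to states, and the put helper are assumptions for the example. The issue does not say why the call is expensive; the sketch assumes it has to materialize state for every collection, while an existence check needs only one lookup.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical cluster-state holder for illustration; not Solr's ClusterState.
final class ClusterStateSketch {
    private final Map<String, String> collectionStates = new HashMap<>();

    void put(String collection, String state) {
        collectionStates.put(collection, state);
    }

    /** Builds a snapshot of every collection; cost grows with the number of collections. */
    Map<String, String> getCollectionStates() {
        return Collections.unmodifiableMap(new HashMap<>(collectionStates));
    }

    /** Single hash lookup; cost does not depend on the number of collections. */
    boolean hasCollection(String collection) {
        return collectionStates.containsKey(collection);
    }

    public static void main(String[] args) {
        ClusterStateSketch state = new ClusterStateSketch();
        state.put("collection1", "active");
        // Preferred: ask the question directly.
        System.out.println(state.hasCollection("collection1"));
        // Avoided: copy everything just to test for one key.
        System.out.println(state.getCollectionStates().containsKey("collection1"));
    }
}
```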






[jira] [Commented] (SOLR-5525) deprecate ClusterState#getCollectionStates()

2013-12-23 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855570#comment-13855570
 ] 

ASF subversion and git services commented on SOLR-5525:
---

Commit 1553096 from [~noble.paul] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1553096 ]

SOLR-5525 use hasCollection()







[jira] [Updated] (SOLR-5473) Make one state.json per collection

2013-12-23 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-5473:
-

Attachment: SOLR-5473.patch

updated to latest trunk

 Make one state.json per collection
 --

 Key: SOLR-5473
 URL: https://issues.apache.org/jira/browse/SOLR-5473
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
 Attachments: SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch


 As defined in the parent issue, store the states of each collection under 
 /collections/collectionname/state.json node






[jira] [Updated] (SOLR-5476) Roles per node

2013-12-23 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-5476:
-

Description: 
In a very large cluster the Overseer is likely to be overloaded. If the same 
node is also serving a few other shards, the Overseer can be slowed down by GC 
pauses or simply by too much work. If the cluster is really large, it is 
possible to dedicate high-end hardware to the Overseers.

It works as a new collection admin command:

command=assignRole&whitelist=overseer&node=node1_name&node=node2_name

If a node is whitelisted for overseer, it gets preference over others when 
overseer election takes place. If no whitelisted servers are available, another 
random node will be picked.



  was:
In a very large cluster the Overseer is likely to be overloaded. If the same 
node is also serving a few other shards, the Overseer can be slowed down by GC 
pauses or simply by too much work. If the cluster is really large, it is 
possible to dedicate high-end hardware to the Overseers.

It works as a new collection admin command:

command=assignRole&whitelist=overseer&blacklist=leader&blacklist=replica&node=node_name

If a node is whitelisted for overseer, it gets preference over others when 
overseer election takes place. If no whitelisted servers are available, another 
random node will be picked.

If the node is blacklisted for leader/replica, it won't be assigned any new 
shards.


 Roles per node
 --

 Key: SOLR-5476
 URL: https://issues.apache.org/jira/browse/SOLR-5476
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul







[jira] [Updated] (SOLR-5476) Overseer Role for nodes

2013-12-23 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-5476:
-

Summary: Overseer Role for nodes  (was: Roles per node)

 Overseer Role for nodes
 ---

 Key: SOLR-5476
 URL: https://issues.apache.org/jira/browse/SOLR-5476
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul







[jira] [Updated] (SOLR-5476) Overseer Role for nodes

2013-12-23 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-5476:
-

Description: 
In a very large cluster the Overseer is likely to be overloaded. If the same 
node is also serving a few other shards, the Overseer can be slowed down by GC 
pauses or simply by too much work. If the cluster is really large, it is 
possible to dedicate high-end hardware to the Overseers.

It works as a new collection admin command:

command=assignRole&whitelist=overseer&node=node1_name&node=node2_name

This results in the creation of an entry in /roles.json in ZK, which would 
look like the following:

{
  "overseer" : {
  }
}

If a node is whitelisted for overseer, it gets preference over others when 
overseer election takes place. If no whitelisted servers are available, another 
random node will be picked.




  was:
In a very large cluster the Overseer is likely to be overloaded. If the same 
node is also serving a few other shards, the Overseer can be slowed down by GC 
pauses or simply by too much work. If the cluster is really large, it is 
possible to dedicate high-end hardware to the Overseers.

It works as a new collection admin command:

command=assignRole&whitelist=overseer&node=node1_name&node=node2_name

If a node is whitelisted for overseer, it gets preference over others when 
overseer election takes place. If no whitelisted servers are available, another 
random node will be picked.




 Overseer Role for nodes
 ---

 Key: SOLR-5476
 URL: https://issues.apache.org/jira/browse/SOLR-5476
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul







[jira] [Updated] (LUCENE-5376) Add a demo search server

2013-12-23 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-5376:
---

Attachment: lucene-demo-server.tgz

I'm attaching the current sources (tgz archive)... they are standalone now but 
to add it into Lucene I think we should put it under lucene/demo or 
lucene/server or something.

It uses custom (Python) build scripts, because I became frustrated with ant; 
after extracting, {{python3 build.py test}} should run the tests.

These are just the sources for the server side of the 
http://jirasearch.mikemccandless.com app.

There are many issues to fix, e.g. cut back to ant (there are some old ant 
build scripts there), use only one JSON parser (it uses two now), but it does 
support a number of basic indexing/search APIs: add/update document/s, bulk 
add/update documents, suggest, search/After, block joins, highlighting, live 
field values, snapshots, basic index statistics (for diagnostics).

It has limited support for plugins, but I'm tempted to remove that before 
committing.  The only plugin it has now is Tika, to crack binary documents into 
text for indexing.

 Add a demo search server
 

 Key: LUCENE-5376
 URL: https://issues.apache.org/jira/browse/LUCENE-5376
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
 Attachments: lucene-demo-server.tgz


 I think it'd be useful to have a demo search server for Lucene.
 Rather than being fully featured, like Solr, it would be minimal, just 
 wrapping the existing Lucene modules to show how you can make use of these 
 features in a server setting.
 The purpose is to demonstrate how one can build a minimal search server on 
 top of APIs like SearcherManager, SearcherLifetimeManager, etc.
 This is also useful for finding rough edges / issues in Lucene's APIs that 
 make building a server unnecessarily hard.
 I don't think it should have back compatibility promises (except Lucene's 
 index back compatibility), so it's free to improve as Lucene's APIs change.
 As a starting point, I'll post what I built for the "eating your own dog 
 food" search app for Lucene's & Solr's jira issues 
 http://jirasearch.mikemccandless.com (blog: 
 http://blog.mikemccandless.com/2013/05/eating-dog-food-with-lucene.html ). It 
 uses Netty to expose basic indexing & searching APIs via JSON, but it's very 
 rough (lots of nocommits).






[jira] [Assigned] (SOLR-5476) Overseer Role for nodes

2013-12-23 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul reassigned SOLR-5476:


Assignee: Noble Paul

 Overseer Role for nodes
 ---

 Key: SOLR-5476
 URL: https://issues.apache.org/jira/browse/SOLR-5476
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul

 In a very large cluster the Overseer is likely to be overloaded. If the same 
 node is also serving a few other shards, the Overseer can be slowed down by GC 
 pauses or simply by too much work. If the cluster is really large, it is 
 possible to dedicate high-end hardware to the Overseers.
 It works as a new collection admin command:
 command=assignRole&whitelist=overseer&node=192.168.1.5:8983_solr&node=192.168.1.6:8983_solr
 This results in the creation of an entry in /roles.json in ZK, which would 
 look like the following:
 {
   "overseer" : {
     "whitelist" : ["192.168.1.5:8983_solr", "192.168.1.6:8983_solr"]
   }
 }
 If a node is whitelisted for overseer, it gets preference over others when 
 overseer election takes place. If no whitelisted servers are available, 
 another random node would become the Overseer.
 Later on, if one of the whitelisted nodes is brought up, it would take over 
 the Overseer role from the current Overseer and become the Overseer of the 
 system.
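
The election preference described above can be sketched as follows. This is only an illustration of the rule, not Solr's election code: the class and the pickOverseer method are hypothetical, and where the description says a random node is chosen, the sketch takes the first live node so that the result is repeatable.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical illustration of whitelist preference; not Solr's election code.
final class OverseerPreferenceSketch {

    /**
     * Returns a live whitelisted node if there is one. Otherwise falls back to
     * any live node (the description says "random"; this sketch uses the first
     * live node to stay deterministic). Empty if no node is live.
     */
    static Optional<String> pickOverseer(List<String> liveNodes, List<String> whitelist) {
        for (String node : whitelist) {
            if (liveNodes.contains(node)) {
                return Optional.of(node);
            }
        }
        return liveNodes.stream().findFirst();
    }

    public static void main(String[] args) {
        List<String> whitelist =
            Arrays.asList("192.168.1.5:8983_solr", "192.168.1.6:8983_solr");

        // No whitelisted node is up yet: some other node becomes the Overseer.
        List<String> live = Arrays.asList("192.168.1.9:8983_solr");
        System.out.println(pickOverseer(live, whitelist));

        // A whitelisted node is brought up later: re-running the selection
        // hands the role over to it.
        live = Arrays.asList("192.168.1.9:8983_solr", "192.168.1.6:8983_solr");
        System.out.println(pickOverseer(live, whitelist));
    }
}
```

Re-running the selection whenever the set of live nodes changes is one simple way to model the takeover described in the last paragraph of the issue; the real mechanism may differ.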






[jira] [Updated] (SOLR-5476) Overseer Role for nodes

2013-12-23 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-5476:
-

Description: 
In a very large cluster the Overseer is likely to be overloaded. If the same 
node is also serving a few other shards, the Overseer can be slowed down by GC 
pauses or simply by too much work. If the cluster is really large, it is 
possible to dedicate high-end hardware to the Overseers.

It works as a new collection admin command:

command=assignRole&whitelist=overseer&node=192.168.1.5:8983_solr&node=192.168.1.6:8983_solr

This results in the creation of an entry in /roles.json in ZK, which would 
look like the following:

{
  "overseer" : {
    "whitelist" : ["192.168.1.5:8983_solr", "192.168.1.6:8983_solr"]
  }
}

If a node is whitelisted for overseer, it gets preference over others when 
overseer election takes place. If no whitelisted servers are available, another 
random node would become the Overseer.

Later on, if one of the whitelisted nodes is brought up, it would take over 
the Overseer role from the current Overseer and become the Overseer of the system.



  was:
In a very large cluster the Overseer is likely to be overloaded. If the same 
node is also serving a few other shards, the Overseer can be slowed down by GC 
pauses or simply by too much work. If the cluster is really large, it is 
possible to dedicate high-end hardware to the Overseers.

It works as a new collection admin command:

command=assignRole&whitelist=overseer&node=node1_name&node=node2_name

This results in the creation of an entry in /roles.json in ZK, which would 
look like the following:

{
  "overseer" : {
  }
}

If a node is whitelisted for overseer, it gets preference over others when 
overseer election takes place. If no whitelisted servers are available, another 
random node will be picked.





 Overseer Role for nodes
 ---

 Key: SOLR-5476
 URL: https://issues.apache.org/jira/browse/SOLR-5476
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul







[jira] [Commented] (LUCENE-5376) Add a demo search server

2013-12-23 Thread Han Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855634#comment-13855634
 ] 

Han Jiang commented on LUCENE-5376:
---

+1, it will be great to have an 'active' demo to show the features :)

I think we should remove those hardcoded classpaths, e.g. in post.py:30?

And will this demo be expected to be the same as jirasearch? Will we need 
further configuration to get the demo website working? For example I cannot 
find search.py in the sources.








[jira] [Commented] (LUCENE-5376) Add a demo search server

2013-12-23 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855664#comment-13855664
 ] 

Michael McCandless commented on LUCENE-5376:


Thanks Han.

bq. I think we should remove those hardcoded classpaths, e.g. in post.py:30?

Good catch, I'll fix that ... that's a minimal example of how to issue commands 
to the server to create an index and register a few fields, from a Python 
client.

bq. And will this demo be expected to be the same as jirasearch? Will we need 
further configuration to get the demo website working? For example I cannot 
find search.py in the sources.

These sources are just for the server side; I didn't include the jirasearch 
UI/indexing sources.  But I agree it would be useful to have that too, i.e. an 
example search app/UI that runs against this server.  I'll think about how to 
fold it in ...







[jira] [Commented] (LUCENE-5376) Add a demo search server

2013-12-23 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855686#comment-13855686
 ] 

Yonik Seeley commented on LUCENE-5376:
--

I think there are plenty of lucene-based search servers already in existence... 
We shouldn't bloat lucene/solr even further by adding yet another.  Something 
like this belongs as a separate project (collaborate on github with whoever 
else wants to build/maintain this).







[jira] [Updated] (SOLR-5567) ZkController getHostAddress duplicates url prefix

2013-12-23 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-5567:


Attachment: SOLR-5567.patch

Trivial patch

 ZkController getHostAddress duplicates url prefix
 -

 Key: SOLR-5567
 URL: https://issues.apache.org/jira/browse/SOLR-5567
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.6
Reporter: Kyle Halliday
Priority: Minor
 Attachments: SOLR-5567.patch

   Original Estimate: 5m
  Remaining Estimate: 5m

 The ZkController getHostAddress method will return a URL with a duplicated URL 
 prefix if given an input string that already includes a URL prefix.
 e.g. given the input "http://127.0.0.1", it will return 
 "http://http://127.0.0.1"
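
One way to avoid the duplicated prefix is to prepend the scheme only when the input does not already carry one. The sketch below is an assumption about the shape of such a fix, not the attached patch or the real ZkController code; the class and the normalizeHost method are hypothetical names.

```java
// Hypothetical helper for illustration; not the actual ZkController method.
final class HostAddressSketch {

    /** Prepends "http://" only when the input does not already start with a scheme. */
    static String normalizeHost(String host) {
        if (host.startsWith("http://") || host.startsWith("https://")) {
            return host; // already has a URL prefix, leave it alone
        }
        return "http://" + host;
    }

    public static void main(String[] args) {
        System.out.println(normalizeHost("127.0.0.1"));
        System.out.println(normalizeHost("http://127.0.0.1"));
    }
}
```

A small unit test around inputs with and without a prefix would cover the case reported here.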






[jira] [Commented] (SOLR-5567) ZkController getHostAddress duplicates url prefix

2013-12-23 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855735#comment-13855735
 ] 

Mark Miller commented on SOLR-5567:
---

We should add a little test too.







[jira] [Commented] (SOLR-5574) CoreContainer shutdown publishes all nodes as down and waits to see that and then again publishes all nodes as down.

2013-12-23 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855786#comment-13855786
 ] 

ASF subversion and git services commented on SOLR-5574:
---

Commit 1553157 from [~markrmil...@gmail.com] in branch 'dev/trunk'
[ https://svn.apache.org/r1553157 ]

SOLR-5574: CoreContainer shutdown publishes all nodes as down and waits to see 
that and then again publishes all nodes as down.

 CoreContainer shutdown publishes all nodes as down and waits to see that and 
 then again publishes all nodes as down.
 

 Key: SOLR-5574
 URL: https://issues.apache.org/jira/browse/SOLR-5574
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5574.patch


 The first publish and wait doesn't really serve any purpose.






[jira] [Commented] (SOLR-5574) CoreContainer shutdown publishes all nodes as down and waits to see that and then again publishes all nodes as down.

2013-12-23 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855788#comment-13855788
 ] 

ASF subversion and git services commented on SOLR-5574:
---

Commit 1553158 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1553158 ]

SOLR-5574: CoreContainer shutdown publishes all nodes as down and waits to see 
that and then again publishes all nodes as down.

 CoreContainer shutdown publishes all nodes as down and waits to see that and 
 then again publishes all nodes as down.
 

 Key: SOLR-5574
 URL: https://issues.apache.org/jira/browse/SOLR-5574
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5574.patch


 The first publish and wait doesn't really serve any purpose.






[jira] [Commented] (SOLR-5573) ChaosMonkey should randomly turn off Solr's commit on shutdown option.

2013-12-23 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13855790#comment-13855790
 ] 

ASF subversion and git services commented on SOLR-5573:
---

Commit 1553159 from [~markrmil...@gmail.com] in branch 'dev/trunk'
[ https://svn.apache.org/r1553159 ]

SOLR-5573: ChaosMonkey should randomly turn off Solr's commit on shutdown 
option.

 ChaosMonkey should randomly turn off Solr's commit on shutdown option.
 --

 Key: SOLR-5573
 URL: https://issues.apache.org/jira/browse/SOLR-5573
 Project: Solr
  Issue Type: Test
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller
 Fix For: 5.0, 4.7


 Because we don't have a great way to kill (everything in the same JVM), this is 
 very important for testing tlog replays on startup.






[jira] [Commented] (SOLR-5573) ChaosMonkey should randomly turn off Solr's commit on shutdown option.

2013-12-23 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13855792#comment-13855792
 ] 

ASF subversion and git services commented on SOLR-5573:
---

Commit 1553161 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1553161 ]

SOLR-5573: ChaosMonkey should randomly turn off Solr's commit on shutdown 
option.

 ChaosMonkey should randomly turn off Solr's commit on shutdown option.
 --

 Key: SOLR-5573
 URL: https://issues.apache.org/jira/browse/SOLR-5573
 Project: Solr
  Issue Type: Test
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller
 Fix For: 5.0, 4.7


 Because we don't have a great way to kill (everything in the same JVM), this is 
 very important for testing tlog replays on startup.






Re: Lucene / Solr 4.6.1

2013-12-23 Thread Mark Miller
Some 4.6.1 bugs were resolved over the weekend, so I pushed this off. Now
it's holidays and what not, so this is probably another week out I'd guess.
I'm going to back port a bunch of stuff over the next few days.

- Mark


On Fri, Dec 20, 2013 at 9:19 AM, Mark Miller markrmil...@gmail.com wrote:

 Hey, yeah, sorry about the lack of activity on this. Been kind of sick
 this week. Hope to jump on this soon though.

 - Mark

 On Dec 20, 2013, at 9:17 AM, Jan Høydahl jan@cominvent.com wrote:

 I added a new Version 4.6.1 to the Solr and Lucene JIRA projects.

 https://issues.apache.org/jira/browse/SOLR-5564 is another low-risk fix
 candidate for 4.6.1

 --
 Jan Høydahl, search solution architect
 Cominvent AS - www.cominvent.com

 16. des. 2013 kl. 20:02 skrev Joel Bernstein joels...@gmail.com:

 Sounds great


 On Mon, Dec 16, 2013 at 1:47 PM, Mark Miller markrmil...@gmail.com wrote:

 Cool - let’s back port this week and I’ll put up an RC on Saturday?

 - Mark

 On Dec 16, 2013, at 1:25 PM, Joel Bernstein joels...@gmail.com wrote:

 +1

 I would like to get out some safe bug fixes to the
 CollapsingQParserPlugin.


 On Tue, Dec 3, 2013 at 11:04 AM, Mark Miller markrmil...@gmail.com wrote:

 I’d be willing to push a 4.6.1 in a couple weeks - I’d like to get a
 bunch of bug fixes out in a low risk upgrade.

 - Mark
 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




 --
 Joel Bernstein
 Search Engineer at Heliosearch





 --
 Joel Bernstein
 Search Engineer at Heliosearch






-- 
- Mark


Iterating BinaryDocValues

2013-12-23 Thread Joel Bernstein
Hi,

I'm looking for a faster way to perform large scale docId -> bytesRef
lookups for BinaryDocValues.

I'm finding that I can't get the performance that I need from the random
access seek in the BinaryDocValues interface.

I'm wondering if sequentially scanning the docValues would be a faster
approach. I have a BitSet of matching docs, so if I sequentially moved
through the docValues I could test each one against that bitset.

Wondering if that approach would be faster for bulk extracts and how tricky
it would be to add an iterator to the BinaryDocValues interface?

Thanks,
Joel
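
To make the idea concrete, here is a rough sketch of the sequential scan. It
does not use the real BinaryDocValues API; the iterator of per-document values
is a hypothetical stand-in for the iterator that the interface currently
lacks, and whether this beats random access would depend on the codec.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: valuesInDocOrder stands in for a doc-ordered iterator
// over binary doc values, which is an assumption and not an existing API.
final class SequentialScanSketch {
    /** Walks every value in docId order and keeps those whose docId is set in the BitSet. */
    static List<byte[]> scan(Iterator<byte[]> valuesInDocOrder, BitSet matching) {
        List<byte[]> out = new ArrayList<>();
        int doc = 0;
        while (valuesInDocOrder.hasNext()) {
            byte[] value = valuesInDocOrder.next(); // always advance, even for non-matching docs
            if (matching.get(doc)) {
                out.add(value);
            }
            doc++;
        }
        return out;
    }
}
```

The scan touches every document once, so it should only pay off when the set
of matching docs is a large fraction of the index.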


[jira] [Updated] (SOLR-2960) XPathEntityProcessor does not clear nulls from empty multi-valued fields

2013-12-23 Thread James Dyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Dyer updated SOLR-2960:
-

Attachment: SOLR-2960.patch

Here is an update of Michael Watts' patch for current Trunk and also a unit 
test.  I plan to commit this soon.

 XPathEntityProcessor does not clear nulls from empty multi-valued fields
 

 Key: SOLR-2960
 URL: https://issues.apache.org/jira/browse/SOLR-2960
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Reporter: Michael Watts
Assignee: James Dyer
Priority: Minor
 Attachments: SOLR-2960.patch, SOLR-2960.patch


 I can't confidently say I completely understand all that these classes so 
 boldly tackle (that is, XPathEntityProcessor and XPathRecordReader), but 
 there may be someone who does. Nonetheless, I think I've got some or most of 
 this right, and more likely there are more someones like that. So, I won't 
 qualify everything I say with a maybe -- let this be the refactoring of 
 those. 
 Whenever mapping an XML file into a Solr Index, within the XPathRecordReader, 
 (used by the XPathEntityProcessor within the DataImportHandler), if (A) a 
 field is perceived to be null and is multivalued, it is pushed a value of 
 null (on top of any other values it previously had). Otherwise (B) for 
 multivalued fields, any found value is pushed onto its existing list of 
 values, and the field is marked as found within the frame (a.k.a record). 
 In general, when the end-tag of a record is seen, (C) the XPathRecordReader 
 clears all of the field's values which have been marked as found, as tidiness 
 is a value and they are supposedly no longer useful. 
 However, suppose that for a given record and multivalued field, a value is 
 never found (though it may have been found for other fields in the record), 
 only (A) will have occurred, never will (B) have occurred, the field will 
 never have been marked as found, and thus (C) never will have occurred for 
 the field. 
 So, the field will remain, with its list of nulls. 
 This list of nulls will grow until either the last record or a non-null value 
 is seen. 
 And so, (1) an out-of-memory error may occur, given sufficiently many records 
 and a mortal computer. 
 Moreover, (2), a transformer cannot reliably depend on the number of nulls in 
 the field (and this information cannot be guaranteed to be determined by some 
 other value). 
 I will try to provide more information, if this seems an issue and if there 
 doesn't seem to be an answer. 
 At this point, if I understand the problem correctly, it seems the answer is 
 to 'mark' those null fields, considering 'null' an added value. 
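
 The steps (A), (B) and (C) above can be modelled in a small sketch. This is
 not the XPathRecordReader code; the class, its fields and the markNulls flag
 are hypothetical and only illustrate why unmarked nulls accumulate across
 records and how marking them would let step (C) clear them.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of a record frame; names are assumptions for illustration.
final class RecordFrameSketch {
    private final Map<String, List<String>> values = new HashMap<>();
    private final Set<String> found = new HashSet<>();
    private final boolean markNulls;

    RecordFrameSketch(boolean markNulls) {
        this.markNulls = markNulls;
    }

    /** Steps (A) and (B): push a value, possibly null, onto a multivalued field. */
    void push(String field, String value) {
        values.computeIfAbsent(field, k -> new ArrayList<>()).add(value);
        if (value != null || markNulls) {
            found.add(field); // with markNulls, a null counts as an added value
        }
    }

    /** Step (C): at the record's end tag, clear every field marked as found. */
    void endRecord() {
        for (String field : found) {
            values.remove(field);
        }
        found.clear();
    }

    /** Number of values currently held for the field. */
    int size(String field) {
        List<String> list = values.get(field);
        return list == null ? 0 : list.size();
    }
}
```

 With markNulls set to false the list of nulls grows by one per record; with
 it set to true the field is emptied at every end tag.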






[jira] [Created] (SOLR-5576) ZkController.java registerAllCoresAsDown multiple cores logic

2013-12-23 Thread Christine Poerschke (JIRA)
Christine Poerschke created SOLR-5576:
-

 Summary: ZkController.java registerAllCoresAsDown multiple cores 
logic
 Key: SOLR-5576
 URL: https://issues.apache.org/jira/browse/SOLR-5576
 Project: Solr
  Issue Type: Bug
Reporter: Christine Poerschke


The behaviour we saw was that considerable time elapsed between
different cores within the same solr instance publishing themselves as down.

Separately it appears from the code that some cores would not be published as 
down if another core returns from the function early because it will be its
shard leader (return vs. continue in for loop).







[jira] [Updated] (SOLR-5576) ZkController.java registerAllCoresAsDown multiple cores logic

2013-12-23 Thread Christine Poerschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christine Poerschke updated SOLR-5576:
--

Attachment: SOLR-5576.patch

Attaching patch to separate publish-as-down and waitForLeaderToSeeDownState 
into separate for loops. Also replacing return with continue when 
waitForLeaderToSeeDownState call can be skipped.
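
The shape of that change can be sketched as follows. This is not the patch
itself; the Core class, the willBeLeader flag and the string log are
hypothetical stand-ins used only to show the two loops and the effect of
continue versus return.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the two-loop structure; not the ZkController code.
final class RegisterDownSketch {
    /** Hypothetical core descriptor; willBeLeader marks cores that can skip the wait. */
    static final class Core {
        final String name;
        final boolean willBeLeader;

        Core(String name, boolean willBeLeader) {
            this.name = name;
            this.willBeLeader = willBeLeader;
        }
    }

    /** Returns the ordered list of actions taken for the given cores. */
    static List<String> registerAllAsDown(List<Core> cores) {
        List<String> log = new ArrayList<>();
        // First loop: publish every core as down, with no waiting in between.
        for (Core core : cores) {
            log.add("publish-down:" + core.name);
        }
        // Second loop: wait for the leader only where that is needed.
        for (Core core : cores) {
            if (core.willBeLeader) {
                continue; // a return here would skip all remaining cores
            }
            log.add("wait-for-leader:" + core.name);
        }
        return log;
    }
}
```

For a leader core "a" followed by a non-leader core "b", both cores are
published as down before any waiting starts, and "b" still gets its wait.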

 ZkController.java registerAllCoresAsDown multiple cores logic
 -

 Key: SOLR-5576
 URL: https://issues.apache.org/jira/browse/SOLR-5576
 Project: Solr
  Issue Type: Bug
Reporter: Christine Poerschke
 Attachments: SOLR-5576.patch


 The behaviour we saw was that considerable time elapsed between
 different cores within the same solr instance publishing themselves as down.
 Separately it appears from the code that some cores would not be published as 
 down if another core returns from the function early because it will be its
 shard leader (return vs. continue in for loop).






[jira] [Created] (SOLR-5577) indexing delay due to zookeeper election

2013-12-23 Thread Christine Poerschke (JIRA)
Christine Poerschke created SOLR-5577:
-

 Summary: indexing delay due to zookeeper election
 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
Reporter: Christine Poerschke


The behaviour we observed was that a zookeeper election took about 2s plus 1.5s 
for reading the zoo_data snapshot. During this time solr tried to establish 
connections to any zookeeper in the ensemble but only succeeded once a leader 
was elected *and* that leader was done reading the snapshot. Solr document 
updates were slowed down during this time window.

Is this expected to happen during and shortly after elections, that is 
zookeeper closing existing connections, rejecting new connections and thus
stalling solr updates?

Other than avoiding zookeeper elections, are there ways of reducing their
impact on solr?






[jira] [Updated] (SOLR-5577) indexing delay due to zookeeper election

2013-12-23 Thread Christine Poerschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christine Poerschke updated SOLR-5577:
--

Description: 
The behaviour we observed was that a zookeeper election took about 2s plus 1.5s 
for reading the zoo_data snapshot. During this time solr tried to establish 
connections to any zookeeper in the ensemble but only succeeded once a leader 
was elected *and* that leader was done reading the snapshot. Solr document 
updates were slowed down during this time window.

Is this expected to happen during and shortly after elections, that is 
zookeeper closing existing connections, rejecting new connections and thus 
stalling solr updates?

Other than avoiding zookeeper elections, are there ways of reducing their 
impact on solr?

+zookeeper log extract+

{noformat}
08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
08:18:56,918 [FileSnap.java:83] Reading snapshot ...
...
08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from ...
08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
due to java.io.IOException: ZooKeeperServer not running
08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client ... 
(no session established for client)
...
08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
{noformat}

+solr log extract+

{noformat}
08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from server 
sessionid ... likely server has closed socket, closing socket connection and 
attempting reconnect
08:18:55,068 [ConnectionManager.java:72] Watcher 
org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
state:Disconnected type:None path:null path:null type:None
08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
...
08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from server 
sessionid ... likely server has closed socket, closing socket connection and 
attempting reconnect
...
08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
...
08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
...
08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
...
08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
... sessionid = ..., negotiated timeout = ...
08:18:58,751 ... [ConnectionManager.java:72] Watcher 
org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
state:SyncConnected type:None path:null path:null type:None
{noformat}



  was:
The behaviour we observed was that a zookeeper election took about 2s plus 1.5s 
for reading the zoo_data snapshot. During this time solr tried to establish 
connections to any zookeeper in the ensemble but only succeeded once a leader 
was elected *and* that leader was done reading the snapshot. Solr document 
updates were slowed down during this time window.

Is this expected to happen during and shortly after elections, that is 
zookeeper closing existing connections, rejecting new connections and thus
stalling solr updates?

Other than avoiding zookeeper elections, are there ways of reducing their
impact on solr?


 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
Reporter: Christine Poerschke

 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, 

[jira] [Assigned] (SOLR-5576) ZkController.java registerAllCoresAsDown multiple cores logic

2013-12-23 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller reassigned SOLR-5576:
-

Assignee: Mark Miller

 ZkController.java registerAllCoresAsDown multiple cores logic
 -

 Key: SOLR-5576
 URL: https://issues.apache.org/jira/browse/SOLR-5576
 Project: Solr
  Issue Type: Bug
Reporter: Christine Poerschke
Assignee: Mark Miller
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5576.patch


 The behaviour we saw was that considerable time elapsed between
 different cores within the same solr instance publishing themselves as down.
 Separately it appears from the code that some cores would not be published as 
 down if another core returns from the function early because it will be its
 shard leader (return vs. continue in for loop).






[jira] [Updated] (SOLR-5576) ZkController.java registerAllCoresAsDown multiple cores logic

2013-12-23 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-5576:
--

Fix Version/s: 4.6.1
   4.7
   5.0
   Issue Type: Improvement  (was: Bug)

 ZkController.java registerAllCoresAsDown multiple cores logic
 -

 Key: SOLR-5576
 URL: https://issues.apache.org/jira/browse/SOLR-5576
 Project: Solr
  Issue Type: Improvement
Reporter: Christine Poerschke
Assignee: Mark Miller
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5576.patch


 The behaviour we saw was that considerable time elapsed between
 different cores within the same solr instance publishing themselves as down.
 Separately it appears from the code that some cores would not be published as 
 down if another core returns from the function early because it will be its
 shard leader (return vs. continue in for loop).






[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2013-12-23 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13855835#comment-13855835
 ] 

Mark Miller commented on SOLR-5577:
---

Our model should be able to handle this better.

Some off-the-cuff remarks:

* Our model should be fine with turning off updates only after the connection 
with zk is lost for a while, rather than the moment it's noticed. 

* Even if we didn't want to relax the above, we should be able to handle this 
case better - if the issue is that ZooKeeper is actually unavailable, we won't 
get new leaders or anything anyway, so no reason to be too concerned about 
turning off updates. Not sure how easy that is to detect though.
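
A rough sketch of the first remark, assuming a simple grace period. Nothing
here is existing Solr code; the class name, the graceMillis parameter and the
callback names are hypothetical.

```java
// Hypothetical sketch: updates stay enabled until the zk connection has been
// lost for longer than a grace period, rather than the moment it is noticed.
final class DisconnectGraceSketch {
    private final long graceMillis;
    private long disconnectedAt = -1; // -1 means currently connected

    DisconnectGraceSketch(long graceMillis) {
        this.graceMillis = graceMillis;
    }

    /** Records the time of the first disconnect event; repeats are ignored. */
    void onDisconnected(long nowMillis) {
        if (disconnectedAt < 0) {
            disconnectedAt = nowMillis;
        }
    }

    /** Clears the disconnect marker once the session is re-established. */
    void onConnected() {
        disconnectedAt = -1;
    }

    /** Updates are allowed while connected or while still inside the grace period. */
    boolean updatesAllowed(long nowMillis) {
        return disconnectedAt < 0 || nowMillis - disconnectedAt < graceMillis;
    }
}
```

With a grace period longer than the roughly 3.5s outage in the logs, updates
would not have been turned off in this model. Whether that is safe for
consistency is a separate question.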

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
Reporter: Christine Poerschke

 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}






[jira] [Commented] (SOLR-5576) ZkController.java registerAllCoresAsDown multiple cores logic

2013-12-23 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13855852#comment-13855852
 ] 

ASF subversion and git services commented on SOLR-5576:
---

Commit 1553178 from [~markrmil...@gmail.com] in branch 'dev/trunk'
[ https://svn.apache.org/r1553178 ]

SOLR-5576: Improve concurrency when registering and waiting for all SolrCore's 
to register a DOWN state.

 ZkController.java registerAllCoresAsDown multiple cores logic
 -

 Key: SOLR-5576
 URL: https://issues.apache.org/jira/browse/SOLR-5576
 Project: Solr
  Issue Type: Improvement
Reporter: Christine Poerschke
Assignee: Mark Miller
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5576.patch


 The behaviour we saw was that considerable time elapsed between
 different cores within the same solr instance publishing themselves as down.
 Separately it appears from the code that some cores would not be published as 
 down if another core returns from the function early because it will be its
 shard leader (return vs. continue in for loop).






[jira] [Commented] (SOLR-5576) ZkController.java registerAllCoresAsDown multiple cores logic

2013-12-23 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13855853#comment-13855853
 ] 

ASF subversion and git services commented on SOLR-5576:
---

Commit 1553179 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1553179 ]

SOLR-5576: Improve concurrency when registering and waiting for all SolrCore's 
to register a DOWN state.

 ZkController.java registerAllCoresAsDown multiple cores logic
 -

 Key: SOLR-5576
 URL: https://issues.apache.org/jira/browse/SOLR-5576
 Project: Solr
  Issue Type: Improvement
Reporter: Christine Poerschke
Assignee: Mark Miller
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5576.patch


 The behaviour we saw was that considerable time elapsed between
 different cores within the same solr instance publishing themselves as down.
 Separately it appears from the code that some cores would not be published as 
 down if another core returns from the function early because it will be its
 shard leader (return vs. continue in for loop).






[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-12-23 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13855859#comment-13855859
 ] 

ASF subversion and git services commented on SOLR-1301:
---

Commit 1553184 from [~markrmil...@gmail.com] in branch 'dev/trunk'
[ https://svn.apache.org/r1553184 ]

SOLR-1301: Ignore this test on Windows - there is a problem with Windows paths 
and Morphlines.

 Add a Solr contrib that allows for building Solr indexes via Hadoop's 
 Map-Reduce.
 -

 Key: SOLR-1301
 URL: https://issues.apache.org/jira/browse/SOLR-1301
 Project: Solr
  Issue Type: New Feature
Reporter: Andrzej Bialecki 
Assignee: Mark Miller
 Fix For: 5.0, 4.7

 Attachments: README.txt, SOLR-1301-hadoop-0-20.patch, 
 SOLR-1301-hadoop-0-20.patch, SOLR-1301-maven-intellij.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SolrRecordWriter.java, commons-logging-1.0.4.jar, 
 commons-logging-api-1.0.4.jar, hadoop-0.19.1-core.jar, 
 hadoop-0.20.1-core.jar, hadoop-core-0.20.2-cdh3u3.jar, hadoop.patch, 
 log4j-1.2.15.jar


 This patch contains  a contrib module that provides distributed indexing 
 (using Hadoop) to Solr EmbeddedSolrServer. The idea behind this module is 
 twofold:
 * provide an API that is familiar to Hadoop developers, i.e. that of 
 OutputFormat
 * avoid unnecessary export and (de)serialization of data maintained on HDFS. 
 SolrOutputFormat consumes data produced by reduce tasks directly, without 
 storing it in intermediate files. Furthermore, by using an 
 EmbeddedSolrServer, the indexing task is split into as many parts as there 
 are reducers, and the data to be indexed is not sent over the network.
 Design
 --
 Key/value pairs produced by reduce tasks are passed to SolrOutputFormat, 
 which in turn uses SolrRecordWriter to write this data. SolrRecordWriter 
 instantiates an EmbeddedSolrServer, and it also instantiates an 
 implementation of SolrDocumentConverter, which is responsible for turning 
 Hadoop (key, value) into a SolrInputDocument. This data is then added to a 
 batch, which is periodically submitted to EmbeddedSolrServer. When reduce 
 task completes, and the OutputFormat is closed, SolrRecordWriter calls 
 commit() and optimize() on the EmbeddedSolrServer.
 The API provides facilities to specify an arbitrary existing solr.home 
 directory, from which the conf/ and lib/ files will be taken.
 This process results in the creation of as many partial Solr home directories 
 as there were reduce tasks. The output shards are placed in the output 
 directory on the default filesystem (e.g. HDFS). Such part-N directories 
 can be used to run N shard servers. Additionally, users can specify the 
 number of reduce tasks, in particular 1 reduce task, in which case the output 
 will consist of a single shard.
 An example application is provided that processes large CSV files and uses 
 this API. It uses a custom CSV processing to avoid (de)serialization overhead.
 This patch relies on hadoop-core-0.19.1.jar - I attached the jar to this 
 issue, you should put it in contrib/hadoop/lib.
 Note: the development of this patch was sponsored by an anonymous contributor 
 and approved for release under Apache License.
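
 The batching flow from the Design paragraph can be sketched as below. This is
 not the SolrRecordWriter from the patch; the Sink interface is a hypothetical
 stand-in for EmbeddedSolrServer, and joining key and value into a string
 stands in for the SolrDocumentConverter.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a batching record writer; names are assumptions.
final class BatchingWriterSketch {
    /** Hypothetical stand-in for the embedded server the writer submits to. */
    interface Sink {
        void add(List<String> batch);
        void commit();
        void optimize();
    }

    private final Sink sink;
    private final int batchSize;
    private final List<String> batch = new ArrayList<>();

    BatchingWriterSketch(Sink sink, int batchSize) {
        this.sink = sink;
        this.batchSize = batchSize;
    }

    /** Converts one (key, value) pair and submits the batch once it is full. */
    void write(String key, String value) {
        batch.add(key + "=" + value); // stands in for the document converter
        if (batch.size() >= batchSize) {
            flush();
        }
    }

    /** On close, submits any remainder and then calls commit and optimize. */
    void close() {
        flush();
        sink.commit();
        sink.optimize();
    }

    private void flush() {
        if (!batch.isEmpty()) {
            sink.add(new ArrayList<>(batch)); // copy, since the buffer is reused
            batch.clear();
        }
    }
}
```

 Each reducer would own one such writer, which is why the output ends up as
 one shard per reduce task.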






[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2013-12-23 Thread Anshum Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13855874#comment-13855874
 ] 

Anshum Gupta commented on SOLR-5577:


A shard might update its state while zk is away (and no one else knows about 
it) and perhaps we're trying to avoid any such cases by rejecting the updates 
for as long as zk is unavailable. There might be concerns about consistency if 
we get any less strict, or so I think.

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
Reporter: Christine Poerschke

 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}
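
The timestamps in the two extracts above can be turned into durations to check the "about 2s plus 1.5s" estimate. The following is a minimal sketch, not part of the original report; the class and method names (LogGapSketch, millisBetween) are hypothetical, and it assumes both timestamps fall on the same day.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper for reading the log extracts; not part of Solr or ZooKeeper. */
final class LogGapSketch {
    // The extracts use hours:minutes:seconds,milliseconds (comma before the millis).
    private static final DateTimeFormatter LOG_TIME =
            DateTimeFormatter.ofPattern("HH:mm:ss,SSS");

    private LogGapSketch() {}

    /** Milliseconds elapsed between two log timestamps on the same day. */
    static long millisBetween(String from, String to) {
        LocalTime start = LocalTime.parse(from, LOG_TIME);
        LocalTime end = LocalTime.parse(to, LOG_TIME);
        return Duration.between(start, end).toMillis();
    }
}
```

With the values from the extracts, millisBetween("08:18:54,968", "08:18:56,916") gives 1948 ms for the election, millisBetween("08:18:56,918", "08:18:58,496") gives 1578 ms from "Reading snapshot" to "Snapshotting", and millisBetween("08:18:54,968", "08:18:58,751") gives 3783 ms between the Solr client losing and re-establishing its session.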






[JENKINS-MAVEN] Lucene-Solr-Maven-4.x #541: POMs out of sync

2013-12-23 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Maven-4.x/541/

1 tests failed.
FAILED:  org.apache.solr.cloud.BasicDistributedZk2Test.testDistribSearch

Error Message:
No live SolrServers available to handle this request:[http://127.0.0.1:53250/collection1, http://127.0.0.1:40770/collection1]

Stack Trace:
org.apache.solr.client.solrj.SolrServerException: No live SolrServers available to handle this request:[http://127.0.0.1:53250/collection1, http://127.0.0.1:40770/collection1]
    at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:495)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:199)
    at org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(LBHttpSolrServer.java:283)
    at org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:640)
    at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:90)
    at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:301)
    at org.apache.solr.cloud.AbstractFullDistribZkTestBase.queryServer(AbstractFullDistribZkTestBase.java:1325)
    at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:542)
    at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:521)
    at org.apache.solr.cloud.BasicDistributedZk2Test.brindDownShardIndexSomeDocsAndRecover(BasicDistributedZk2Test.java:305)
    at org.apache.solr.cloud.BasicDistributedZk2Test.doTest(BasicDistributedZk2Test.java:117)




Build Log:
[...truncated 38321 lines...]
BUILD FAILED
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-4.x/build.xml:482: The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-4.x/build.xml:176: The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-4.x/extra-targets.xml:77: Java returned: 1

Total time: 99 minutes 57 seconds
Build step 'Invoke Ant' marked build as failure
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure




[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2013-12-23 Thread Ramkumar Aiyengar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13855877#comment-13855877
 ] 

Ramkumar Aiyengar commented on SOLR-5577:
-

The proposal here is only to relax updating the collection for a while when 
the ZK connection is lost. If a shard updates its state, wouldn't the Overseer 
have to process the state update? That would still continue to fail till ZK 
comes back up.







[jira] [Commented] (SOLR-5567) ZkController getHostAddress duplicates url prefix

2013-12-23 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13855898#comment-13855898
 ] 

Mark Miller commented on SOLR-5567:
---

Note: Alexey spotted this and provided a fix in SOLR-3854 as well. It was never 
addressed.

 ZkController getHostAddress duplicates url prefix
 -

 Key: SOLR-5567
 URL: https://issues.apache.org/jira/browse/SOLR-5567
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.6
Reporter: Kyle Halliday
Priority: Minor
 Attachments: SOLR-5567.patch

   Original Estimate: 5m
  Remaining Estimate: 5m

 The ZkController getHostAddress method will return a URL with duplicated url 
 prefix if given an input string already including a url prefix.
 e.g. given the input "http://127.0.0.1", it will return 
 "http://http://127.0.0.1"
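
One way to avoid the duplicated prefix is to add the scheme only when the input does not already start with one. The following is a minimal sketch of that idea under stated assumptions; it is not the attached patch or the actual ZkController code, and the class and method names (HostUrlSketch, withScheme) are hypothetical.

```java
import java.util.Locale;

/** Hypothetical helper illustrating the idea; not the actual ZkController code. */
final class HostUrlSketch {
    private HostUrlSketch() {}

    /** Prepends "http://" only when the input does not already carry a URL scheme. */
    static String withScheme(String host) {
        String trimmed = host.trim();
        String lower = trimmed.toLowerCase(Locale.ROOT);
        if (lower.startsWith("http://") || lower.startsWith("https://")) {
            return trimmed; // already prefixed: return it unchanged
        }
        return "http://" + trimmed;
    }
}
```

With this sketch, withScheme("127.0.0.1") and withScheme("http://127.0.0.1") both return "http://127.0.0.1". Accepting "https://" as an existing prefix is an assumption added here for illustration.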






[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2013-12-23 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13855908#comment-13855908
 ] 

Mark Miller commented on SOLR-5577:
---

bq.  Solr document updates were slowed down during this time window.

That's interesting - slowed down? Not rejected? That is more surprising to 
me...need to do some code review.







[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2013-12-23 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13855910#comment-13855910
 ] 

Mark Miller commented on SOLR-5577:
---

bq.  If a shard updates its state, wouldn't the Overseer have to process the 
state update? 

Correct. And there is already some window as well - I'm not sure it matters 
that the window is a little larger, given the possibly very small change in 
the probability of it being an issue. We have to think about it carefully, but 
for the most part, this is just a preventative measure to make sure some node 
is not going rogue for a long period of time with a cached, stale cluster 
state and no connection to ZooKeeper for some reason, but perhaps a connection 
to other nodes.

I think we always intended to think about ways to relax it, but when putting 
things together initially, it was faster/easier to just lock this down as much 
as possible to start.







[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2013-12-23 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13855912#comment-13855912
 ] 

Mark Miller commented on SOLR-5577:
---

P.S. I don't know that it's the right solution for the issue yet either - just 
spitballing at this point.







[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2013-12-23 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13855916#comment-13855916
 ] 

Mark Miller commented on SOLR-5577:
---

bq. That's interesting - slowed down? Not rejected? That is more surprising to 
me...need to do some code review.

Okay, bad memory. Looking at the code, this makes sense. Those updates wait up 
to the session expiration before they would end up erroring...a slowdown makes 
sense.

We have to be able to improve this.







[jira] [Commented] (SOLR-5577) indexing delay due to zookeeper election

2013-12-23 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13855919#comment-13855919
 ] 

Mark Miller commented on SOLR-5577:
---

So I think we already kind of have this window set up for this case - up to 
the session timeout, which I think makes sense. The problem is in how it's 
implemented. It shouldn't hold up updates for that long; it should simply only 
accept them for that long.
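
The difference between holding updates and accepting them for a bounded window could be sketched roughly as below. This is only an illustration of the idea in this comment, not Solr code: the class and method names (DisconnectWindowSketch, onDisconnected, onReconnected, acceptUpdate) are hypothetical, and the clock value is passed in explicitly to keep the sketch easy to test.

```java
/** Hypothetical sketch of a non-blocking accept window; names are not from Solr. */
final class DisconnectWindowSketch {
    private final long sessionTimeoutMs;
    // -1 means "currently connected to ZooKeeper".
    private long disconnectedAtMs = -1;

    DisconnectWindowSketch(long sessionTimeoutMs) {
        this.sessionTimeoutMs = sessionTimeoutMs;
    }

    /** Records the time of the first disconnect event; later ones do not reset it. */
    synchronized void onDisconnected(long nowMs) {
        if (disconnectedAtMs < 0) {
            disconnectedAtMs = nowMs;
        }
    }

    /** Clears the disconnect marker once the connection is back. */
    synchronized void onReconnected() {
        disconnectedAtMs = -1;
    }

    /**
     * Decides immediately, without waiting: accept while connected, or while
     * disconnected for less than the session timeout; reject after that.
     */
    synchronized boolean acceptUpdate(long nowMs) {
        if (disconnectedAtMs < 0) {
            return true;
        }
        return nowMs - disconnectedAtMs < sessionTimeoutMs;
    }
}
```

In this sketch an update that arrives during a short election is answered straight away instead of waiting for the reconnect, and updates are refused only once the disconnect has lasted as long as the session timeout.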




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5577) indexing delay due to zookeeper election

2013-12-23 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-5577:
--

  Component/s: SolrCloud
Fix Version/s: 4.7
   5.0
 Assignee: Mark Miller

 indexing delay due to zookeeper election
 

 Key: SOLR-5577
 URL: https://issues.apache.org/jira/browse/SOLR-5577
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Christine Poerschke
Assignee: Mark Miller
 Fix For: 5.0, 4.7


 The behaviour we observed was that a zookeeper election took about 2s plus 
 1.5s for reading the zoo_data snapshot. During this time solr tried to 
 establish connections to any zookeeper in the ensemble but only succeeded 
 once a leader was elected *and* that leader was done reading the snapshot. 
 Solr document updates were slowed down during this time window.
 Is this expected to happen during and shortly after elections, that is 
 zookeeper closing existing connections, rejecting new connections and thus 
 stalling solr updates?
 Other than avoiding zookeeper elections, are there ways of reducing their 
 impact on solr?
 +zookeeper log extract+
 {noformat}
 08:18:54,968 [QuorumCnxManager.java:762] Connection broken for id ...
 08:18:56,916 [Leader.java:345] LEADING - LEADER ELECTION TOOK - 1941
 08:18:56,918 [FileSnap.java:83] Reading snapshot ...
 ...
 08:18:57,010 [NIOServerCnxnFactory.java:197] Accepted socket connection from 
 ...
 08:18:57,010 [NIOServerCnxn.java:354] Exception causing close of session 0x0 
 due to java.io.IOException: ZooKeeperServer not running
 08:18:57,010 [NIOServerCnxn.java:1001] Closed socket connection for client 
 ... (no session established for client)
 ...
 08:18:58,496 [FileTxnSnapLog.java:240] Snapshotting: ... to ...
 {noformat}
 +solr log extract+
 {noformat}
 08:18:54,968 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 08:18:55,068 [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:Disconnected type:None path:null path:null type:None
 08:18:55,068 [ConnectionManager.java:132] zkClient has disconnected
 ...
 08:18:55,961 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:55,961 [ClientCnxn.java:849] Socket connection established to ...
 08:18:55,962 [ClientCnxn.java:1085] Unable to read additional data from 
 server sessionid ... likely server has closed socket, closing socket 
 connection and attempting reconnect
 ...
 08:18:56,714 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:56,715 [ClientCnxn.java:849] Socket connection established to ...
 08:18:56,715 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:57,640 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:57,641 [ClientCnxn.java:849] Socket connection established to ...
 08:18:57,641 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,352 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,353 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,353 [ClientCnxn.java:1085] Unable to read additional data from ...
 ...
 08:18:58,749 [ClientCnxn.java:966] Opening socket connection to server ...
 08:18:58,749 [ClientCnxn.java:849] Socket connection established to ...
 08:18:58,751 [ClientCnxn.java:1207] Session establishment complete on server 
 ... sessionid = ..., negotiated timeout = ...
 08:18:58,751 ... [ConnectionManager.java:72] Watcher 
 org.apache.solr.common.cloud.ConnectionManager@... name:ZooKeeperConnection 
 Watcher:host1:port1,host2:port2,host3:port3,... got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 {noformat}
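The durations quoted in the description can be checked against the log timestamps above. The following is a minimal sketch, not part of Solr or ZooKeeper: the class and method names are hypothetical, and it assumes that the "Snapshotting" line marks the end of the snapshot read and that all timestamps fall on the same day.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper for measuring gaps between log4j-style timestamps. */
final class ElectionWindowSketch {
    // Timestamps in the extracts look like 08:18:54,968 (comma before the millis).
    private static final DateTimeFormatter LOG_TIME =
            DateTimeFormatter.ofPattern("HH:mm:ss,SSS");

    /** Milliseconds elapsed between two same-day log timestamps. */
    static long millisBetween(String from, String to) {
        LocalTime start = LocalTime.parse(from, LOG_TIME);
        LocalTime end = LocalTime.parse(to, LOG_TIME);
        return Duration.between(start, end).toMillis();
    }

    public static void main(String[] args) {
        // ZooKeeper: connection broken -> LEADING
        System.out.println("election:      "
                + millisBetween("08:18:54,968", "08:18:56,916") + " ms");
        // ZooKeeper: Reading snapshot -> Snapshotting (assumed end of the read)
        System.out.println("snapshot read: "
                + millisBetween("08:18:56,918", "08:18:58,496") + " ms");
        // Solr: first read failure -> session establishment complete
        System.out.println("solr outage:   "
                + millisBetween("08:18:54,968", "08:18:58,751") + " ms");
    }
}
```

Under those assumptions the Solr-side window (about 3.8s) is roughly the sum of the election and the snapshot read, which matches the "2s plus 1.5s" observation.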






[jira] [Updated] (LUCENE-4072) CharFilter that Unicode-normalizes input

2013-12-23 Thread David Goldfarb (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Goldfarb updated LUCENE-4072:
---

Attachment: LUCENE-4072.patch

This patch dodges the use of hasBoundaryAfter, and the tests pass.

Note in doTestMode there's a clause that checks if the normalized string has 
length zero. It seems the nfkc_cf-normalized output of some strings is empty. 
Examples I found:
'\uDB40\uDCD9'
'\uDB43\uDF86'
'\uFE04'
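For reference, the surrogate pairs above can be decoded to code points with the JDK alone. This is a small sketch with hypothetical names; it does not call ICU4J. The range check is an assumption on my part: as far as I know, U+FE00..U+FE0F and U+E0000..U+E0FFF are Default_Ignorable_Code_Point ranges, which NFKC_Casefold removes, and that would explain the empty output. The check covers only those two ranges, not the full Unicode property.

```java
import java.util.stream.Collectors;

/** Hypothetical helper: shows which code points the example strings contain. */
final class IgnorableCodePointsSketch {

    /** Formats each code point of s as U+XXXX, separated by spaces. */
    static String describe(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    /**
     * Assumption: only two default-ignorable ranges are checked here
     * (variation selectors and the U+E0000..U+E0FFF block), not the
     * complete Default_Ignorable_Code_Point property.
     */
    static boolean inAssumedIgnorableRange(int cp) {
        return (cp >= 0xFE00 && cp <= 0xFE0F) || (cp >= 0xE0000 && cp <= 0xE0FFF);
    }

    public static void main(String[] args) {
        String[] samples = {"\uDB40\uDCD9", "\uDB43\uDF86", "\uFE04"};
        for (String s : samples) {
            boolean allIgnorable =
                    s.codePoints().allMatch(IgnorableCodePointsSketch::inAssumedIgnorableRange);
            System.out.println(describe(s) + " ignorable=" + allIgnorable);
        }
    }
}
```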

 CharFilter that Unicode-normalizes input
 

 Key: LUCENE-4072
 URL: https://issues.apache.org/jira/browse/LUCENE-4072
 Project: Lucene - Core
  Issue Type: New Feature
  Components: modules/analysis
Reporter: Ippei UKAI
 Attachments: DebugCode.txt, LUCENE-4072.patch, LUCENE-4072.patch, 
 LUCENE-4072.patch, LUCENE-4072.patch, LUCENE-4072.patch, 
 ippeiukai-ICUNormalizer2CharFilter-4752cad.zip


 I'd like to contribute a CharFilter that Unicode-normalizes input with ICU4J.
 The benefit of doing this in a CharFilter is that the tokenizer can work 
 on normalised text, while offset correction ensures that the fast vector 
 highlighter and other offset-dependent features do not break.
 The implementation is available at the following repository:
 https://github.com/ippeiukai/ICUNormalizer2CharFilter
 Unfortunately this is my unpaid side project, and I cannot spend much time 
 merging my work into Lucene to produce an appropriate patch. I'd appreciate 
 it if anyone could give it a go. I'm happy to relicense it under whatever 
 terms meet your needs.






[jira] [Commented] (SOLR-4260) Inconsistent numDocs between leader and replica

2013-12-23 Thread Timothy Potter (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13856002#comment-13856002
 ] 

Timothy Potter commented on SOLR-4260:
--

Found another interesting case that may or may not be valid, depending on 
whether we think HTTP requests between a leader and replica can fail even if 
the ZooKeeper session on the replica does not drop.

Specifically, what I'm seeing is that if an update request between the leader 
and replica fails, but the replica doesn't lose its session with ZK, then the 
replica can get out of sync with the leader. In a real network partition, the 
ZK connection would also likely be lost and the replica would get marked as 
down. So as long as the HTTP connection timeout between the leader and replica 
exceeds the ZK client timeout, the replica would probably recover correctly, 
rendering this test case invalid. So maybe the main question here is whether we 
think it's possible for HTTP requests between a leader and replica to fail even 
though the ZooKeeper connection stays alive?

Here are the steps I used to reproduce this case (all using revision 1553150 in 
branch_4x):

* STEP 1: Set up a collection named “cloud” containing 1 shard and 2 replicas 
on hosts cloud84 (127.0.0.1:8984) and cloud85 (127.0.0.1:8985)*

SOLR_TOP=/home/ec2-user/branch_4x/solr
$SOLR_TOP/cloud84/cloud-scripts/zkcli.sh -zkhost $ZK_HOST -cmd upconfig \
  -confdir $SOLR_TOP/cloud84/solr/cloud/conf -confname cloud
API=http://localhost:8984/solr/admin/collections
curl -v "$API?action=CREATE&name=cloud&replicationFactor=2&numShards=1&collection.configName=cloud"

Replica on cloud84 is elected as the initial leader. /clusterstate.json looks 
like:

{"cloud":{
    "shards":{"shard1":{
        "range":"80000000-7fffffff",
        "state":"active",
        "replicas":{
          "core_node1":{
            "state":"active",
            "base_url":"http://cloud84:8984/solr",
            "core":"cloud_shard1_replica1",
            "node_name":"cloud84:8984_solr",
            "leader":"true"},
          "core_node2":{
            "state":"active",
            "base_url":"http://cloud85:8985/solr",
            "core":"cloud_shard1_replica2",
            "node_name":"cloud85:8985_solr"}}}},
    "maxShardsPerNode":"1",
    "router":{"name":"compositeId"},
    "replicationFactor":"2"}}


* STEP 2: Simulate network partition*

sudo iptables -I INPUT 1 -i lo -p tcp --sport 8985 -j DROP
sudo iptables -I INPUT 2 -i lo -p tcp --dport 8985 -j DROP

Various ways to do this, but to keep it simple, I'm just dropping inbound 
traffic on localhost to port 8985.

* STEP 3: Send document with ID “doc1” to leader on cloud84*

curl "http://localhost:8984/solr/cloud/update" -H 'Content-type:application/xml' \
  --data-binary '<add><doc><field name="id">doc1</field><field name="foo_s">bar</field></doc></add>'

The update request takes some time because the replica is down but ultimately 
succeeds on the leader. In the logs on the leader, we have (some stack trace 
lines removed for clarity):

2013-12-23 10:59:33,688 [updateExecutor-1-thread-1] ERROR 
solr.update.StreamingSolrServers  - error
org.apache.http.conn.HttpHostConnectException: Connection to 
http://cloud85:8985 refused
at 
org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:190)
...
at 
org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:232)
...
Caused by: java.net.ConnectException: Connection timed out
...
2013-12-23 10:59:33,695 [qtp1073932139-16] INFO  
update.processor.LogUpdateProcessor  - [cloud_shard1_replica1] webapp=/solr 
path=/update params={} {add=[doc1 (1455228778490888192)]} 0 63256
2013-12-23 10:59:33,702 [updateExecutor-1-thread-2] INFO  
update.processor.DistributedUpdateProcessor  - try and ask 
http://cloud85:8985/solr to recover
2013-12-23 10:59:48,718 [updateExecutor-1-thread-2] ERROR 
update.processor.DistributedUpdateProcessor  - http://cloud85:8985/solr: Could 
not tell a replica to recover:org.apache.solr.client.solrj.SolrServerException: 
IOException occured when talking to server at: http://cloud85:8985/solr
at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:507)
at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:199)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor$1.run(DistributedUpdateProcessor.java:657)
...
Caused by: org.apache.http.conn.ConnectTimeoutException: Connect to 
cloud85:8985 timed out
...

Of course these log messages are expected. The key is that the leader accepted 
the update and now has one doc with ID doc1.

* STEP 4: Heal the network partition*

sudo service iptables restart (undoes the DROP rules we added above)

* STEP 5: Send document with ID “doc2” to leader on cloud84*

curl "http://localhost:8984/solr/cloud/update" -H 

[jira] [Commented] (SOLR-4260) Inconsistent numDocs between leader and replica

2013-12-23 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13856009#comment-13856009
 ] 

Mark Miller commented on SOLR-4260:
---

Yeah, that's currently expected. We don't expect the case where you can talk to 
ZooKeeper but not your replicas to be common, so we kind of punted on this 
scenario for the first phase.

Some related JIRA issues:

SOLR-5482
SOLR-5450
SOLR-5495   

I think we should do all that, but the key is really, in this case, we need to 
pass the order to recover through ZooKeeper to the partitioned off replica. 
With an eventually consistent model, it can be off for a short time, but it 
needs to recover in a timely manner.

I think this is the right solution because the replica is sure to either get 
the information to recover from ZooKeeper or lose its connection to ZooKeeper, 
in which case it will have to recover anyway.

 Inconsistent numDocs between leader and replica
 ---

 Key: SOLR-4260
 URL: https://issues.apache.org/jira/browse/SOLR-4260
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
 Environment: 5.0.0.2013.01.04.15.31.51
Reporter: Markus Jelsma
Assignee: Mark Miller
Priority: Critical
 Fix For: 5.0, 4.7

 Attachments: 192.168.20.102-replica1.png, 
 192.168.20.104-replica2.png, clusterstate.png


 After wiping all cores and reindexing some 3.3 million docs from Nutch using 
 CloudSolrServer we see inconsistencies between the leader and replica for 
 some shards.
 Each core holds about 3.3k documents. For some reason 5 out of 10 shards have 
 a small deviation in the number of documents. The leader and slave deviate 
 by roughly 10-20 documents, not more.
 Results hopping ranks in the result set for identical queries got my 
 attention: there were small IDF differences for exactly the same record, 
 causing a record to shift positions in the result set. During those tests no 
 records were indexed. Consecutive catch-all queries also return different 
 numDocs values.
 We're running a 10 node test cluster with 10 shards and a replication factor 
 of two and frequently reindex using a fresh build from trunk. I've not seen 
 this issue for quite some time until a few days ago.






[jira] [Commented] (SOLR-4260) Inconsistent numDocs between leader and replica

2013-12-23 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13856013#comment-13856013
 ] 

Mark Miller commented on SOLR-4260:
---

bq. so we kind of punted

The other thing to note is that if you restart the shard or that node or the 
cluster, you should be able to do it without losing any data. It will recover 
from the leader when everything else is working correctly.







[jira] [Updated] (SOLR-5564) hl.maxAlternateFieldLength should apply to original field when fallback field does not exist

2013-12-23 Thread Jan Høydahl (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-5564:
--

Summary: hl.maxAlternateFieldLength should apply to original field when 
fallback field does not exist  (was: hl.maxAlternateFieldLength should apply to 
original field when fallback is attempted)

 hl.maxAlternateFieldLength should apply to original field when fallback field 
 does not exist
 

 Key: SOLR-5564
 URL: https://issues.apache.org/jira/browse/SOLR-5564
 Project: Solr
  Issue Type: Bug
  Components: highlighter
Reporter: Jan Høydahl
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5564.patch


 For a customer we use 
 {{f.body.hl.alternateField=teaser&hl.maxAlternateFieldLength=100}}
 But some articles do not have the teaser field filled at all, so for queries 
 that do not match the body, we get the full huge body returned in the 
 frontend.
 If the highlighter has tried to fall back to the alternateField, then 
 hl.maxAlternateFieldLength should always apply, even to text from the 
 original field if alternateField does not exist.
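The requested behaviour can be sketched as follows. This is an illustration only, not the Solr highlighter code or the attached patch: the class, the method, and the plain `Map` standing in for a stored document are all hypothetical, and treating a non-positive length as "no limit" is an assumption of the sketch.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of the fallback requested in this issue. */
final class AlternateFieldSketch {

    /**
     * Text to return when the query produced no highlights for `field`.
     * The length cap is applied whether the text came from the alternate
     * field or, if that field is missing or empty, from the original field.
     * Note: substring works on UTF-16 units, so a cap could split a
     * surrogate pair; a real implementation would need to guard against that.
     */
    static String fallbackText(Map<String, String> doc, String field,
                               String alternateField, int maxLength) {
        String text = doc.get(alternateField);
        if (text == null || text.isEmpty()) {
            text = doc.get(field); // alternate field absent: use the original field
        }
        if (text == null) {
            return null;
        }
        if (maxLength > 0 && text.length() > maxLength) {
            return text.substring(0, maxLength);
        }
        return text;
    }

    public static void main(String[] args) {
        Map<String, String> doc = new HashMap<>();
        doc.put("body", "a very long article body that should not be returned in full");
        // No "teaser" field on this document, so the body is used, but capped.
        System.out.println(fallbackText(doc, "body", "teaser", 20));
    }
}
```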






[jira] [Commented] (LUCENE-5369) Add an UpperCaseFilter

2013-12-23 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13856049#comment-13856049
 ] 

Ryan McKinley commented on LUCENE-5369:
---

Unless I hear objections, I would like to commit in the next few weeks

thanks
ryan

 Add an UpperCaseFilter
 --

 Key: LUCENE-5369
 URL: https://issues.apache.org/jira/browse/LUCENE-5369
 Project: Lucene - Core
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
Priority: Minor
 Attachments: LUCENE-5369-uppercase-filter.patch


 We should offer a standard way to force upper-case tokens.  I understand that 
 lowercase is safer for general search quality because some uppercase 
 characters can represent multiple lowercase ones.
 However, having upper-case tokens is often nice for faceting (consider 
 normalizing to standard acronyms)
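Both points can be seen with plain JDK case mapping. The sketch below is not the attached filter; the class and method names are hypothetical, and it uses `Locale.ROOT` to avoid locale-specific mappings. Greek is the usual example of the ambiguity: capital sigma is the uppercase form of both the ordinary and the word-final lowercase sigma.

```java
import java.util.Locale;

/** Hypothetical demo of upper-casing tokens, using only JDK case mapping. */
final class UpperCaseSketch {

    /** Locale-independent upper-casing of a single token. */
    static String upper(String token) {
        return token.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // Faceting: spelling variants collapse onto one acronym.
        System.out.println(upper("nasa") + " " + upper("Nasa") + " " + upper("NASA"));

        // Ambiguity: U+03C3 (sigma) and U+03C2 (final sigma) share one uppercase form.
        String sigma = "\u03c3";
        String finalSigma = "\u03c2";
        System.out.println(upper(sigma).equals(upper(finalSigma)));
    }
}
```

Once two distinct lowercase forms share one uppercase token, the original form cannot be recovered from the index, which is the search-quality risk mentioned above.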






[jira] [Commented] (SOLR-4260) Inconsistent numDocs between leader and replica

2013-12-23 Thread Timothy Potter (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13856052#comment-13856052
 ] 

Timothy Potter commented on SOLR-4260:
--

Thanks Mark, I suspected my test case was a little cherry-picked ... something 
interesting happened when I also severed the connection between the replica and 
ZK (i.e., the same test as above, but I also dropped the ZK connection on the 
replica).

2013-12-23 15:39:57,170 [main-EventThread] INFO  common.cloud.ConnectionManager 
 - Watcher org.apache.solr.common.cloud.ConnectionManager@4f857c62 
name:ZooKeeperConnection Watcher:ec2-54-197-0-103.compute-1.amazonaws.com:2181 
got event WatchedEvent state:Disconnected type:None path:null path:null 
type:None
2013-12-23 15:39:57,170 [main-EventThread] INFO  common.cloud.ConnectionManager 
 - zkClient has disconnected

[fixed the connection between replica and ZK here]

2013-12-23 15:40:45,579 [main-EventThread] INFO  common.cloud.ConnectionManager 
 - Watcher org.apache.solr.common.cloud.ConnectionManager@4f857c62 
name:ZooKeeperConnection Watcher:ec2-54-197-0-103.compute-1.amazonaws.com:2181 
got event WatchedEvent state:Expired type:None path:null path:null type:None
2013-12-23 15:40:45,579 [main-EventThread] INFO  common.cloud.ConnectionManager 
 - Our previous ZooKeeper session was expired. Attempting to reconnect to 
recover relationship with ZooKeeper...
2013-12-23 15:40:45,580 [main-EventThread] INFO  
common.cloud.DefaultConnectionStrategy  - Connection expired - starting a new 
one...
2013-12-23 15:40:45,586 [main-EventThread] INFO  common.cloud.ConnectionManager 
 - Waiting for client to connect to ZooKeeper
2013-12-23 15:40:45,595 [main-EventThread] INFO  common.cloud.ConnectionManager 
 - Watcher org.apache.solr.common.cloud.ConnectionManager@4f857c62 
name:ZooKeeperConnection Watcher:ec2-54-197-0-103.compute-1.amazonaws.com:2181 
got event WatchedEvent state:SyncConnected type:None path:null path:null 
type:None
2013-12-23 15:40:45,595 [main-EventThread] INFO  common.cloud.ConnectionManager 
 - Client is connected to ZooKeeper
2013-12-23 15:40:45,595 [main-EventThread] INFO  common.cloud.ConnectionManager 
 - Connection with ZooKeeper reestablished.
2013-12-23 15:40:45,596 [main-EventThread] WARN  solr.cloud.RecoveryStrategy  - 
Stopping recovery for zkNodeName=core_node3core=cloud_shard1_replica3
2013-12-23 15:40:45,597 [main-EventThread] INFO  solr.cloud.ZkController  - 
publishing core=cloud_shard1_replica3 state=down
2013-12-23 15:40:45,597 [main-EventThread] INFO  solr.cloud.ZkController  - 
numShards not found on descriptor - reading it from system property
2013-12-23 15:40:45,905 [qtp2124890785-14] INFO  handler.admin.CoreAdminHandler 
 - It has been requested that we recover
2013-12-23 15:40:45,906 [qtp2124890785-14] INFO  
solr.servlet.SolrDispatchFilter  - [admin] webapp=null path=/admin/cores 
params={action=REQUESTRECOVERY&core=cloud_shard1_replica3&wt=javabin&version=2} 
status=0 QTime=2 
2013-12-23 15:40:45,909 [Thread-17] INFO  solr.cloud.ZkController  - publishing 
core=cloud_shard1_replica3 state=recovering
2013-12-23 15:40:45,909 [Thread-17] INFO  solr.cloud.ZkController  - numShards 
not found on descriptor - reading it from system property
2013-12-23 15:40:45,920 [Thread-17] INFO  solr.update.DefaultSolrCoreState  - 
Running recovery - first canceling any ongoing recovery
2013-12-23 15:40:45,921 [RecoveryThread] INFO  solr.cloud.RecoveryStrategy  - 
Starting recovery process.  core=cloud_shard1_replica3 
recoveringAfterStartup=false
2013-12-23 15:40:45,924 [RecoveryThread] INFO  solr.cloud.ZkController  - 
publishing core=cloud_shard1_replica3 state=recovering
2013-12-23 15:40:45,924 [RecoveryThread] INFO  solr.cloud.ZkController  - 
numShards not found on descriptor - reading it from system property
2013-12-23 15:40:48,613 [qtp2124890785-15] INFO  solr.core.SolrCore  - 
[cloud_shard1_replica3] webapp=/solr path=/select 
params={q=foo_s:bar&distrib=false&wt=json&rows=0} hits=0 status=0 QTime=1 
2013-12-23 15:42:42,770 [qtp2124890785-13] INFO  solr.core.SolrCore  - 
[cloud_shard1_replica3] webapp=/solr path=/select 
params={q=foo_s:bar&distrib=false&wt=json&rows=0} hits=0 status=0 QTime=1 
2013-12-23 15:42:45,650 [main-EventThread] ERROR solr.cloud.ZkController  - 
There was a problem making a request to the 
leader:org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: I 
was asked to wait on state down for cloud86:8986_solr but I still do not see 
the requested state. I see state: recovering live:false
at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:495)
at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:199)
at 
org.apache.solr.cloud.ZkController.waitForLeaderToSeeDownState(ZkController.java:1434)
at 
org.apache.solr.cloud.ZkController.registerAllCoresAsDown(ZkController.java:347)
 

[jira] [Commented] (SOLR-5552) Leader recovery process can select the wrong leader if all replicas for a shard are down and trying to recover as well as lose updates that should have been recovered.

2013-12-23 Thread Timothy Potter (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13856062#comment-13856062
 ] 

Timothy Potter commented on SOLR-5552:
--

Glad it was helpful even though my patch was crap ;-) I'll test against trunk 
in my env as well. Thanks.

 Leader recovery process can select the wrong leader if all replicas for a 
 shard are down and trying to recover as well as lose updates that should have 
 been recovered.
 ---

 Key: SOLR-5552
 URL: https://issues.apache.org/jira/browse/SOLR-5552
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Timothy Potter
Assignee: Mark Miller
Priority: Critical
  Labels: leader, recovery
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5552.patch, SOLR-5552.patch


 One particular issue that leads to out-of-sync shards, related to SOLR-4260
 Here's what I know so far, which admittedly isn't much:
 As cloud85 (replica before it crashed) is initializing, it enters the wait 
 process in ShardLeaderElectionContext#waitForReplicasToComeUp; this is 
 expected and a good thing.
 Some short amount of time in the future, cloud84 (leader before it crashed) 
 begins initializing and gets to a point where it adds itself as a possible 
 leader for the shard (by creating a znode under 
 /collections/cloud/leaders_elect/shard1/election), which leads to cloud85 
 being able to return from waitForReplicasToComeUp and try to determine who 
 should be the leader.
 cloud85 then tries to run the SyncStrategy, which can never work because in 
 this scenario the Jetty HTTP listener is not active yet on either node, so 
 all replication work that uses HTTP requests fails on both nodes ... PeerSync 
 treats these failures as indicators that the other replicas in the shard are 
 unavailable (or whatever) and assumes success. Here's the log message:
 2013-12-11 11:43:25,936 [coreLoadExecutor-3-thread-1] WARN 
 solr.update.PeerSync - PeerSync: core=cloud_shard1_replica1 
 url=http://cloud85:8985/solr couldn't connect to 
 http://cloud84:8984/solr/cloud_shard1_replica2/, counting as success
 The Jetty HTTP listener doesn't start accepting connections until long after 
 this process has completed and already selected the wrong leader.
 From what I can see, we seem to have a leader recovery process that is based 
 partly on HTTP requests to the other nodes, but the HTTP listener on those 
 nodes isn't active yet. We need a leader recovery process that doesn't rely 
 on HTTP requests. Perhaps, leader recovery for a shard w/o a current leader 
 may need to work differently than leader election in a shard that has 
 replicas that can respond to HTTP requests? All of what I'm seeing makes 
 perfect sense for leader election when there are active replicas and the 
 current leader fails.
 All this aside, I'm not asserting that this is the only cause for the 
 out-of-sync issues reported in this ticket, but it definitely seems like it 
 could happen in a real cluster.
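The effect of "couldn't connect ... counting as success" can be illustrated with a toy model. This is not Solr's PeerSync code and not a proposed fix: the enum, class, and both policies are hypothetical, and they only show why a node whose peers are all unreachable can conclude that it is in sync.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Toy model of how a sync round could be judged; not Solr's PeerSync. */
final class PeerSyncSketch {

    /** Hypothetical result of asking one peer for its recent updates. */
    enum Outcome { IN_SYNC, BEHIND, CONNECT_FAILED }

    /** Policy matching the log line: an unreachable peer counts as success. */
    static boolean lenientSuccess(List<Outcome> peers) {
        for (Outcome o : peers) {
            if (o == Outcome.BEHIND) {
                return false;
            }
        }
        return true;
    }

    /** Stricter hypothetical policy: at least one peer must actually answer. */
    static boolean strictSuccess(List<Outcome> peers) {
        boolean anyAnswered = false;
        for (Outcome o : peers) {
            if (o == Outcome.BEHIND) {
                return false;
            }
            if (o == Outcome.IN_SYNC) {
                anyAnswered = true;
            }
        }
        return anyAnswered;
    }

    public static void main(String[] args) {
        // Startup scenario from the description: no HTTP listener is up yet.
        List<Outcome> startup = Collections.singletonList(Outcome.CONNECT_FAILED);
        System.out.println("lenient: " + lenientSuccess(startup));
        System.out.println("strict:  " + strictSuccess(startup));

        List<Outcome> healthy = Arrays.asList(Outcome.IN_SYNC, Outcome.CONNECT_FAILED);
        System.out.println("strict with one live peer: " + strictSuccess(healthy));
    }
}
```

The strict variant has its own cost: a replica that really is the only survivor could never become leader under it, which is presumably why the lenient behaviour makes sense when the current leader fails but other replicas are active.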






[jira] [Commented] (SOLR-4260) Inconsistent numDocs between leader and replica

2013-12-23 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13856078#comment-13856078
 ] 

Mark Miller commented on SOLR-4260:
---

That's interesting. The logging makes it look like it's not creating its new 
ephemeral live node for some reason...or the leader is not getting an updated 
view of the live node...







[jira] [Updated] (LUCENE-5372) IntArray toString has O(n^2) performance

2013-12-23 Thread Joshua Hartman (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua Hartman updated LUCENE-5372:
---

Attachment: 5372-lucene5339.patch
5372-v2.patch

Uploaded versions of the patch for both the trunk and lucene5339 branches that 
replace StringBuffer with StringBuilder. Due to API constraints it is not 
possible to do so in all cases.

Mike was also correct. Code related to the specific O(n^2) issue this JIRA was 
started for no longer exists on lucene5339.

 IntArray toString has O(n^2) performance
 

 Key: LUCENE-5372
 URL: https://issues.apache.org/jira/browse/LUCENE-5372
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Reporter: Joshua Hartman
Assignee: Dawid Weiss
Priority: Minor
 Fix For: 5.0, 4.7

 Attachments: 5372-lucene5339.patch, 5372-v2.patch, 5372.patch, 
 LUCENE-5372-forbidden.patch


 This is pretty minor, but I found a few issues with the toString 
 implementations while looking through the facet data structures.
 The most egregious is the use of string concatenation in the IntArray class. 
 I have fixed that using StringBuilders. I also noticed that other classes 
 were using StringBuffer instead of StringBuilder. According to the javadoc:
 "This class is designed for use as a drop-in replacement for StringBuffer in 
 places where the string buffer was being used by a single thread (as is 
 generally the case). Where possible, it is recommended that this class be 
 used in preference to StringBuffer as it will be faster under most 
 implementations."
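To make the cost difference concrete, here is a minimal sketch. `IntArraySketch` is a hypothetical stand-in, not Lucene's IntArray, and the output format is invented for illustration. Repeated `+=` on a String copies everything built so far on each iteration, so n elements cost O(n^2) character copies, while StringBuilder appends into one growing buffer.

```java
/** Hypothetical stand-in for an int container; not Lucene's IntArray. */
final class IntArraySketch {
    private final int[] data;

    IntArraySketch(int... values) {
        this.data = values.clone();
    }

    /** Quadratic: every += allocates a new String and copies the old contents. */
    String toStringConcat() {
        String s = "(" + data.length + ") ";
        for (int i = 0; i < data.length; i++) {
            s += data[i] + ", ";
        }
        return s;
    }

    /** Linear (amortized): appends go into a single resizable buffer. */
    String toStringBuilder() {
        StringBuilder sb = new StringBuilder(data.length * 4 + 8);
        sb.append('(').append(data.length).append(") ");
        for (int i = 0; i < data.length; i++) {
            sb.append(data[i]).append(", ");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        IntArraySketch a = new IntArraySketch(1, 2, 3);
        // Both variants produce the same text; only the cost differs.
        System.out.println(a.toStringBuilder());
        System.out.println(a.toStringConcat().equals(a.toStringBuilder()));
    }
}
```

StringBuilder is unsynchronized, so it is only a drop-in replacement for StringBuffer where the buffer is confined to one thread, as the javadoc quoted above notes.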






[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #1059: POMs out of sync

2013-12-23 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/1059/

1 tests failed.
FAILED:  org.apache.solr.cloud.BasicDistributedZk2Test.testDistribSearch

Error Message:
No live SolrServers available to handle this 
request:[http://127.0.0.1:15369/ky_kg/collection1, 
http://127.0.0.1:12475/ky_kg/collection1]

Stack Trace:
org.apache.solr.client.solrj.SolrServerException: No live SolrServers available 
to handle this request:[http://127.0.0.1:15369/ky_kg/collection1, 
http://127.0.0.1:12475/ky_kg/collection1]
at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:495)
at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:199)
at 
org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(LBHttpSolrServer.java:283)
at 
org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:640)
at 
org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:90)
at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:301)
at 
org.apache.solr.cloud.AbstractFullDistribZkTestBase.queryServer(AbstractFullDistribZkTestBase.java:1325)
at 
org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:542)
at 
org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:521)
at 
org.apache.solr.cloud.BasicDistributedZk2Test.brindDownShardIndexSomeDocsAndRecover(BasicDistributedZk2Test.java:305)
at 
org.apache.solr.cloud.BasicDistributedZk2Test.doTest(BasicDistributedZk2Test.java:117)




Build Log:
[...truncated 52608 lines...]
  [mvn] [INFO] -
  [mvn] [INFO] -
  [mvn] [ERROR] COMPILATION ERROR : 
  [mvn] [INFO] -

[...truncated 279 lines...]
BUILD FAILED
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/build.xml:476: 
The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/build.xml:176: 
The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/extra-targets.xml:77:
 Java returned: 1

Total time: 111 minutes 23 seconds
Build step 'Invoke Ant' marked build as failure
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure




[jira] [Created] (LUCENE-5377) Lucene mixed index segments cause segment info file(.si) unversioned

2013-12-23 Thread Littlestar (JIRA)
Littlestar created LUCENE-5377:
--

 Summary: Lucene mixed index segments cause segment info file(.si) 
unversioned
 Key: LUCENE-5377
 URL: https://issues.apache.org/jira/browse/LUCENE-5377
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Affects Versions: 4.6
 Environment: windows/linux
Reporter: Littlestar


My old facet index was created by Lucene 4.2, and indexChecker reports it as OK.

I have now upgraded to Lucene 4.6 and added some new records to the index.
When I reopen the index, some files in the index directory are missing:
there are no .si files.

I debugged into it: the new-version format of segments.gen (segments_N) 
records bad segment info.







[jira] [Commented] (LUCENE-5377) Lucene mixed index segments cause segment info file(.si) unversioned

2013-12-23 Thread Littlestar (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13856110#comment-13856110
 ] 

Littlestar commented on LUCENE-5377:


Lucene 4.5/4.5.1 is ok.





[jira] [Comment Edited] (LUCENE-5377) Lucene mixed index segments cause segment info file(.si) unversioned

2013-12-23 Thread Littlestar (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13856110#comment-13856110
 ] 

Littlestar edited comment on LUCENE-5377 at 12/24/13 4:07 AM:
--

Lucene 4.5/4.5.1 is OK, but it fails in 4.6.0.


was (Author: cnstar9988):
Lucene 4.5/4.5.1 is ok.





[jira] [Updated] (LUCENE-5377) Lucene mixed version segments cause segment info file(.si) wrong

2013-12-23 Thread Littlestar (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Littlestar updated LUCENE-5377:
---

Summary: Lucene mixed version segments cause segment info file(.si) wrong  
(was: Lucene mixed index segments cause segment info file(.si) unversioned)




[jira] [Updated] (LUCENE-4072) CharFilter that Unicode-normalizes input

2013-12-23 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-4072:


Attachment: LUCENE-4072.patch

Thanks so much for attacking this David: I think that 0-length all default 
ignorables case makes sense (where it creates an empty string), because in 
that case there won't be a single token at all (MockTokenizer is not a perfect 
emulator of KeywordTokenizer here).

I think this patch is close, but when running the test a few hundred times I 
hit a failure (see my added testCuriousString, which fails). I think this one 
is a bug in the logic.

Motivated by this failure, I tried to beef up the tests in general:
* fixed my typo where testNFD wasn't actually testing NFD
* test strings > 20 characters, since this filter has an internal 128-char 
buffer.

The latter seems to expose a lot of bugs, I assume due to the internal 
buffering. I haven't yet looked into this. But it seems there are correctness 
issues for documents > 128 chars (as well as what I believe is a separate bug 
seen by testCuriousString, which I think is some bug in the logic related to 
ignorables).


 CharFilter that Unicode-normalizes input
 

 Key: LUCENE-4072
 URL: https://issues.apache.org/jira/browse/LUCENE-4072
 Project: Lucene - Core
  Issue Type: New Feature
  Components: modules/analysis
Reporter: Ippei UKAI
 Attachments: DebugCode.txt, LUCENE-4072.patch, LUCENE-4072.patch, 
 LUCENE-4072.patch, LUCENE-4072.patch, LUCENE-4072.patch, LUCENE-4072.patch, 
 ippeiukai-ICUNormalizer2CharFilter-4752cad.zip


 I'd like to contribute a CharFilter that Unicode-normalizes input with ICU4J.
 The benefit of having this process as CharFilter is that tokenizer can work 
 on normalised text while offset-correction ensuring fast vector highlighter 
 and other offset-dependent features do not break.
 The implementation is available at following repository:
 https://github.com/ippeiukai/ICUNormalizer2CharFilter
 Unfortunately this is my unpaid side-project and cannot spend much time to 
 merge my work to Lucene to make appropriate patch. I'd appreciate it if 
 anyone could give it a go. I'm happy to relicense it to whatever that meets 
 your needs.
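
As a small illustration of why offset correction matters here, the sketch below shows normalization changing the length of the text. It uses the JDK's java.text.Normalizer rather than ICU4J (a substitution made only to keep the example dependency-free), and the class and method names are hypothetical.

```java
import java.text.Normalizer;

/** Illustration only: normalization can move every later character offset. */
final class NormalizationOffsets {
  /** NFKC-normalizes the input using the JDK's java.text.Normalizer. */
  static String nfkc(String s) {
    return Normalizer.normalize(s, Normalizer.Form.NFKC);
  }

  /** Offset of the first occurrence of term in the NFKC-normalized text, or -1. */
  static int offsetAfterNormalization(String text, String term) {
    return nfkc(text).indexOf(term);
  }
}
```

For the input "\uFB01sh \uFB01n" (each U+FB01 is the single-char "fi" ligature), NFKC yields "fish fin". The term "fin" starts at offset 5 in the normalized text but at offset 4 in the original, so a highlighter working from uncorrected offsets would mark the wrong characters.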






[jira] [Commented] (LUCENE-4072) CharFilter that Unicode-normalizes input

2013-12-23 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13856140#comment-13856140
 ] 

Robert Muir commented on LUCENE-4072:
-

ok as for the testCuriousString bug, I enabled verbose (ant test 
-Dtestcase=TestICUNormalizer2CharFilter -Dtestmethod=testCuriousString 
-Dtests.verbose=true) and it seems to always fail when given a spoon-fed 
Reader. So I'll dig into this one; I think it involves how this charfilter 
consumes the reader api.





[jira] [Commented] (LUCENE-4072) CharFilter that Unicode-normalizes input

2013-12-23 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13856149#comment-13856149
 ] 

Robert Muir commented on LUCENE-4072:
-

One thing that certainly looks like a bug is this:

The input-processing side looks like this in pseudocode:
{code}
while (read() some char[]s) {
   normalize(char[]s) // (quick check/hasBoundary/etc)
}
{code}

But read() works at char level, and these normalization apis want ints. 
So I think readInputToBuffer() needs to keep reading, if possible, to ensure it 
fully consumes whole codepoints before returning. I added a little hack 
locally, but it didn't seem to clean up the test failures, so I think there are 
other bugs too, or I'm missing something?

{code}
  private int readInputToBuffer() throws IOException {
    final int len = input.read(tmpBuffer);
    if (len == -1) {
      inputFinished = true;
      return 0;
    }
    inputBuffer.append(tmpBuffer, 0, len);
    // nocommit: just a hack
    // if buffer ends on high surrogate, keep reading before processing
    if (len > 0 && Character.isHighSurrogate(tmpBuffer[len-1])) {
      return len + readInputToBuffer();
    }
    // end hack
    return len;
  }
{code}
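
The same idea can be tried outside the filter with a standalone sketch. Everything here is hypothetical and written for illustration: the class name, the one-char-at-a-time "spoon-feeding" reader, and the helper methods are not part of the patch. It only shows that reading on while the buffer ends with a high surrogate keeps a supplementary code point in one chunk.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class SurrogateAwareRead {
  /** Wraps a Reader so that every read() call hands out at most one char. */
  static Reader spoonFeed(final Reader in) {
    return new Reader() {
      @Override
      public int read(char[] buf, int off, int len) throws IOException {
        return len == 0 ? 0 : in.read(buf, off, 1);
      }

      @Override
      public void close() throws IOException {
        in.close();
      }
    };
  }

  /**
   * Appends one chunk to out, reading further while the chunk ends on a high
   * surrogate. Returns the number of chars appended, or -1 at end of input.
   */
  static int readWholeCodePoints(Reader in, char[] tmp, StringBuilder out)
      throws IOException {
    int total = 0;
    while (true) {
      int len = in.read(tmp, 0, tmp.length);
      if (len == -1) {
        return total == 0 ? -1 : total;
      }
      out.append(tmp, 0, len);
      total += len;
      if (len == 0 || !Character.isHighSurrogate(tmp[len - 1])) {
        return total;
      }
    }
  }

  /** Splits text into the chunks a spoon-fed reader would deliver. */
  static List<String> chunks(String text) {
    List<String> result = new ArrayList<>();
    char[] tmp = new char[128];
    StringBuilder out = new StringBuilder();
    try (Reader in = spoonFeed(new StringReader(text))) {
      while (readWholeCodePoints(in, tmp, out) != -1) {
        result.add(out.toString());
        out.setLength(0);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return result;
  }
}
```

For the input "a\uD835\uDD0Ab", chunks() yields three pieces: "a", the intact surrogate pair, and "b". Without the surrogate check, the pair would arrive as two separate one-char chunks. A loop is used instead of recursion so that a long run of unlucky reads cannot grow the call stack.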





[jira] [Commented] (SOLR-5552) Leader recovery process can select the wrong leader if all replicas for a shard are down and trying to recover as well as lose updates that should have been recovered.

2013-12-23 Thread Timothy Potter (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13856159#comment-13856159
 ] 

Timothy Potter commented on SOLR-5552:
--

Ran my manual test process on trunk and could not reproduce the out-of-sync 
issue! From the logs, the recovery process definitely starts after the HTTP 
listener is up. Looking good on trunk.

 Leader recovery process can select the wrong leader if all replicas for a 
 shard are down and trying to recover as well as lose updates that should have 
 been recovered.
 ---

 Key: SOLR-5552
 URL: https://issues.apache.org/jira/browse/SOLR-5552
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Timothy Potter
Assignee: Mark Miller
Priority: Critical
  Labels: leader, recovery
 Fix For: 5.0, 4.7, 4.6.1

 Attachments: SOLR-5552.patch, SOLR-5552.patch


 One particular issue that leads to out-of-sync shards, related to SOLR-4260
 Here's what I know so far, which admittedly isn't much:
 As cloud85 (replica before it crashed) is initializing, it enters the wait 
 process in ShardLeaderElectionContext#waitForReplicasToComeUp; this is 
 expected and a good thing.
 Some short amount of time in the future, cloud84 (leader before it crashed) 
 begins initializing and gets to a point where it adds itself as a possible 
 leader for the shard (by creating a znode under 
 /collections/cloud/leaders_elect/shard1/election), which leads to cloud85 
 being able to return from waitForReplicasToComeUp and try to determine who 
 should be the leader.
 cloud85 then tries to run the SyncStrategy, which can never work because in 
 this scenario the Jetty HTTP listener is not active yet on either node, so 
 all replication work that uses HTTP requests fails on both nodes ... PeerSync 
 treats these failures as indicators that the other replicas in the shard are 
 unavailable (or whatever) and assumes success. Here's the log message:
 2013-12-11 11:43:25,936 [coreLoadExecutor-3-thread-1] WARN 
 solr.update.PeerSync - PeerSync: core=cloud_shard1_replica1 
 url=http://cloud85:8985/solr couldn't connect to 
 http://cloud84:8984/solr/cloud_shard1_replica2/, counting as success
 The Jetty HTTP listener doesn't start accepting connections until long after 
 this process has completed and already selected the wrong leader.
 From what I can see, we seem to have a leader recovery process that is based 
 partly on HTTP requests to the other nodes, but the HTTP listener on those 
 nodes isn't active yet. We need a leader recovery process that doesn't rely 
 on HTTP requests. Perhaps, leader recovery for a shard w/o a current leader 
 may need to work differently than leader election in a shard that has 
 replicas that can respond to HTTP requests? All of what I'm seeing makes 
 perfect sense for leader election when there are active replicas and the 
 current leader fails.
 All this aside, I'm not asserting that this is the only cause for the 
 out-of-sync issues reported in this ticket, but it definitely seems like it 
 could happen in a real cluster.






[jira] [Commented] (LUCENE-5367) NoSuchElementException occurs when org.apache.lucene.facet.index.FacetFields is used.

2013-12-23 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13856187#comment-13856187
 ] 

Shai Erera commented on LUCENE-5367:


Lucien, do you have a testcase which reproduces the error? If not, I'll close 
the issue.

 NoSuchElementException occurs when org.apache.lucene.facet.index.FacetFields 
 is used.
 -

 Key: LUCENE-5367
 URL: https://issues.apache.org/jira/browse/LUCENE-5367
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/facet
Affects Versions: 4.2.1, 4.6
Reporter: Lucien Pereira

 Hi,
 When I use the API as below :
 {code}
 List<CategoryPath> categories = Collections.<CategoryPath>singletonList(new 
 CategoryPath(path.toArray(new String[path.size()])));
 FacetFields facetFields = new FacetFields(taxonomyWriter);
 facetFields.addFields(document, categories);
 taxonomyWriter.commit();
 {code}
 An exception occurs :
 {quote}
 java.util.NoSuchElementException
   at java.util.Collections$1.next(Collections.java:3302)
   at 
 org.apache.lucene.facet.index.DrillDownStream.reset(DrillDownStream.java:78)
   at 
 org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:97)
   at 
 org.apache.lucene.index.DocFieldProcessor.processDocument(DocFieldProcessor.java:248)
   at 
 org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:253)
   at 
 org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:453)
   at 
 org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1520)
   at 
 org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1190)
   at 
 org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1171)
 {quote}
 Seems like this is due to multiple calls to 
 org.apache.lucene.facet.index.DrillDownStream#reset, which invokes #next() on 
 an already-used iterator.
 Regards,
 Lucien 
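
The failure mode the reporter suspects can be reproduced in isolation. The sketch below is not DrillDownStream; both stream classes are hypothetical and only model "reset() calls next() on a cached iterator" versus "reset() asks for a fresh iterator". Whether this is the actual cause in LUCENE-5367 is unconfirmed.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

final class ResetSketch {
  /** Hypothetical stream that caches one iterator and advances it on every reset(). */
  static final class StaleIteratorStream {
    private final Iterator<String> it;
    String current;

    StaleIteratorStream(Iterable<String> paths) {
      this.it = paths.iterator();
    }

    void reset() {
      current = it.next(); // second call on a one-element source throws
    }
  }

  /** Variant that obtains a fresh iterator each time reset() is called. */
  static final class FreshIteratorStream {
    private final Iterable<String> paths;
    String current;

    FreshIteratorStream(Iterable<String> paths) {
      this.paths = paths;
    }

    void reset() {
      current = paths.iterator().next();
    }
  }

  /** Returns true if a second reset() on the stale variant throws. */
  static boolean secondResetFails(List<String> paths) {
    StaleIteratorStream stream = new StaleIteratorStream(paths);
    stream.reset();
    try {
      stream.reset();
      return false;
    } catch (NoSuchElementException e) {
      return true;
    }
  }
}
```

With Collections.singletonList as the source, as in the report, secondResetFails returns true, while FreshIteratorStream can be reset any number of times. This matches the top frame of the stack trace (the singleton iterator's next()), but a test case against the real class would still be needed to confirm it.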






[jira] [Commented] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module

2013-12-23 Thread rashi gandhi (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13856194#comment-13856194
 ] 

rashi gandhi commented on LUCENE-2899:
--

Hi,

I have successfully applied LUCENE-2899.patch to Solr 4.5.1 and it's working 
properly.
Now my requirement is to combine OpenNLP with jwnl.
Is it possible to combine OpenNLP with jwnl, and what changes are required 
in the Solr schema.xml for this?
Kindly provide some pointers to move ahead.

Thanks in Advance

 Add OpenNLP Analysis capabilities as a module
 -

 Key: LUCENE-2899
 URL: https://issues.apache.org/jira/browse/LUCENE-2899
 Project: Lucene - Core
  Issue Type: New Feature
  Components: modules/analysis
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Priority: Minor
 Fix For: 4.7

 Attachments: LUCENE-2899-RJN.patch, LUCENE-2899.patch, 
 OpenNLPFilter.java, OpenNLPTokenizer.java


 Now that OpenNLP is an ASF project and has a nice license, it would be nice 
 to have a submodule (under analysis) that exposed capabilities for it. Drew 
 Farris, Tom Morton and I have code that does:
 * Sentence Detection as a Tokenizer (could also be a TokenFilter, although it 
 would have to change slightly to buffer tokens)
 * NamedEntity recognition as a TokenFilter
 We are also planning a Tokenizer/TokenFilter that can put parts of speech as 
 either payloads (PartOfSpeechAttribute?) on a token or at the same position.
 I'd propose it go under:
 modules/analysis/opennlp


