[jira] [Commented] (SOLR-4816) Add document routing to CloudSolrServer
[ https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761648#comment-13761648 ]

Mark Miller commented on SOLR-4816:
-----------------------------------

Also, FYI, there are a few remaining issues to smooth out, so a handful of non-SolrCloud tests in the solrj package are failing. I'll have a second pass up that resolves these remaining issues before long.

> Add document routing to CloudSolrServer
> ---------------------------------------
>
>                 Key: SOLR-4816
>                 URL: https://issues.apache.org/jira/browse/SOLR-4816
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrCloud
>    Affects Versions: 4.3
>            Reporter: Joel Bernstein
>            Assignee: Mark Miller
>            Priority: Minor
>             Fix For: 4.5, 5.0
>
>         Attachments: SOLR-4816.patch (27 revisions), SOLR-4816-sriesenberg.patch
>
> This issue adds the following enhancements to CloudSolrServer's update logic:
>
> 1) Document routing: Updates are routed directly to the correct shard leader,
>    eliminating document routing at the server.
> 2) Optional parallel update execution: Updates for each shard are executed in
>    a separate thread so parallel indexing can occur across the cluster.
>
> These enhancements should allow for near-linear scalability of indexing
> throughput.
>
> Usage:
>
>     CloudSolrServer cloudClient = new CloudSolrServer(zkAddress);
>     cloudClient.setParallelUpdates(true);
>
>     SolrInputDocument doc1 = new SolrInputDocument();
>     doc1.addField("id", "0");
>     doc1.addField("a_t", "hello1");
>     SolrInputDocument doc2 = new SolrInputDocument();
>     doc2.addField("id", "2");
>     doc2.addField("a_t", "hello2");
>
>     UpdateRequest request = new UpdateRequest();
>     request.add(doc1);
>     request.add(doc2);
>     request.setAction(AbstractUpdateRequest.ACTION.OPTIMIZE, false, false);
>
>     // Returns a backwards-compatible condensed response.
>     NamedList response = cloudClient.request(request);
>
>     // To get a more detailed response, downcast to RouteResponse:
>     CloudSolrServer.RouteResponse rr = (CloudSolrServer.RouteResponse) response;

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-4816) Add document routing to CloudSolrServer
[ https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller updated SOLR-4816:
------------------------------
    Attachment: SOLR-4816.patch

Here is my first pass on top of Joel's work. Comments to come.
[jira] [Commented] (SOLR-5006) CREATESHARD command for 'implicit' shards
[ https://issues.apache.org/jira/browse/SOLR-5006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761625#comment-13761625 ]

Noble Paul commented on SOLR-5006:
----------------------------------

Let's open a separate issue for the ref guide.

> CREATESHARD command for 'implicit' shards
> -----------------------------------------
>
>                 Key: SOLR-5006
>                 URL: https://issues.apache.org/jira/browse/SOLR-5006
>             Project: Solr
>          Issue Type: Sub-task
>          Components: SolrCloud
>            Reporter: Noble Paul
>            Assignee: Noble Paul
>             Fix For: 4.5, 5.0
>
> Custom sharding requires CREATESHARD/DELETESHARD commands.
> They may not be applicable to hash-based sharding.
[jira] [Commented] (SOLR-5006) CREATESHARD command for 'implicit' shards
[ https://issues.apache.org/jira/browse/SOLR-5006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761623#comment-13761623 ]

Noble Paul commented on SOLR-5006:
----------------------------------

Yes, it's an omission. Thanks for pointing it out.
[jira] [Commented] (SOLR-5006) CREATESHARD command for 'implicit' shards
[ https://issues.apache.org/jira/browse/SOLR-5006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761620#comment-13761620 ]

Jack Krupansky commented on SOLR-5006:
--------------------------------------

The OverseerCollectionProcessor#createShard method supports the createNodeSet parameter, but the CollectionsHandler#handleCreateShard method does not copy that parameter from the request. Is this an oversight in a feature intended for 4.5, dead code, or groundwork for a future enhancement?

Also, action=CREATESHARD and action=DELETESHARD need to be added to the Solr Reference Guide.
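For readers following along, the Collections API calls under discussion can be sketched roughly as follows. The host, collection, and shard names are hypothetical, and per this issue the commands apply to collections created with the 'implicit' router:

```shell
# Sketch only: host, collection, and shard names are hypothetical.
# CREATESHARD/DELETESHARD apply to collections using router.name=implicit.
BASE="http://localhost:8983/solr/admin/collections"
CREATE_URL="${BASE}?action=CREATESHARD&collection=mycollection&shard=shard_a"
DELETE_URL="${BASE}?action=DELETESHARD&collection=mycollection&shard=shard_a"

# Print the request URLs; against a live cluster you would run: curl "$CREATE_URL"
echo "$CREATE_URL"
echo "$DELETE_URL"
```

Whether CREATESHARD will also honor a createNodeSet parameter is exactly the open question in the comment above.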
[jira] [Comment Edited] (SOLR-4465) Configurable Collectors
[ https://issues.apache.org/jira/browse/SOLR-4465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761604#comment-13761604 ]

Kranti Parisa edited comment on SOLR-4465 at 9/9/13 4:30 AM:
-------------------------------------------------------------

Do any of those tickets support configurable collectors and choosing them dynamically through request params? Is SOLR-5045 the one to use? If so, how does it work if I don't want to aggregate by any field, but want to do custom collecting/mixing?

was (Author: krantiparisa):
Do any of those tickets support configurable collectors and choosing them dynamically through request params? Is SOLR-5045 the one to use?

> Configurable Collectors
> -----------------------
>
>                 Key: SOLR-4465
>                 URL: https://issues.apache.org/jira/browse/SOLR-4465
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 4.1
>            Reporter: Joel Bernstein
>             Fix For: 4.5, 5.0
>
>         Attachments: SOLR-4465.patch (17 revisions)
>
> This ticket provides a patch to add pluggable collectors to Solr. The patch
> was generated and tested with Solr 4.1.
>
> This is how the patch functions:
>
> Collectors are plugged into Solr in solrconfig.xml using the new
> collectorFactory element. For example, two collector factories can be
> defined: the "default" collectorFactory, whose class attribute points to
> org.apache.solr.handler.component.CollectorFactory, which implements logic
> that returns the default TopScoreDocCollector and TopFieldCollector; and a
> custom, named one. To create your own collectorFactory you must subclass
> the default CollectorFactory and, at a minimum, override the getCollector
> method to return your new collector.
>
> The parameter "cl" turns on pluggable collectors:
>
>     cl=true
>
> If cl is not in the parameters, Solr will automatically use the default
> collectorFactory.
>
> *Pluggable Doclist Sorting With the Docs Collector*
>
> You can specify two types of pluggable collectors. The first type is the
> docs collector. For example:
>
>     cl.docs=<collectorName>
>
> The above param points to a named collectorFactory in the solrconfig.xml
> used to construct the collector. Docs collectorFactorys must return a
> collector that extends the TopDocsCollector base class. Docs collectors
> are responsible for collecting the doclist. You can specify only one docs
> collector per query.
>
> You can pass parameters to the docs collector using local params syntax.
> For example:
>
>     cl.docs={! sort=mycustomesort}mycollector
>
> If cl=true and a docs collector is not specified, Solr will use the
> default collectorFactory to create the docs collector.
>
> *Pluggable Custom Analytics With Delegating Collectors*
>
> You can also specify any number of custom analytic collectors with the
> "cl.analytic" parameter. Analytic collectors are designed to collect
> something other than the doclist, typically some type of custom analytic.
> For example:
>
>     cl.analytic=sum
>
> The parameter above specifies an analytic collector named sum. Like the
> docs collectors, "sum" points to a named collectorFactory in the
> solrconfig.xml. You can specify any number of analytic collectors by
> adding additional cl.analytic parameters.
>
> Analytic collector factories must return Collector instances that extend
> DelegatingCollector.
>
> A sample analytic collector is provided in the patch through
> org.apache.solr.handler.component.SumCollectorFactory. This
> collectorFactory provides a very simple DelegatingCollector that groups by
> a field and sums a column of floats. The sum collector is not designed to
> be a fully functional sum function but a proof of concept for pluggable
> analytics through delegating collectors.
>
> You can send parameters to analytic collectors with Solr local param
> syntax. For example:
>
>     cl.analytic={! id=1 groupby=field1 column=field2}sum
>
> The "id" parameter is mandatory for analytic collectors and is used to
> identify the output from the collector. In this example the "groupby" and
> "column" params tell the sum collector which field to group by and sum.
>
> Analytic collectors are passed a reference to the ResponseBuilder and can
> place maps with analytic output directly into the SolrQueryResponse with
> the add() method. Maps placed in the SolrQueryResponse are automatically
> added to the outgoing response. The response will include a list named
> cl.analytic.<id>, where id is specified in the local param.
>
> *Distributed Search*
>
> The CollectorFactory also has a method called merge(). This method
> aggregates the results from each of the shards during distributed search.
> The "default" CollectorFactory implements the default merge logic for
> merging documents from each shard. If you define a different docs
> collector you can override the default merge method to merge documents in
> accordance with how they are collected at the shard level.
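Putting the parameters described above together, a request using a pluggable docs collector and an analytic collector might be composed like this. This is a sketch only: the host and collection names are hypothetical, the collector names ("mycollector", "sum") are the samples from the patch description, and the cl.* parameters come from this proposed patch, not a released Solr API:

```shell
# Sketch: composes the cl.* parameters from the SOLR-4465 patch description.
# Host/collection are hypothetical; local-params braces are URL-encoded
# (%7B = '{', %7D = '}', %20 = space) for use with curl.
BASE="http://localhost:8983/solr/mycollection/select"
PARAMS="q=*:*&cl=true"
PARAMS="${PARAMS}&cl.docs=%7B!%20sort=mycustomesort%7Dmycollector"                 # {! sort=mycustomesort}mycollector
PARAMS="${PARAMS}&cl.analytic=%7B!%20id=1%20groupby=field1%20column=field2%7Dsum"  # {! id=1 groupby=field1 column=field2}sum

echo "${BASE}?${PARAMS}"
# Against a server with the patch applied: curl "${BASE}?${PARAMS}"
```

With these params, the response would include the doclist from "mycollector" plus a list named cl.analytic.1 holding the sum collector's output.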
[jira] [Comment Edited] (SOLR-4465) Configurable Collectors
[ https://issues.apache.org/jira/browse/SOLR-4465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761604#comment-13761604 ]

Kranti Parisa edited comment on SOLR-4465 at 9/9/13 4:27 AM:
-------------------------------------------------------------

Does any of those tickets support configurable collectors and choosing them dynamically thru request params? Is SOLR-5045 the one to use?

was (Author: krantiparisa):
Does any of those tickets support configurable collectors and choosing them dynamically thru request params?
[jira] [Commented] (SOLR-4465) Configurable Collectors
[ https://issues.apache.org/jira/browse/SOLR-4465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761604#comment-13761604 ]

Kranti Parisa commented on SOLR-4465:
-------------------------------------

Does any of those tickets support configurable collectors and choosing them dynamically thru request params?
[jira] [Assigned] (SOLR-5215) Deadlock in Solr Cloud ConnectionManager
[ https://issues.apache.org/jira/browse/SOLR-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller reassigned SOLR-5215:
---------------------------------
    Assignee: Mark Miller

> Deadlock in Solr Cloud ConnectionManager
> ----------------------------------------
>
>                 Key: SOLR-5215
>                 URL: https://issues.apache.org/jira/browse/SOLR-5215
>             Project: Solr
>          Issue Type: Bug
>          Components: clients - java, SolrCloud
>    Affects Versions: 4.2.1
>        Environment: Linux 2.6.18-164.el5 #1 SMP Tue Aug 18 15:51:48 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
>                     java version "1.6.0_18"
>                     Java(TM) SE Runtime Environment (build 1.6.0_18-b07)
>                     Java HotSpot(TM) 64-Bit Server VM (build 16.0-b13, mixed mode)
>            Reporter: Ricardo Merizalde
>            Assignee: Mark Miller
>
> We are constantly seeing deadlocks in our production application servers.
>
> The problem seems to be that a thread A:
> - tries to process an event and acquires the ConnectionManager lock
> - the update callback acquires connectionUpdateLock and invokes waitForConnected
> - waitForConnected tries to acquire the ConnectionManager lock (which it already holds)
> - waitForConnected calls wait, releasing the ConnectionManager lock (but still holding the connectionUpdateLock)
>
> Then a thread B:
> - tries to process an event and acquires the ConnectionManager lock
> - the update callback tries to acquire connectionUpdateLock but gets blocked while holding the ConnectionManager lock, preventing thread A from getting out of the wait state.
>
> Here is part of the thread dump:
>
> "http-0.0.0.0-8080-82-EventThread" daemon prio=10 tid=0x59965800 nid=0x3e81 waiting for monitor entry [0x57169000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:71)
>         - waiting to lock <0x2aab1b0e0ce0> (a org.apache.solr.common.cloud.ConnectionManager)
>         at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
>         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
>
> "http-0.0.0.0-8080-82-EventThread" daemon prio=10 tid=0x5ad4 nid=0x3e67 waiting for monitor entry [0x4dbd4000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at org.apache.solr.common.cloud.ConnectionManager$1.update(ConnectionManager.java:98)
>         - waiting to lock <0x2aab1b0e0f78> (a java.lang.Object)
>         at org.apache.solr.common.cloud.DefaultConnectionStrategy.reconnect(DefaultConnectionStrategy.java:46)
>         at org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:91)
>         - locked <0x2aab1b0e0ce0> (a org.apache.solr.common.cloud.ConnectionManager)
>         at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
>         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
>
> "http-0.0.0.0-8080-82-EventThread" daemon prio=10 tid=0x2aac4c2f7000 nid=0x3d9a waiting for monitor entry [0x42821000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         - waiting on <0x2aab1b0e0ce0> (a org.apache.solr.common.cloud.ConnectionManager)
>         at org.apache.solr.common.cloud.ConnectionManager.waitForConnected(ConnectionManager.java:165)
>         - locked <0x2aab1b0e0ce0> (a org.apache.solr.common.cloud.ConnectionManager)
>         at org.apache.solr.common.cloud.ConnectionManager$1.update(ConnectionManager.java:98)
>         - locked <0x2aab1b0e0f78> (a java.lang.Object)
>         at org.apache.solr.common.cloud.DefaultConnectionStrategy.reconnect(DefaultConnectionStrategy.java:46)
>         at org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:91)
>         - locked <0x2aab1b0e0ce0> (a org.apache.solr.common.cloud.ConnectionManager)
>         at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
>         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
>
> Found one Java-level deadlock:
> =============================
> "http-0.0.0.0-8080-82-EventThread":
>   waiting to lock monitor 0x5c7694b0 (object 0x2aab1b0e0ce0, a org.apache.solr.common.cloud.ConnectionManager),
>   which is held by "http-0.0.0.0-8080-82-EventThread"
> "http-0.0.0.0-8080-82-EventThread":
>   waiting to lock monitor 0x2aac4c314978 (object 0x2aab1b0e0f78, a java.lang.Object),
>   which is held by "http-0.0.0.0-8080-82-EventThread"
> "http-0.0.0.0-8080-82-EventThread":
>   waiting to lock monitor 0x5c7694b0 (object 0x2aab1b0e0ce0, a org.apache.solr.common.cloud.ConnectionManager),
>   which is held by "http-0
[jira] [Updated] (SOLR-5215) Deadlock in Solr Cloud ConnectionManager
[ https://issues.apache.org/jira/browse/SOLR-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller updated SOLR-5215:
------------------------------
    Fix Version/s: 5.0
                   4.5
[jira] [Created] (SOLR-5221) CloudSolrServer should default to 15 seconds for the zk client timeout, just like Solr core does.
Mark Miller created SOLR-5221: - Summary: CloudSolrServer should default to 15 seconds for the zk client timeout, just like Solr core does. Key: SOLR-5221 URL: https://issues.apache.org/jira/browse/SOLR-5221 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Mark Miller Assignee: Mark Miller Priority: Minor Fix For: 4.5, 5.0 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5221) CloudSolrServer should default to 15 seconds for the zk client timeout, just like Solr core does.
[ https://issues.apache.org/jira/browse/SOLR-5221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761592#comment-13761592 ] Mark Miller commented on SOLR-5221: --- It currently defaults to 10 seconds - the old core default.
Re: A reference to a commercial algorithm in comments - is this all right?
I see no problem with it. - Mark
[jira] [Commented] (SOLR-4816) Add document routing to CloudSolrServer
[ https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761550#comment-13761550 ] Mark Miller commented on SOLR-4816: --- bq. this "high priority" ... Jira ... is still listed as "Minor". My personal priority list has nothing to do with the severity in JIRA for this issue. I'm assigned and working on this - surprising or not. I have stated that this is an important issue that is on the road map and that it is high priority for me to get into 4.5. Nothing has changed.
[jira] [Commented] (LUCENE-5202) LookaheadTokenFilter consumes an extra token in nextToken
[ https://issues.apache.org/jira/browse/LUCENE-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761545#comment-13761545 ] Benson Margulies commented on LUCENE-5202: -- Well, it only took me about 10 minutes to code a class that did what I needed once you goosed me into coding it. I suspect that there's something that LTF does that I _don't_ need that explains why it is so complex. The rolling buffer suggests to me that it's supporting some much more flexible idea about lookahead than just 'grab a batch, process them, regurgitate the results (including extra tokens), grab the next batch.' Or in other words, since there are analyzers in Lucene that are still using pre-AttributeSource methods to handle creating additional tokens, one would think that there would be a use for a base class that could support them easily. In any case, you're welcome. > LookaheadTokenFilter consumes an extra token in nextToken > - > > Key: LUCENE-5202 > URL: https://issues.apache.org/jira/browse/LUCENE-5202 > Project: Lucene - Core > Issue Type: Bug > Affects Versions: 4.3.1 > Reporter: Benson Margulies > Attachments: LUCENE-5202.patch, LUCENE-5202.patch > > > This is a bit hard to explain except by looking at the test case. I've coded > a filter that uses LookaheadTokenFilter. The incrementToken method peeks some > tokens. Then, it seems, nextToken in the Lookahead class calls peekToken > itself, which seems to me to consume a token so that it's not seen when the > derived class sets out to process the next set of tokens. > In passing, this test case can be used to demonstrate that it does not work > to try to use the afterPosition method to set up attributes of the token that > we're 'after'. Probably that was never intended. However, I'm hoping for some > feedback as to whether the rest of the structure here is as intended for > subclasses of LookaheadTokenFilter.
[jira] [Commented] (LUCENE-5202) LookaheadTokenFilter consumes an extra token in nextToken
[ https://issues.apache.org/jira/browse/LUCENE-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761544#comment-13761544 ] Michael McCandless commented on LUCENE-5202: OK I'll commit this fix ... thanks for iterating here :) If you have any ideas on how to make LookaheadTF more useful please keep raising them!
[jira] [Commented] (LUCENE-5123) invert the codec postings API
[ https://issues.apache.org/jira/browse/LUCENE-5123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761540#comment-13761540 ] Michael McCandless commented on LUCENE-5123: {quote} 1. move write() from PostingsFormat to FieldsConsumer 2. make the "push" api a subclass of FieldsConsumer that has a final implementation of write() and exposes the abstract api it has today (e.g. addField) {quote} I started down this path (moved the write method to FieldsConsumer, and created a PushFieldsConsumer subclass that impls final write, exposing the current API) but ... this causes problems for wrapping/delegating PostingsConsumers (e.g. AssertingPF, BloomPF, PulsingPF) since suddenly they must be strongly typed to accept only PushFieldsConsumer. Either that or I guess we could cut each of these over to write(). I mean, it exposes a real issue w/ the current patch: you cannot wrap SimpleTextPF (or any future PF that uses the pull API) inside these PFs that use the push API. Not sure what to do ... > invert the codec postings API > - > > Key: LUCENE-5123 > URL: https://issues.apache.org/jira/browse/LUCENE-5123 > Project: Lucene - Core > Issue Type: Wish >Reporter: Robert Muir >Assignee: Michael McCandless > Attachments: LUCENE-5123.patch, LUCENE-5123.patch, LUCENE-5123.patch > > > Currently FieldsConsumer/PostingsConsumer/etc is a "push" oriented api, e.g. > FreqProxTermsWriter streams the postings at flush, and the default merge() > takes the incoming codec api and filters out deleted docs and "pushes" via > same api (but that can be overridden). > It could be cleaner if we allowed for a "pull" model instead (like > DocValues). For example, maybe FreqProxTermsWriter could expose a Terms of > itself and just passed this to the codec consumer. > This would give the codec more flexibility to e.g. do multiple passes if it > wanted to do things like encode high-frequency terms more efficiently with a > bitset-like encoding or other things... 
> A codec can try to do things like this to some extent today, but it's very > difficult (look at buffering in Pulsing). We made this change with DV and it > made a lot of interesting optimizations easy to implement...
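The push/pull distinction discussed in this thread can be sketched in plain Java. These are toy interfaces invented for illustration, not Lucene's actual codec API; the point is only that in the push style the producer drives a one-shot stream into the consumer, while in the pull style the consumer iterates the data itself and can therefore make multiple passes (e.g. to spot high-frequency terms before encoding):

```java
import java.util.List;

// Toy illustration of push vs. pull consumer styles (hypothetical
// interfaces; NOT Lucene's FieldsConsumer/PostingsConsumer API).
public class PushVsPull {
    // Push: the indexer calls addTerm() repeatedly; the consumer sees
    // each term exactly once, in producer order.
    interface PushConsumer { void addTerm(String term); void finish(); }

    // Pull: the consumer receives the whole iterable and controls
    // iteration itself, so multi-pass strategies become possible.
    interface PullConsumer { void write(Iterable<String> terms); }

    static String pushAll(List<String> terms) {
        StringBuilder sb = new StringBuilder();
        PushConsumer c = new PushConsumer() {
            public void addTerm(String t) { sb.append(t).append(' '); }
            public void finish() { sb.append("<done>"); }
        };
        for (String t : terms) c.addTerm(t);  // producer drives the loop
        c.finish();
        return sb.toString();
    }

    static String pullAll(List<String> terms) {
        StringBuilder sb = new StringBuilder();
        PullConsumer c = ts -> {
            int n = 0;
            for (String t : ts) n++;          // first pass: could gather stats
            for (String t : ts) sb.append(t).append(' ');  // second pass: encode
            sb.append("<done>");
        };
        c.write(terms);                       // consumer drives iteration
        return sb.toString();
    }
}
```

The wrapping problem Michael describes falls out of this: a delegating codec written against the push interface cannot wrap one that expects to pull, because the two sides disagree about who owns the loop.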
Re: A reference to a commercial algorithm in comments - is this all right?
I think less configuration there is an improvement :) The problem seems to be that "carrot.algorithm" can be any clustering component that plugs into the Carrot2 framework -- including our commercial algorithm (I don't think there's anything else besides that). We suck at brand differentiation and many of our customers have found it difficult to tell the difference between Carrot2, Lingo and Lingo3G, and where to put the configuration bits and pieces in Solr code. So while it doesn't harm any users of the open source algorithms, it helps those already using (or willing to try) the commercial algorithm to locate the relevant bits. Dawid
Re: A reference to a commercial algorithm in comments - is this all right?
I don't think it's an issue - if it helps users to conclude how to get it I think it's actually an improvement! simon
A reference to a commercial algorithm in comments - is this all right?
As part of a recent commit I cleaned up the comments surrounding the clustering extension in the Solr example. As part of this I added comments concerning configuration of clustering algorithms in the Carrot2 framework, but also helpers that refer to our commercial clustering algorithm Lingo3G. They seem harmless to me, as in:

<str name="carrot.algorithm">org.carrot2.clustering.lingo.LingoClusteringAlgorithm</str>

I admit the reason I included these was not to promote the algorithm, but to limit the number of support requests we get where users are not sure how to modify Solr configuration to use Lingo3G out of the box... Is this something that is ok or does it bother anybody? If so, let me know and I will remove those two references from comments. Dawid
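For context, the comment under discussion lives in the solrconfig.xml clustering engine definition. A sketch of that shape follows; the engine layout here assumes the standard clustering example, and the Lingo3G class name is recalled from Carrot2 documentation, so verify both against the shipped config before relying on them:

```xml
<searchComponent name="clustering" class="solr.clustering.ClusteringComponent">
  <lst name="engine">
    <str name="name">default</str>
    <!-- The open source Lingo algorithm from the Carrot2 framework: -->
    <str name="carrot.algorithm">org.carrot2.clustering.lingo.LingoClusteringAlgorithm</str>
    <!-- To use the commercial Lingo3G algorithm instead (separate license
         required), swap in its implementation class, e.g.:
         <str name="carrot.algorithm">com.carrotsearch.lingo3g.Lingo3GClusteringAlgorithm</str>
    -->
  </lst>
</searchComponent>
```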
[jira] [Commented] (SOLR-5217) CachedSqlEntity fails with stored procedure
[ https://issues.apache.org/jira/browse/SOLR-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761505#comment-13761505 ] Hardik Upadhyay commented on SOLR-5217: --- CachedSqlEntityProcessor should take into consideration WHERE clauses in the case of a SQL query, and the parameters passed in the case of a stored procedure. > CachedSqlEntity fails with stored procedure > --- > > Key: SOLR-5217 > URL: https://issues.apache.org/jira/browse/SOLR-5217 > Project: Solr > Issue Type: Bug > Components: contrib - DataImportHandler > Reporter: Hardik Upadhyay > Attachments: db-data-config.xml > > > When using DIH with CachedSqlEntityProcessor and importing data from MS-SQL > using stored procedures, it imports data for nested entities only once and > then every call with different arguments for nested entities is only served > from cache. My db-data-config is attached.
[jira] [Commented] (SOLR-5217) CachedSqlEntity fails with stored procedure
[ https://issues.apache.org/jira/browse/SOLR-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761503#comment-13761503 ] Hardik Upadhyay commented on SOLR-5217: --- Yes. Over the iteration on the parent entity, the child entity's parametrized stored procedure params are changing, but CachedSqlEntityProcessor returns the same result. Moreover, tracing DB calls reveals that those SPs are called only once during the DIH run.
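The failure mode described can be pictured with a config along these lines. All table, column, and procedure names here are hypothetical (the real db-data-config.xml is attached to the issue), and the cacheKey/cacheLookup attributes follow the DIH documentation:

```xml
<document>
  <!-- Parent rows come from a plain query. -->
  <entity name="parent" query="select id, name from parent_table">
    <!-- The child entity calls a parametrized stored procedure per parent
         row. CachedSqlEntityProcessor keys its cache on cacheKey/cacheLookup
         columns, so the SP's changing argument (${parent.id}) is not part of
         the cache key: the procedure runs once, and every subsequent lookup
         is served from that first cached result - the behavior reported. -->
    <entity name="child"
            processor="CachedSqlEntityProcessor"
            query="exec get_children ${parent.id}"
            cacheKey="PARENT_ID"
            cacheLookup="parent.id"/>
  </entity>
</document>
```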
[jira] [Commented] (SOLR-4816) Add document routing to CloudSolrServer
[ https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761489#comment-13761489 ] Jack Krupansky commented on SOLR-4816: -- I was surprised to see that this "high priority" is still not committed for 4.5. Although, the actual Jira priority is still listed as "Minor".
copy/paste typo in solr.cloud.Overseer.getShardNames exception
In org.apache.solr.cloud.Overseer.getShardNames of branch_4x, the second exception message is an exact copy of the first, but probably should be something like "shards param must specify at least one shard":

  static void getShardNames(List<String> shardNames, String shards) {
    if (shards == null)
      throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "shards" + " is a required param");
    for (String s : shards.split(",")) {
      if (s == null || s.trim().isEmpty()) continue;
      shardNames.add(s.trim());
    }
    if (shardNames.isEmpty())
      throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "shards" + " is a required param");
  }

-- Jack Krupansky
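A stand-in sketch of the method with the suggested message fix applied; SolrException is replaced with IllegalArgumentException so the snippet compiles without Solr on the classpath, and the class name is invented for illustration:

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in sketch of the suggested fix (NOT the actual Overseer code):
// IllegalArgumentException substitutes for SolrException so this is
// self-contained; only the second message differs from branch_4x.
public class GetShardNamesSketch {
    static void getShardNames(List<String> shardNames, String shards) {
        if (shards == null)
            throw new IllegalArgumentException("shards is a required param");
        for (String s : shards.split(",")) {
            if (s == null || s.trim().isEmpty()) continue;
            shardNames.add(s.trim());
        }
        if (shardNames.isEmpty())
            // The corrected, non-duplicated message:
            throw new IllegalArgumentException("shards param must specify at least one shard");
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        getShardNames(names, " shard1, ,shard2 ");
        System.out.println(names); // prints [shard1, shard2]
    }
}
```

Note the second check can actually fire in practice: a shards value like " , " is non-null but yields no shard names, which is exactly when the distinct message matters.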
[jira] [Commented] (LUCENE-5202) LookaheadTokenFilter consumes an extra token in nextToken
[ https://issues.apache.org/jira/browse/LUCENE-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761475#comment-13761475 ] Benson Margulies commented on LUCENE-5202: -- OK, I see. So I'll leave it to you to apply this patch to pick up the fix you made. thanks
[jira] [Commented] (LUCENE-5202) LookaheadTokenFilter consumes an extra token in nextToken
[ https://issues.apache.org/jira/browse/LUCENE-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761264#comment-13761264 ] Michael McCandless commented on LUCENE-5202:

bq. There's a call to peekToken in nextToken used to detect the end of the input. When that gets called, a token 'moves' from the input to the positions, so the calls to peekToken in my code never see it.

OK I think I see. So, your peekSentence has peek'd N tokens, up until it saw a '.' token. Then, your incrementToken does nextToken() to get through those buffered tokens, tweaking atts before returning, but then on the first nextToken() after the lookahead buffer is exhausted, peekToken() is called directly from nextToken() and you have no chance to intercept that. But note that this token doesn't actually move to positions (get buffered); it just "passes through", i.e. when nextToken returns, the atts of that new token are "live" in the attributes and you could examine it "live".

Or, maybe, you could use a counter, incremented as you peek tokens in peekSentence, and then decremented as you nextToken() off the lookahead, and once that reaches 0 you peekSentence() again? Or, maybe LookaheadTF should do this for you, e.g. provide a lookaheadCount saying how many tokens are in the lookahead buffer.

Net/net, it may be a lot easier to just make your own dedicated class :) It would have direct control over the buffer, so you wouldn't have to deal with the confusing flow of LookaheadTF.
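Michael's counter suggestion can be modeled in plain Java. This is only a stdlib sketch of the idea, not Lucene's LookaheadTokenFilter; the class and method names are illustrative, and real token filters work over AttributeSource state rather than plain strings:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;

// Plain-Java model of the counter idea: peekSentence() buffers tokens up
// to and including a "." and bumps pendingCount; next() drains the buffer,
// decrementing the counter, and only re-peeks once the counter hits zero.
// That way no token is ever consumed before the subclass has inspected it.
public class SentenceLookahead {
    private final Iterator<String> input;
    private final Deque<String> buffer = new ArrayDeque<>();
    private int pendingCount = 0;

    public SentenceLookahead(Iterator<String> input) { this.input = input; }

    private void peekSentence() {
        while (input.hasNext()) {
            String tok = input.next();
            buffer.addLast(tok);
            pendingCount++;
            if (tok.equals(".")) break;   // end of the sentence batch
        }
        // A real filter would examine/adjust the buffered tokens here.
    }

    /** Returns the next token, or null when the input is exhausted. */
    public String next() {
        if (pendingCount == 0) peekSentence();  // batch exhausted: look ahead again
        if (buffer.isEmpty()) return null;
        pendingCount--;
        return buffer.removeFirst();
    }
}
```

Because next() re-peeks only when pendingCount reaches zero, the "extra" peekToken() that the issue describes never happens behind the caller's back: every buffered token passes through the inspection step before being emitted.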
[JENKINS] Lucene-Solr-NightlyTests-trunk - Build # 374 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-trunk/374/

1 tests failed.

REGRESSION: org.apache.lucene.index.Test2BPostings.test

Error Message:
GC overhead limit exceeded

Stack Trace:
java.lang.OutOfMemoryError: GC overhead limit exceeded
        at __randomizedtesting.SeedInfo.seed([3E09E7626DB890C6:B65DD8B8C344FD3E]:0)
        at org.apache.lucene.document.Document.storedFieldsIterator(Document.java:306)
        at org.apache.lucene.document.Document.access$100(Document.java:45)
        at org.apache.lucene.document.Document$2.iterator(Document.java:300)
        at org.apache.lucene.index.DocFieldProcessor.processDocument(DocFieldProcessor.java:194)
        at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:254)
        at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:446)
        at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1519)
        at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1189)
        at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1170)
        at org.apache.lucene.index.Test2BPostings.test(Test2BPostings.java:76)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
        at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
        at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
        at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
        at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
        at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
        at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
        at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
        at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
        at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
        at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
        at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)

Build Log:
[...truncated 1108 lines...]
   [junit4] Suite: org.apache.lucene.index.Test2BPostings
   [junit4]   2> NOTE: download the large Jenkins line-docs file by running 'ant get-jenkins-line-docs' in the lucene directory.
   [junit4]   2> NOTE: reproduce with: ant test -Dtestcase=Test2BPostings -Dtests.method=test -Dtests.seed=3E09E7626DB890C6 -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true -Dtests.linedocsfile=/home/hudson/lucene-data/enwiki.random.lines.txt -Dtests.locale=sk -Dtests.timezone=Europe/Vatican -Dtests.file.encoding=ISO-8859-1
   [junit4] ERROR 171s J0 | Test2BPostings.test <<<
   [junit4]    > Throwable #1: java.lang.OutOfMemoryError: GC overhead limit exceeded
   [junit4]    >        at __randomizedtesting.SeedInfo.seed([3E09E7626DB890C6:B65DD8B8C344FD3E]:0)
   [junit4]    >        at org.apache.lucene.document.Document.storedFieldsIterator(Document.java:306)
   [junit4]    >        at org.apache.lucene.document.Document.access$100(Document.java:45)
   [junit4]    >        at org.apache.lucene.document.Document$2.iterator(Document.java:300)
   [junit4]    >        at org.apache.lucene.index.DocFieldProcessor.processDocument(DocFieldProcessor.java:194)
   [junit4]    >        at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:254)
   [junit4]    >        at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:446)
   [junit4]    >        at org.apache.lucene.index.IndexWriter.updateDocument(Inde
[jira] [Commented] (SOLR-5201) UIMAUpdateRequestProcessor should reuse the AnalysisEngine
[ https://issues.apache.org/jira/browse/SOLR-5201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761260#comment-13761260 ]

ASF subversion and git services commented on SOLR-5201:
--------------------------------------------------------

Commit 1520859 from [~teofili] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1520859 ]

SOLR-5201 - patch backported to branch_4x

> UIMAUpdateRequestProcessor should reuse the AnalysisEngine
> ----------------------------------------------------------
>
>                 Key: SOLR-5201
>                 URL: https://issues.apache.org/jira/browse/SOLR-5201
>             Project: Solr
>          Issue Type: Improvement
>          Components: contrib - UIMA
>    Affects Versions: 4.4
>            Reporter: Tommaso Teofili
>            Assignee: Tommaso Teofili
>             Fix For: 4.5, 5.0
>
>         Attachments: SOLR-5201-ae-cache-every-request_branch_4x.patch,
>         SOLR-5201-ae-cache-only-single-request_branch_4x.patch
>
>
> As reported in http://markmail.org/thread/2psiyl4ukaejl4fx
> UIMAUpdateRequestProcessor instantiates an AnalysisEngine for each request,
> which is bad for performance; therefore it'd be nice if such AEs could be
> reused whenever that's possible.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5201) UIMAUpdateRequestProcessor should reuse the AnalysisEngine
[ https://issues.apache.org/jira/browse/SOLR-5201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761259#comment-13761259 ]

Tommaso Teofili commented on SOLR-5201:
----------------------------------------

ok good, thanks. I'll merge it to branch_4x too.
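The reuse being discussed can be sketched outside Solr/UIMA with a self-contained example. The names here (`ExpensiveEngine`, `EngineCache`, the descriptor key) are hypothetical stand-ins, not the actual UIMA `AnalysisEngine` or `UIMAUpdateRequestProcessor` API; the sketch only illustrates the pattern of building the expensive component once per descriptor and sharing it across requests instead of re-instantiating it per request:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an expensive-to-build component such as a
// UIMA AnalysisEngine; constructions are counted so reuse is observable.
class ExpensiveEngine {
    static final AtomicInteger constructions = new AtomicInteger();
    final String descriptor;

    ExpensiveEngine(String descriptor) {
        this.descriptor = descriptor;
        constructions.incrementAndGet(); // simulate costly initialization
    }
}

// Cache keyed by descriptor: each engine is built at most once and then
// shared by later requests, which is the reuse proposed in this issue.
public class EngineCache {
    private static final Map<String, ExpensiveEngine> CACHE = new ConcurrentHashMap<>();

    public static ExpensiveEngine get(String descriptor) {
        // computeIfAbsent is atomic per key, so concurrent requests for the
        // same descriptor still construct the engine only once.
        return CACHE.computeIfAbsent(descriptor, ExpensiveEngine::new);
    }

    public static void main(String[] args) {
        ExpensiveEngine a = EngineCache.get("OverridingParamsAE.xml");
        ExpensiveEngine b = EngineCache.get("OverridingParamsAE.xml");
        if (a != b || ExpensiveEngine.constructions.get() != 1) {
            throw new AssertionError("engine was not reused");
        }
        System.out.println("constructions=" + ExpensiveEngine.constructions.get());
    }
}
```

One caveat the two attached patches hint at (cache per request vs. a longer-lived cache): a shared engine is only safe if processing it is thread-safe or access is synchronized.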
[jira] [Commented] (LUCENE-5202) LookaheadTokenFilter consumes an extra token in nextToken
[ https://issues.apache.org/jira/browse/LUCENE-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761254#comment-13761254 ]

Benson Margulies commented on LUCENE-5202:
-------------------------------------------

Yes, that's what I have, and it works, except for the problem I wrote this test case to demonstrate.

There's a call to peekToken in nextToken used to detect the end of the input. When that gets called, a token 'moves' from the input to the positions, so the calls to peekToken in my code never see it. Either I'm supposed to call restoreState to examine it, or there's a problem here. If I'm supposed to call restoreState, I need to figure out how to notice (by looking at positions?) that I'm in that situation. Or there's some problem in my logic for deciding when to do my next load of peeks, so that nextToken is never supposed to reach that call to peekToken, but I can't figure out what it is.

> LookaheadTokenFilter consumes an extra token in nextToken
> ----------------------------------------------------------
>
>                 Key: LUCENE-5202
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5202
>             Project: Lucene - Core
>          Issue Type: Bug
>    Affects Versions: 4.3.1
>            Reporter: Benson Margulies
>         Attachments: LUCENE-5202.patch, LUCENE-5202.patch
>
>
> This is a bit hard to explain except by looking at the test case. I've coded
> a filter that uses LookaheadTokenFilter. The incrementToken method peeks some
> tokens. Then, it seems, nextToken in the Lookahead class calls peekToken
> itself, which seems to me to consume a token, so that it's not seen when the
> derived class sets out to process the next set of tokens.
> In passing, this test case can be used to demonstrate that it does not work
> to try to use the afterPosition method to set up attributes of the token that
> we're 'after'. Probably that was never intended. However, I'm hoping for some
> feedback as to whether the rest of the structure here is as intended for
> subclasses of LookaheadTokenFilter.
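The "moves from the input to the positions" behavior can be seen in miniature with a generic lookahead buffer. This sketch is not Lucene's LookaheadTokenFilter (whose position buffer also captures attribute state and position increments); it only shows the core mechanic being discussed: a peek consumes an item from the underlying source into a buffer, so the peeked item is visible again only by replaying the buffer, never by reading the source a second time:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;

// Minimal lookahead wrapper illustrating why an internally performed peek
// appears to "consume" a token: the token has moved from the source into
// the buffer, and must be read back from there.
public class Lookahead<T> {
    private final Iterator<T> source;
    private final Deque<T> buffer = new ArrayDeque<>();

    public Lookahead(Iterator<T> source) {
        this.source = source;
    }

    // Consumes one item from the source and stores it for later replay.
    public T peek() {
        T item = source.next();
        buffer.addLast(item);
        return item;
    }

    public boolean hasNext() {
        return !buffer.isEmpty() || source.hasNext();
    }

    // Buffered (previously peeked) items come back first, in order.
    public T next() {
        return buffer.isEmpty() ? source.next() : buffer.removeFirst();
    }

    public static void main(String[] args) {
        Lookahead<String> la = new Lookahead<>(java.util.List.of("a", "b", "c").iterator());
        String peeked = la.peek();                // pulls "a" out of the source
        if (!peeked.equals("a")) throw new AssertionError();
        if (!la.next().equals("a")) throw new AssertionError(); // replayed, not lost
        if (!la.next().equals("b")) throw new AssertionError();
        System.out.println("ok"); // prints "ok"
    }
}
```

In this simplified model the peeked token is recoverable from the buffer; the question in the issue is whether the subclass is expected to recover it (via restoreState over the positions) or whether the framework should not have peeked it at all.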
[jira] [Commented] (LUCENE-5202) LookaheadTokenFilter consumes an extra token in nextToken
[ https://issues.apache.org/jira/browse/LUCENE-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761246#comment-13761246 ]

Michael McCandless commented on LUCENE-5202:
---------------------------------------------

Oh, sorry, I see; I indeed thought you were trying to create new tokens (and changed the test to do so).

OK, so for your first case (just changing attrs based on looked-ahead tokens), afterPosition is not the right place to do that: this method is effectively called after the last token leaving the current position has been emitted, and before setting attrs to the state for the next token. It's basically "between" tokens.

If you just want to change the att values, I think you should do that in your incrementToken, i.e. it would first call nextToken(), and if that returned true, it would then futz with the attrs and return true. Would that work?
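The suggested "advance first, then modify" shape can be sketched without the Lucene APIs. `UppercasingFilter` and its string-iterator input are hypothetical stand-ins: `incrementToken` first advances the underlying stream (the analog of `nextToken()` returning true) and only then mutates the current term, rather than trying to do the mutation "between" tokens as afterPosition would:

```java
import java.util.Iterator;
import java.util.List;
import java.util.Locale;

// Sketch of the pattern suggested above, outside Lucene: advance the
// stream first, and only on success modify the produced token's
// attributes before returning true.
public class UppercasingFilter {
    private final Iterator<String> input;
    private String current; // analog of the term attribute

    public UppercasingFilter(Iterator<String> input) {
        this.input = input;
    }

    public boolean incrementToken() {
        if (!input.hasNext()) {
            return false;                           // nextToken() returned false: exhausted
        }
        current = input.next();                     // analog of nextToken() filling the attrs
        current = current.toUpperCase(Locale.ROOT); // then futz with the attrs
        return true;                                // and return true
    }

    public String term() {
        return current;
    }

    public static void main(String[] args) {
        UppercasingFilter f = new UppercasingFilter(List.of("hello", "world").iterator());
        StringBuilder out = new StringBuilder();
        while (f.incrementToken()) {
            out.append(f.term()).append(' ');
        }
        System.out.println(out.toString().trim()); // prints "HELLO WORLD"
    }
}
```

In a real LookaheadTokenFilter subclass the same shape would presumably be: call nextToken(); if it returned true, adjust the attributes (term, type, etc.) based on the earlier peeks; return true.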