[jira] [Updated] (SOLR-5308) Split all documents of a route key into another collection

2013-10-30 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-5308:


Attachment: SOLR-5308.patch

This patch adds request forwarding for delete by ID and query requests.

> Split all documents of a route key into another collection
> --
>
> Key: SOLR-5308
> URL: https://issues.apache.org/jira/browse/SOLR-5308
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Reporter: Shalin Shekhar Mangar
>Assignee: Shalin Shekhar Mangar
> Fix For: 4.6, 5.0
>
> Attachments: SOLR-5308.patch, SOLR-5308.patch, SOLR-5308.patch, 
> SOLR-5308.patch, SOLR-5308.patch, SOLR-5308.patch
>
>
> Enable SolrCloud users to split out a set of documents from a source 
> collection into another collection.
> This will be useful in multi-tenant environments. This feature will make it 
> possible to split a tenant out of a collection and into its own 
> collection, which can be scaled separately.
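For illustration, the kind of Collections API request this feature points toward can be sketched as follows. The action and parameter names here are assumptions drawn from the issue's direction, not taken from the attached patch, and the code only builds a request URL without contacting any cluster:

```python
# Hypothetical sketch of a "migrate documents by route key" Collections API
# call. Parameter names (split.key, target.collection) are assumptions for
# illustration; no live Solr cluster is contacted.
from urllib.parse import urlencode

def build_migrate_url(base_url, source, target, split_key):
    """Build a request URL that would move all docs for one route key."""
    params = {
        "action": "MIGRATE",
        "collection": source,         # source collection
        "target.collection": target,  # collection receiving the tenant's docs
        "split.key": split_key,       # route key, e.g. "tenantA!"
        "wt": "json",
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"

url = build_migrate_url("http://localhost:8983/solr",
                        "users", "tenantA_coll", "tenantA!")
```

Splitting the request-building step out like this keeps the routing decision (which key, which target) separate from transport, which is also roughly how SolrJ clients assemble such requests.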



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5406) CloudSolrServer doesn't propagate request params on a delete

2013-10-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809907#comment-13809907
 ] 

ASF subversion and git services commented on SOLR-5406:
---

Commit 1537375 from [~yo...@apache.org] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1537375 ]

SOLR-5406: pass params with delete

> CloudSolrServer doesn't propagate request params on a delete
> 
>
> Key: SOLR-5406
> URL: https://issues.apache.org/jira/browse/SOLR-5406
> Project: Solr
>  Issue Type: Bug
>Reporter: Yonik Seeley
>Priority: Minor
> Fix For: 4.6, 5.0
>
> Attachments: SOLR-5406.patch
>
>
> It appears that deletes using CloudSolrServer drop request params.
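The bug class is easy to sketch in isolation: the client rebuilds a delete request and the caller's params are never copied over. A minimal, hypothetical illustration follows (this is not Solr's actual code, and `commitWithin` is just an example parameter):

```python
# Hypothetical sketch of the bug class: a client rebuilds a delete request
# and must remember to carry the caller's params along instead of dropping
# them. Names and structure are illustrative, not Solr's real internals.
def build_delete_request(ids, params=None):
    req = {"delete": [{"id": i} for i in ids], "params": {}}
    # The fix amounts to propagating the caller's params here:
    if params:
        req["params"].update(params)
    return req

req = build_delete_request(["doc1"], {"commitWithin": "1000"})
```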






[jira] [Resolved] (SOLR-5406) CloudSolrServer doesn't propagate request params on a delete

2013-10-30 Thread Yonik Seeley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley resolved SOLR-5406.


Resolution: Fixed
Fix Version/s: 5.0, 4.6







[jira] [Commented] (SOLR-5406) CloudSolrServer doesn't propagate request params on a delete

2013-10-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809905#comment-13809905
 ] 

ASF subversion and git services commented on SOLR-5406:
---

Commit 1537374 from [~yo...@apache.org] in branch 'dev/trunk'
[ https://svn.apache.org/r1537374 ]

SOLR-5406: pass params with delete







[jira] [Updated] (SOLR-5084) new field type - EnumField

2013-10-30 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-5084:
-

Attachment: Solr-5084.trunk.patch

Fixes EOL warnings on OS X, passes precommit (there were a couple of Javadoc 
warnings), and adds the ASF header to one file that didn't have it.

However, tests don't seem to be passing, and I'm not sure whether this is a 
problem with this patch or not. The failing tests are distributed ones, e.g. 
LeaderElectionIntegrationTest.



> new field type - EnumField
> --
>
> Key: SOLR-5084
> URL: https://issues.apache.org/jira/browse/SOLR-5084
> Project: Solr
>  Issue Type: New Feature
>Reporter: Elran Dvir
>Assignee: Erick Erickson
> Attachments: enumsConfig.xml, schema_example.xml, Solr-5084.patch, 
> Solr-5084.patch, Solr-5084.patch, Solr-5084.patch, Solr-5084.trunk.patch, 
> Solr-5084.trunk.patch, Solr-5084.trunk.patch, Solr-5084.trunk.patch, 
> Solr-5084.trunk.patch, Solr-5084.trunk.patch
>
>
> We have encountered a use case in our system where we have a few fields 
> (Severity, Risk, etc.) with a closed set of values, where the sort order for 
> these values is pre-determined but not lexicographic (Critical is higher than 
> High). Generically, this is very close to how enums work.
> To implement this, I have prototyped a new field type, EnumField, where the 
> inputs are a closed, predefined set of strings in a special configuration 
> file (similar to currency.xml).
> The code is based on 4.2.1.
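The attached enumsConfig.xml presumably defines those closed value sets. A hypothetical sketch of what such a file could look like (the enum name, values, and exact element names here are assumptions based on the description, not the actual attachment):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical enum definition: values listed in ascending sort order,
     so "Critical" sorts above "High" even though it is lexicographically
     smaller. -->
<enumsConfig>
  <enum name="severity">
    <value>Low</value>
    <value>Medium</value>
    <value>High</value>
    <value>Critical</value>
  </enum>
</enumsConfig>
```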






[jira] [Commented] (SOLR-5402) SolrCloud 4.5 bulk add errors in cloud setup

2013-10-30 Thread Sai Gadde (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809833#comment-13809833
 ] 

Sai Gadde commented on SOLR-5402:
-

I even tried adding one document at a time; the problem is still present.

If there is only one server, there are no errors and everything works fine; 
even updating 500 documents is fine.

But when another node comes online, SolrCmdDistributor gets these exceptions 
from the remote server's response. The new Solr node also prints its own error 
stack, as explained in the report above.

This exact same setup works without issues in 4.4.0. I tried both 4.5.0 and 
4.5.1, and both produce the same errors.
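The "Illegal to have multiple roots" message in the report below is an XML well-formedness error: a document may have exactly one root element, so a second top-level element in the update payload trips the parser. A minimal reproduction with Python's stdlib parser (Solr itself uses a different XML parser, but the constraint is the same):

```python
# Demonstrate the "multiple roots" well-formedness rule: a second top-level
# element after the document root makes the payload unparseable.
import xml.etree.ElementTree as ET

def parse(payload):
    try:
        ET.fromstring(payload)
        return "ok"
    except ET.ParseError:
        return "multiple-roots-or-malformed"

good = parse("<add><doc/></add>")                  # single root: fine
bad = parse("<add><doc/></add><add><doc/></add>")  # two roots: parse error
```

This is why the error only shows up when an update is forwarded between nodes: something in the forwarding path is evidently concatenating or corrupting the streamed XML body.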

> SolrCloud 4.5 bulk add errors in cloud setup
> 
>
> Key: SOLR-5402
> URL: https://issues.apache.org/jira/browse/SOLR-5402
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 4.5, 4.5.1
>Reporter: Sai Gadde
> Fix For: 4.6
>
>
> We use out-of-the-box Solr 4.5.1 with no customization. If we merge documents 
> via SolrJ to a single server, it works perfectly fine.
> But as soon as we add another node to the cloud, we get the following error 
> while merging documents. We merge about 500 at a time using SolrJ. These 500 
> documents in total are a few MB (1-3) in size.
> This is the error we get on the server (10.10.10.116; the IP is 
> irrelevant, included just for clarity) where merging is happening. 10.10.10.119 
> is the new node here. This server gets a RemoteSolrException:
> shard update error StdNode: 
> http://10.10.10.119:8980/solr/mycore/:org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:
>  Illegal to have multiple roots (start tag in epilog?).
>  at [row,col {unknown-source}]: [1,12468]
>   at 
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:425)
>   at 
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
>   at 
> org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:401)
>   at 
> org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:1)
>   at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
>   at java.util.concurrent.FutureTask.run(Unknown Source)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
>   at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
>   at java.util.concurrent.FutureTask.run(Unknown Source)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown 
> Source)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>   at java.lang.Thread.run(Unknown Source)
> On the other server 10.10.10.119 we get following error
> org.apache.solr.common.SolrException: Illegal to have multiple roots (start 
> tag in epilog?).
>  at [row,col {unknown-source}]: [1,12468]
>   at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:176)
>   at 
> org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
>   at 
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
>   at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:703)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:406)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195)
>   at 
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
>   at 
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
>   at 
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
>   at 
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
>   at 
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
>   at 
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
>   at 
> org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:936)
>   at 
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
>   at 
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
>   at 
> org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1004)
>   at 
> org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
>   at 
> org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
>   at 
> java.util.co

[jira] [Commented] (SOLR-5407) Strange error condition with cloud replication not working quite right

2013-10-30 Thread Nathan Neulinger (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809818#comment-13809818
 ] 

Nathan Neulinger commented on SOLR-5407:


A bigger concern than the initial failure is that the Solr deployment sort of 
acted like everything was up, but was only partially working.

> Strange error condition with cloud replication not working quite right
> --
>
> Key: SOLR-5407
> URL: https://issues.apache.org/jira/browse/SOLR-5407
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.5
>Reporter: Nathan Neulinger
>  Labels: cloud, replication
>
> I have a cloud deployment of 4.5 on EC2. Architecture is 3 dedicated ZK 
> nodes, and a pair of solr nodes.  I'll apologize in advance that this error 
> report is not going to have a lot of detail, I'm really hoping that the 
> scenario/description will trigger some "likely" possible explanation.
> The situation I got into was that the server had decided to fail over, so my 
> app servers were all talking to what should have been the primary for most of 
> the shards/collections, but actually was the replica.
> Here's where it gets odd - no errors being returned to the client code for 
> any of the searches or document updates - and the current primary server was 
> definitely receiving all of the updates - even though they were being 
> submitted to the inactive/replica node. (clients talking to solr-p1, which 
> was not primary at the time, and writes were being passed through to solr-r1, 
> which was primary at the time.)
> All sounds good so far right? Except - the replica server at the time, 
> through which the writes were passing - never got any of those content 
> updates. It had an old unmodified copy of the index. 
> I restarted solr-p1 (was the replica at the time) - no change in behavior. 
> Behavior did not change until I killed and restarted the current primary 
> (solr-r1) to force it to fail over.
> At that point, everything was all happy again and working properly. 
> Until this morning, when one of the developers provisioned a new collection, 
> which happened to put its primary on solr-r1. Again, clients all pointing at 
> solr-p1. The developer reported that the documents were going into the index, 
> but not visible on the replica server. 






[jira] [Commented] (SOLR-5407) Strange error condition with cloud replication not working quite right

2013-10-30 Thread Nathan Neulinger (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809802#comment-13809802
 ] 

Nathan Neulinger commented on SOLR-5407:


This also looks an awful lot like what we saw:

http://osdir.com/ml/solr-user.lucene.apache.org/2013-10/msg00673.html









[jira] [Commented] (SOLR-5407) Strange error condition with cloud replication not working quite right

2013-10-30 Thread Nathan Neulinger (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809800#comment-13809800
 ] 

Nathan Neulinger commented on SOLR-5407:


Found this; will investigate further:

http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201212.mbox/%3ccabcj+++zzocam0edgv-3xwpfvrpfskoyazjt1xcqm2myht+...@mail.gmail.com%3E

It talks about raising the ZooKeeper session timeout.
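For reference, the ZooKeeper session timeout discussed in that thread is commonly adjusted via the zkClientTimeout setting. A sketch of where it could live in a 4.x-style solr.xml (the 30000 ms value and surrounding structure are illustrative, not a recommendation or a verbatim config):

```xml
<!-- solr.xml sketch: raising zkClientTimeout so brief GC pauses or network
     blips are less likely to expire the ZooKeeper session. Value is
     illustrative only. -->
<solr>
  <solrcloud>
    <int name="zkClientTimeout">${zkClientTimeout:30000}</int>
  </solrcloud>
</solr>
```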







[jira] [Commented] (SOLR-5407) Strange error condition with cloud replication not working quite right

2013-10-30 Thread Nathan Neulinger (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809799#comment-13809799
 ] 

Nathan Neulinger commented on SOLR-5407:


Digging further, it looks like it all hinges on some sort of communication 
problem with ZooKeeper. It appears to have started at the end of the log 
snippet below (in reverse time order), where it reports 'Our previous 
ZooKeeper session was expired. Attempting to reconnect to recover relationship 
with ZooKeeper'.



2013-10-29T16:25:50.344Z  Going to wait for coreNodeName: core_node2, state: down, checkLive: null, onlyIfLeader: null
2013-10-29T16:25:50.329Z  publishing core=myappqa-master_v8_shard1_replica1 state=down
2013-10-29T16:25:50.329Z  Creating new http client, config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false
2013-10-29T16:25:50.328Z  Waited coreNodeName: core_node1, state: down, checkLive: null, onlyIfLeader: null for: 1 seconds.
2013-10-29T16:25:49.884Z  A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 1)
2013-10-29T16:25:49.825Z  Updating cloud state from ZooKeeper...
2013-10-29T16:25:49.825Z  Update state numShards=1 message={"operation":"state", "state":"down", "base_url":"http://10.170.2.54:8983/solr", "core":"hiv...
2013-10-29T16:25:49.324Z  Going to wait for coreNodeName: core_node1, state: down, checkLive: null, onlyIfLeader: null
2013-10-29T16:25:49.309Z  Creating new http client, config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false
2013-10-29T16:25:49.308Z  publishing core=myappqa-master_v6_shard1_replica2 state=down
2013-10-29T16:25:49.308Z  Waited coreNodeName: core_node1, state: down, checkLive: null, onlyIfLeader: null for: 2 seconds.
2013-10-29T16:25:48.302Z  A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 1)
2013-10-29T16:25:48.239Z  Updating cloud state from ZooKeeper...
2013-10-29T16:25:48.239Z  Update state numShards=1 message={"operation":"state", "state":"down", "base_url":"http://10.170.2.54:8983/solr", "core":"hiv...
2013-10-29T16:25:47.304Z  Going to wait for coreNodeName: core_node1, state: down, checkLive: null, onlyIfLeader: null
2013-10-29T16:25:47.289Z  Creating new http client, config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false
2013-10-29T16:25:47.289Z  publishing core=myappstaging-profile_v7_shard1_replica1 state=down
2013-10-29T16:25:47.287Z  Waited coreNodeName: core_node2, state: down, checkLive: null, onlyIfLeader: null for: 2 seconds.
2013-10-29T16:25:46.469Z  A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 1)
2013-10-29T16:25:46.406Z  Update state numShards=1 message={"operation":"state", "state":"down", "base_url":"http://10.170.2.54:8983/solr", "core":"hiv...
2013-10-29T16:25:45.925Z  Updating cloud state from ZooKeeper...
2013-10-29T16:25:45.286Z  Going to wait for coreNodeName: core_node2, state: down, checkLive: null, onlyIfLeader: null
2013-10-29T16:25:45.270Z  publishing core=myappstaging-profile_v8_shard1_replica1 state=down
2013-10-29T16:25:45.270Z  Creating new http client, config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false
2013-10-29T16:25:45.269Z  Waited coreNodeName: core_node2, state: down, checkLive: null, onlyIfLeader: null for: 2 seconds.
2013-10-29T16:25:45.039Z  A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 1)
2013-10-29T16:25:44.994Z  makePath: /collections/myappproduction-production_v8/leaders/shard1
2013-10-29T16:25:44.994Z  I am the new leader: http://10.136.6.24:8983/solr/myappproduction-production_v8_shard1_replica1/ shard1
2013-10-29T16:25:44.994Z  http://10.136.6.24:8983/solr/myappproduction-production_v8_shard1_replica1/ has no replicas
2013-10-29T16:25:44.991Z  Sync replicas to http://10.136.6.24:8983/solr/myappproduction-production_v8_shard1_replica1/
2013-10-29T16:25:44.991Z  My last published State was Active, it's okay to be the leader.
2013-10-29T16:25:44.991Z  Running the leader process for shard shard1
2013-10-29T16:25:44.991Z  I may be the new leader - try and sync
2013-10-29T16:25:44.991Z  Sync Success - now sync replicas to me
2013-10-29T16:25:44.991Z  Checking if I should try and be the leader.
2013-10-29T16:25:44.940Z  I am the new leader: http://10.136.6.24:8983/solr/myappstaging-feature-completion_v9_shard1_replica1/ shard1
2

[jira] [Updated] (SOLR-5406) CloudSolrServer doesn't propagate request params on a delete

2013-10-30 Thread Yonik Seeley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley updated SOLR-5406:
---

Attachment: SOLR-5406.patch

Attaching patch that should fix the issue.







[jira] [Commented] (SOLR-5405) Cloud graph view not usable by color-blind users - request small tweak

2013-10-30 Thread Nathan Neulinger (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809770#comment-13809770
 ] 

Nathan Neulinger commented on SOLR-5405:


Response:

I haven't. However, I've come to realize that I tend to just tune out 
information carried in color. There are some good resources around (such as 
http://www.mollietaylor.com/2012/10/color-blindness-and-palette-choice.html and 
http://colorschemedesigner.com) on how to choose colors/shades in a way that 
avoids problems for colorblind folks like me. But I actually like using color 
in combination with a secondary mechanism (like the X you mentioned), which I 
think works really well.

Just FYI, my biggest problems arise with certain color/shade combinations. 
For example, dark green is hard to differentiate from brown and red, light 
green is hard to distinguish from yellow, and medium green is hard to 
separate from orange.


On Wed, Oct 30, 2013 at 4:33 PM, Nathan Neulinger  wrote:

Have you noticed anyplace else in the UI where colors are being used for 
information content that isn't otherwise represented?

-- Nathan


> Cloud graph view not usable by color-blind users - request small tweak
> --
>
> Key: SOLR-5405
> URL: https://issues.apache.org/jira/browse/SOLR-5405
> Project: Solr
>  Issue Type: Improvement
>  Components: web gui
>Affects Versions: 4.5
>Reporter: Nathan Neulinger
>Assignee: Stefan Matheis (steffkes)
>  Labels: accessibility
>
> Currently, the cloud view status is impossible to see easily on the graph 
> screen if you are color blind. (One of my coworkers is.)
> Would it be possible to put " (X)" after the IP of the node where X is 
> [LARDFG] for the states?
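The requested tweak amounts to pairing each state's color with a letter. A hypothetical sketch of that mapping follows; the state names and their letters are assumptions based on the "[LARDFG]" shorthand in the request, not the admin UI's actual code:

```python
# Hypothetical state-to-letter mapping so node status is readable without
# color perception. State names and letters are assumed from "[LARDFG]".
STATE_LETTERS = {
    "leader": "L",
    "active": "A",
    "recovering": "R",
    "down": "D",
    "recovery_failed": "F",
    "gone": "G",
}

def node_label(ip, state):
    """Append the state letter after the node's IP, e.g. '10.0.0.1 (A)'."""
    return f"{ip} ({STATE_LETTERS[state]})"

label = node_label("10.0.0.1", "active")
```

Pairing color with a redundant text channel like this is the standard accessibility fix: the information survives even when the color channel is unavailable.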






[jira] [Commented] (SOLR-5407) Strange error condition with cloud replication not working quite right

2013-10-30 Thread Nathan Neulinger (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809767#comment-13809767
 ] 

Nathan Neulinger commented on SOLR-5407:


The only error we could find in the logs was this:

09:08:01  WARN   PeerSync            no frame of reference to tell if we've missed updates
09:25:49  WARN   Overseer
09:25:49  ERROR  SolrDispatchFilter  null:org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /aliases.json
09:25:49  ERROR  SolrDispatchFilter  null:org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /aliases.json
09:25:49  WARN   OverseerCollectionProcessor  Overseer cannot talk to ZK
09:25:49  ERROR  SolrDispatchFilter  null:org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /aliases.json
09:25:49  ERROR  SolrDispatchFilter  null:org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /aliases.json
09:25:49  ERROR  SolrDispatchFilter  null:org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /aliases.json
09:25:49  ERROR  SolrDispatchFilter  null:org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /aliases.json
09:26:37  WARN   PeerSync            no frame of reference to tell if we've missed updates









[JENKINS] Lucene-Solr-NightlyTests-4.x - Build # 421 - Still Failing

2013-10-30 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-4.x/421/

All tests passed

Build Log:
[...truncated 1548 lines...]
   [junit4] JVM J0: stdout was not empty, see: 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-4.x/lucene/build/core/test/temp/junit4-J0-20131030_231104_462.sysout
   [junit4] >>> JVM J0: stdout (verbatim) 
   [junit4] java.lang.OutOfMemoryError: Java heap space
   [junit4] Dumping heap to 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-4.x/heapdumps/java_pid30660.hprof
 ...
   [junit4] Heap dump file created [427108390 bytes in 0.726 secs]
   [junit4] <<< JVM J0: EOF 

   [junit4] JVM J0: stderr was not empty, see: 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-4.x/lucene/build/core/test/temp/junit4-J0-20131030_231104_462.syserr
   [junit4] >>> JVM J0: stderr (verbatim) 
   [junit4] WARN: Unhandled exception in event serialization. -> 
java.lang.OutOfMemoryError: GC overhead limit exceeded (stack unavailable; OOM)
   [junit4] WARN: Event serializer exception. -> java.lang.OutOfMemoryError: GC 
overhead limit exceeded (stack unavailable; OOM)
   [junit4] WARN: Event serializer exception. -> java.io.IOException: 
Serializer already closed.
   [junit4] at 
com.carrotsearch.ant.tasks.junit4.events.Serializer.serialize(Serializer.java:41)
   [junit4] at 
com.carrotsearch.ant.tasks.junit4.slave.RunListenerEmitter.testFinished(RunListenerEmitter.java:113)
   [junit4] at 
com.carrotsearch.ant.tasks.junit4.slave.NoExceptionRunListenerDecorator.testFinished(NoExceptionRunListenerDecorator.java:47)
   [junit4] at 
com.carrotsearch.ant.tasks.junit4.slave.BeforeAfterRunListenerDecorator.testFinished(BeforeAfterRunListenerDecorator.java:51)
   [junit4] at 
com.carrotsearch.ant.tasks.junit4.slave.OrderedRunNotifier$7.notifyListener(OrderedRunNotifier.java:179)
   [junit4] at 
com.carrotsearch.ant.tasks.junit4.slave.OrderedRunNotifier$SafeNotifier.run(OrderedRunNotifier.java:63)
   [junit4] at 
com.carrotsearch.ant.tasks.junit4.slave.OrderedRunNotifier.fireTestFinished(OrderedRunNotifier.java:176)
   [junit4] at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$SubNotifier.fireTestFinished(ThreadLeakControl.java:197)
   [junit4] at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:755)
   [junit4] at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
   [junit4] at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
   [junit4] at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
   [junit4] at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
   [junit4] at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
   [junit4] at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
   [junit4] at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
   [junit4] at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
   [junit4] at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
   [junit4] at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
   [junit4] at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
   [junit4] at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
   [junit4] at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
   [junit4] at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
   [junit4] at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
   [junit4] at java.lang.Thread.run(Thread.java:679)
   [junit4] 
   [junit4] WARN: Event serializer exception. -> java.io.IOException: 
Serializer already closed.
   [junit4] at 
com.carrotsearch.ant.tasks.junit4.events.Serializer.serialize(Serializer.java:41)
   [junit4] at 
com.carrotsearch.ant.tasks.junit4.slave.RunListenerEmitter.testRunFinished(RunListenerEmitter.java:120)
   [junit4] at 
com.carrotsearch.ant.tasks.junit4.slave.NoExceptionRunListenerDecorator.testRunFinished(NoExceptionRunListenerDecorator.java:31)
   [junit4] at 
com.carrotsearch.ant.tasks.junit4.slave.BeforeAfterRunListenerDecorator.testRunFinished(BeforeAfterRunListenerDecorator.java:33)
   [junit4] at 
com.carrotsearch.ant.tasks.junit4.slave.OrderedRunNotifier$2.notifyListener(OrderedRunNotifier.java:9

[jira] [Created] (SOLR-5407) Strange error condition with cloud replication not working quite right

2013-10-30 Thread Nathan Neulinger (JIRA)
Nathan Neulinger created SOLR-5407:
--

 Summary: Strange error condition with cloud replication not 
working quite right
 Key: SOLR-5407
 URL: https://issues.apache.org/jira/browse/SOLR-5407
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.5
Reporter: Nathan Neulinger


I have a cloud deployment of 4.5 on EC2. The architecture is 3 dedicated ZK nodes 
and a pair of Solr nodes. I'll apologize in advance that this error report is 
not going to have a lot of detail; I'm really hoping that the 
scenario/description will trigger some "likely" possible explanation.

The situation I got into was that the server had decided to fail over, so my 
app servers were all talking to what should have been the primary for most of 
the shards/collections, but was actually the replica.

Here's where it gets odd - no errors being returned to the client code for any 
of the searches or document updates - and the current primary server was 
definitely receiving all of the updates - even though they were being submitted 
to the inactive/replica node. (clients talking to solr-p1, which was not 
primary at the time, and writes were being passed through to solr-r1, which was 
primary at the time.)

All sounds good so far right? Except - the replica server at the time, through 
which the writes were passing - never got any of those content updates. It had 
an old unmodified copy of the index. 

I restarted solr-p1 (was the replica at the time) - no change in behavior. 
Behavior did not change until I killed and restarted the current primary 
(solr-r1) to force it to fail over.

At that point, everything was all happy again and working properly. 

Until this morning, when one of the developers provisioned a new collection, 
which happened to put its primary on solr-r1. Again, clients all pointing at 
solr-p1. The developer reported that the documents were going into the index, 
but were not visible on the replica server. 






[jira] [Commented] (SOLR-5405) Cloud graph view not usable by color-blind users - request small tweak

2013-10-30 Thread Nathan Neulinger (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809762#comment-13809762
 ] 

Nathan Neulinger commented on SOLR-5405:


I passed inquiry along, and will be sure to submit any other easy tweaks to 
make it more accessible. 

> Cloud graph view not usable by color-blind users - request small tweak
> --
>
> Key: SOLR-5405
> URL: https://issues.apache.org/jira/browse/SOLR-5405
> Project: Solr
>  Issue Type: Improvement
>  Components: web gui
>Affects Versions: 4.5
>Reporter: Nathan Neulinger
>Assignee: Stefan Matheis (steffkes)
>  Labels: accessibility
>
> Currently, the cloud view status is impossible to see easily on the graph 
> screen if you are color blind. (One of my coworkers is.)
> Would it be possible to put " (X)" after the IP of the node where X is 
> [LARDFG] for the states?






[jira] [Updated] (SOLR-5405) Cloud graph view not usable by color-blind users - request small tweak

2013-10-30 Thread Stefan Matheis (steffkes) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Matheis (steffkes) updated SOLR-5405:


Component/s: web gui
   Assignee: Stefan Matheis (steffkes)

Nathan, if it's literally _that_ easy to make the UI work for color-blind users .. 
definitely doing this :)

I guess we could use colors which are more different than they are right now, 
but adding a mark like the one you suggest is fine too.

Any other suggestions? Happy to take them in .. the only thing i know (from 
hearing) is the typical problem with green/red .. but that's pretty much it, so 
input on that very much welcome :)

> Cloud graph view not usable by color-blind users - request small tweak
> --
>
> Key: SOLR-5405
> URL: https://issues.apache.org/jira/browse/SOLR-5405
> Project: Solr
>  Issue Type: Improvement
>  Components: web gui
>Affects Versions: 4.5
>Reporter: Nathan Neulinger
>Assignee: Stefan Matheis (steffkes)
>  Labels: accessibility
>
> Currently, the cloud view status is impossible to see easily on the graph 
> screen if you are color blind. (One of my coworkers is.)
> Would it be possible to put " (X)" after the IP of the node where X is 
> [LARDFG] for the states?






Re: Solr Hadoop dependencies

2013-10-30 Thread Steve Rowe
Petar,

The Lucene/Solr build system uses Ant+Ivy, not Maven, and we turn off 
transitive dependency resolution in Ivy, so the “extra” deps in the Hadoop POM 
don’t affect us.
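The transitive-resolution switch mentioned above is a per-dependency (or per-configuration) attribute in Ivy. A hypothetical ivy.xml fragment sketching the idea (the module name and revision here are illustrative, not copied from Lucene's actual build files):

```xml
<!-- Hypothetical fragment: transitive="false" tells Ivy to fetch only
     this artifact itself and ignore the dependencies declared in its POM. -->
<dependency org="org.apache.hadoop" name="hadoop-common"
            rev="2.0.5-alpha" transitive="false"/>
```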

Lucene/Solr releases include POMs with their Maven artifacts, and those POMs 
exclude transitive dependencies that we don’t depend on in the Ivy config.  Up 
to this point synchronizing the two has been a manual process, but I’m writing 
an Ant task to fully automate this setup: 
.

Steve

On Oct 30, 2013, at 6:24 PM, Petar Tahchiev  wrote:

> Cool,
> 
> I hope they can clean 2.3 before you release 4.6 :)
> 
> 
> 2013/10/31 Steve Rowe 
> Hi Petar,
> 
> This is already done, and will be included in the 4.6 release: 
> 
> 
> Steve
> 
> On Oct 30, 2013, at 6:12 PM, Petar Tahchiev  wrote:
> 
> > Hi guys,
> >
> > we're having a little chat on the Hadoop mailing list, why is Solr 
> > depending on hadoop 2.0.5-alpha:
> >
> > http://lucene.472066.n3.nabble.com/Question-on-hadoop-dependencies-td4098284.html
> >
> > Seems like the hadoop poms are bloated with unnecessary dependencies, so 
> > they promised to fix in the 2.3. If you migrate to 2.2 now it would be a 
> > lot easier to migrate to 2.3 when they are ready.
> >
> > Should I open a JIRA for that?
> >
> > --
> > Regards, Petar!
> > Karlovo, Bulgaria.
> > ---
> > Public PGP Key at: 
> > https://keyserver1.pgp.com/vkd/DownloadKey.event?keyid=0x19658550C3110611
> > Key Fingerprint: A369 A7EE 61BC 93A3 CDFF  55A5 1965 8550 C311 0611
> 
> 
> 
> 
> 
> 
> -- 
> Regards, Petar!
> Karlovo, Bulgaria.
> ---
> Public PGP Key at: 
> https://keyserver1.pgp.com/vkd/DownloadKey.event?keyid=0x19658550C3110611
> Key Fingerprint: A369 A7EE 61BC 93A3 CDFF  55A5 1965 8550 C311 0611





[jira] [Commented] (SOLR-5374) Support user configured doc-centric versioning rules

2013-10-30 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809714#comment-13809714
 ] 

Yonik Seeley commented on SOLR-5374:


Linking to SOLR-5406, which hopefully is the only issue stopping this from 
fully working.

> Support user configured doc-centric versioning rules
> 
>
> Key: SOLR-5374
> URL: https://issues.apache.org/jira/browse/SOLR-5374
> Project: Solr
>  Issue Type: Improvement
>Reporter: Hoss Man
>Assignee: Hoss Man
> Attachments: SOLR-5374.patch, SOLR-5374.patch, SOLR-5374.patch
>
>
> The existing optimistic concurrency features of Solr can be very handy for 
> ensuring that you are only updating/replacing the version of the doc you 
> think you are updating/replacing, w/o the risk of someone else 
> adding/removing the doc in the mean time -- but I've recently encountered 
> some situations where I really wanted to be able to let the client specify an 
> arbitrary version, on a per document basis, (ie: generated by an external 
> system, or perhaps a timestamp of when a file was last modified) and ensure 
> that the corresponding document update was processed only if the "new" 
> version is greater than the "old" version -- w/o needing to check exactly 
> which version is currently in Solr.  (ie: If a client wants to index version 
> 101 of a doc, that update should fail if version 102 is already in the index, 
> but succeed if the currently indexed version is 99 -- w/o the client needing 
> to ask Solr what the current version is)
> The idea Yonik brought up in SOLR-5298 (letting the client specify a 
> {{\_new\_version\_}} that would be used by the existing optimistic 
> concurrency code to control the assignment of the {{\_version\_}} field for 
> documents) looked like a good direction to go -- but after digging into the 
> way {{\_version\_}} is used internally I realized it requires a uniqueness 
> constraint across all update commands, that would make it impossible to allow 
> multiple independent documents to have the same {{\_version\_}}.
> So instead I've tackled the problem in a different way, using an 
> UpdateProcessor that is configured with a user-defined field to track a 
> "DocBasedVersion" and uses the RTG logic to figure out if the update is 
> allowed.
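The acceptance rule described in the issue (process an update only when the new document-based version is strictly greater than the stored one) can be sketched outside Solr. This is only an illustration of the comparison logic, not the actual UpdateProcessor from the patch:

```python
# Sketch of doc-based version checking: an update wins only if its
# external version is strictly greater than the version already stored.
index = {}  # doc id -> (version, document body)

def try_update(doc_id, new_version, body):
    """Apply the update only if new_version beats the stored version."""
    current = index.get(doc_id)
    if current is not None and new_version <= current[0]:
        return False  # stale update: rejected without a read-then-write cycle
    index[doc_id] = (new_version, body)
    return True

assert try_update("doc1", 101, "v101")      # empty index: accepted
assert not try_update("doc1", 99, "v99")    # 99 <= 101: rejected
assert try_update("doc1", 102, "v102")      # 102 > 101: accepted
```

The point of the strict greater-than comparison is that clients never need to ask Solr for the current version before sending an update.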






Re: Solr Hadoop dependencies

2013-10-30 Thread Petar Tahchiev
Cool,

I hope they can clean 2.3 before you release 4.6 :)


2013/10/31 Steve Rowe 

> Hi Petar,
>
> This is already done, and will be included in the 4.6 release: <
> https://issues.apache.org/jira/browse/SOLR-5382>
>
> Steve
>
> On Oct 30, 2013, at 6:12 PM, Petar Tahchiev  wrote:
>
> > Hi guys,
> >
> > we're having a little chat on the Hadoop mailing list, why is Solr
> depending on hadoop 2.0.5-alpha:
> >
> >
> http://lucene.472066.n3.nabble.com/Question-on-hadoop-dependencies-td4098284.html
> >
> > Seems like the hadoop poms are bloated with unnecessary dependencies, so
> they promised to fix in the 2.3. If you migrate to 2.2 now it would be a
> lot easier to migrate to 2.3 when they are ready.
> >
> > Should I open a JIRA for that?
> >
> > --
> > Regards, Petar!
> > Karlovo, Bulgaria.
> > ---
> > Public PGP Key at:
> https://keyserver1.pgp.com/vkd/DownloadKey.event?keyid=0x19658550C3110611
> > Key Fingerprint: A369 A7EE 61BC 93A3 CDFF  55A5 1965 8550 C311 0611
>
>
>
>


-- 
Regards, Petar!
Karlovo, Bulgaria.
---
Public PGP Key at:
https://keyserver1.pgp.com/vkd/DownloadKey.event?keyid=0x19658550C3110611
Key Fingerprint: A369 A7EE 61BC 93A3 CDFF  55A5 1965 8550 C311 0611


Re: Solr Hadoop dependencies

2013-10-30 Thread Steve Rowe
Hi Petar, 

This is already done, and will be included in the 4.6 release: 


Steve

On Oct 30, 2013, at 6:12 PM, Petar Tahchiev  wrote:

> Hi guys,
> 
> we're having a little chat on the Hadoop mailing list, why is Solr depending 
> on hadoop 2.0.5-alpha:
> 
> http://lucene.472066.n3.nabble.com/Question-on-hadoop-dependencies-td4098284.html
> 
> Seems like the hadoop poms are bloated with unnecessary dependencies, so they 
> promised to fix in the 2.3. If you migrate to 2.2 now it would be a lot 
> easier to migrate to 2.3 when they are ready.
> 
> Should I open a JIRA for that?
> 
> -- 
> Regards, Petar!
> Karlovo, Bulgaria.
> ---
> Public PGP Key at: 
> https://keyserver1.pgp.com/vkd/DownloadKey.event?keyid=0x19658550C3110611
> Key Fingerprint: A369 A7EE 61BC 93A3 CDFF  55A5 1965 8550 C311 0611





[jira] [Created] (SOLR-5406) CloudSolrServer doesn't propagate request params on a delete

2013-10-30 Thread Yonik Seeley (JIRA)
Yonik Seeley created SOLR-5406:
--

 Summary: CloudSolrServer doesn't propagate request params on a 
delete
 Key: SOLR-5406
 URL: https://issues.apache.org/jira/browse/SOLR-5406
 Project: Solr
  Issue Type: Bug
Reporter: Yonik Seeley
Priority: Minor


It appears that deletes using CloudSolrServer drop request params.







Solr Hadoop dependencies

2013-10-30 Thread Petar Tahchiev
Hi guys,

we're having a little chat on the Hadoop mailing list about why Solr is
depending on hadoop 2.0.5-alpha:

http://lucene.472066.n3.nabble.com/Question-on-hadoop-dependencies-td4098284.html

Seems like the hadoop poms are bloated with unnecessary dependencies, so
they promised to fix it in 2.3. If you migrate to 2.2 now, it would be a
lot easier to migrate to 2.3 when they are ready.

Should I open a JIRA for that?

-- 
Regards, Petar!
Karlovo, Bulgaria.
---
Public PGP Key at:
https://keyserver1.pgp.com/vkd/DownloadKey.event?keyid=0x19658550C3110611
Key Fingerprint: A369 A7EE 61BC 93A3 CDFF  55A5 1965 8550 C311 0611


[jira] [Created] (SOLR-5405) Cloud graph view not usable by color-blind users - request small tweak

2013-10-30 Thread Nathan Neulinger (JIRA)
Nathan Neulinger created SOLR-5405:
--

 Summary: Cloud graph view not usable by color-blind users - 
request small tweak
 Key: SOLR-5405
 URL: https://issues.apache.org/jira/browse/SOLR-5405
 Project: Solr
  Issue Type: Improvement
Affects Versions: 4.5
Reporter: Nathan Neulinger


Currently, the cloud view status is impossible to see easily on the graph 
screen if you are color blind. (One of my coworkers is.)

Would it be possible to put " (X)" after the IP of the node where X is [LARDFG] 
for the states?






[jira] [Commented] (LUCENE-5205) [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to classic QueryParser

2013-10-30 Thread Paul Elschot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809634#comment-13809634
 ] 

Paul Elschot commented on LUCENE-5205:
--

I missed this originally, sorry about that, but I just had a quick look at the 
patch.

I think this has a lot more possibilities than the surround parser. So much 
more that this might actually replace the surround parser.
Your target should be the queryparser module, I think. Hopefully that will bring 
more users and perhaps even some maintainers.

A few details:

There is no AND query; that is a pity, but I see the point. I remember the 
struggle I had to combine Boolean and Span queries in surround.
A user interface that provides a QueryFilter might well be enough for most 
users.

Are there test cases for the recursive queries? I may have overlooked them.

The source code indentation is not 2 spaces everywhere.



> [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to 
> classic QueryParser
> ---
>
> Key: LUCENE-5205
> URL: https://issues.apache.org/jira/browse/LUCENE-5205
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/queryparser
>Reporter: Tim Allison
>  Labels: patch
> Fix For: 4.6
>
> Attachments: SpanQueryParser_v1.patch.gz
>
>
> This parser includes functionality from:
> * Classic QueryParser: most of its syntax
> * SurroundQueryParser: recursive parsing for "near" and "not" clauses.
> * ComplexPhraseQueryParser: can handle "near" queries that include multiterms 
> (wildcard, fuzzy, regex, prefix),
> * AnalyzingQueryParser: has an option to analyze multiterms.
> Same as classic syntax:
> * term: test 
> * fuzzy: roam~0.8, roam~2
> * wildcard: te?t, test*, t*st
> * regex: /\[mb\]oat/
> * phrase: "jakarta apache"
> * phrase with slop: "jakarta apache"~3
> * default "or" clause: jakarta apache
> * grouping "or" clause: (jakarta apache)
>  
> Main additions in SpanQueryParser syntax vs. classic syntax:
> * Can require "in order" for phrases with slop with the \~> operator: 
> "jakarta apache"\~>3
> * Can specify "not near": "fever bieber"!\~3,10 ::
> find "fever" but not if "bieber" appears within 3 words before or 10 
> words after it.
> * Fully recursive phrasal queries with \[ and \]; as in: \[\[jakarta 
> apache\]~3 lucene\]\~>4 :: 
> find "jakarta" within 3 words of "apache", and that hit has to be within 
> four words before "lucene"
> * Can also use \[\] for single level phrasal queries instead of " as in: 
> \[jakarta apache\]
> * Can use "or grouping" clauses in phrasal queries: "apache (lucene solr)"\~3 
> :: find "apache" and then either "lucene" or "solr" within three words.
> * Can use multiterms in phrasal queries: "jakarta\~1 ap*che"\~2
> * Did I mention full recursion: \[\[jakarta\~1 ap*che\]\~2 (solr~ 
> /l\[ou\]\+\[cs\]\[en\]\+/)]\~10 :: Find something like "jakarta" within two 
> words of "ap*che" and that hit has to be within ten words of something like 
> "solr" or that "lucene" regex.
> In combination with a QueryFilter, has been very useful for concordance tasks 
> and for analytical search.  SpanQueries, of course, can also be used as a 
> Query for regular search via IndexSearcher.
> Until LUCENE-2878 is closed, this might have a use for fans of SpanQuery.
> Most of the documentation is in the javadoc for SpanQueryParser.
> I'm happy to throw this in the Sandbox, if desired.
> Any and all feedback is welcome.  Thank you.
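As a conceptual illustration of the "not near" operator described above ("fever bieber"!~3,10), here is a token-window sketch in Python. It mimics the stated semantics only; it is unrelated to the parser's actual SpanQuery implementation:

```python
# "not near": match positions of `target` unless `excluded` occurs within
# `before` tokens before or `after` tokens after that position.
def not_near_hits(tokens, target, excluded, before, after):
    hits = []
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        window = tokens[max(0, i - before):i] + tokens[i + 1:i + 1 + after]
        if excluded not in window:
            hits.append(i)
    return hits

tokens = "saturday night fever is not a bieber song".split()
# "bieber" sits 4 tokens after "fever" (index 2), so a 3,10 window rejects it
print(not_near_hits(tokens, "fever", "bieber", 3, 10))  # -> []
# with an after-window of only 3 tokens, "fever" is a hit again
print(not_near_hits(tokens, "fever", "bieber", 3, 3))   # -> [2]
```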






[jira] [Updated] (LUCENE-4072) CharFilter that Unicode-normalizes input

2013-10-30 Thread David Goldfarb (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Goldfarb updated LUCENE-4072:
---

Attachment: LUCENE-4072.patch

Indeed, changing the code to iterate over codepoints fixed a majority of the 
test failures.

The random tests still fail sometimes -- I believe there's a bug in 
Normalizer2. I submitted a bug report 
[here|http://bugs.icu-project.org/trac/ticket/10524#propertyform].

> CharFilter that Unicode-normalizes input
> 
>
> Key: LUCENE-4072
> URL: https://issues.apache.org/jira/browse/LUCENE-4072
> Project: Lucene - Core
>  Issue Type: New Feature
>  Components: modules/analysis
>Reporter: Ippei UKAI
> Attachments: DebugCode.txt, 
> ippeiukai-ICUNormalizer2CharFilter-4752cad.zip, LUCENE-4072.patch, 
> LUCENE-4072.patch, LUCENE-4072.patch, LUCENE-4072.patch
>
>
> I'd like to contribute a CharFilter that Unicode-normalizes input with ICU4J.
> The benefit of having this process as a CharFilter is that the tokenizer can 
> work on normalized text while offset correction ensures that the fast vector 
> highlighter and other offset-dependent features do not break.
> The implementation is available at following repository:
> https://github.com/ippeiukai/ICUNormalizer2CharFilter
> Unfortunately this is my unpaid side project and I cannot spend much time 
> merging my work into Lucene to make an appropriate patch. I'd appreciate it 
> if anyone could give it a go. I'm happy to relicense it to whatever meets 
> your needs.
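The kind of mapping such a normalizing CharFilter applies before tokenization can be illustrated with Python's built-in unicodedata module. This is only a conceptual analogy (NFKC here); the patch itself uses ICU4J's Normalizer2, which supports more normalization modes:

```python
import unicodedata

# NFKC folds compatibility characters (ligatures, full-width forms,
# circled digits, ...) into their canonical equivalents -- the sort of
# rewrite a normalizing CharFilter performs ahead of the tokenizer.
for raw in ["ﬁle", "Ａｐａｃｈｅ", "①"]:
    print(raw, "->", unicodedata.normalize("NFKC", raw))
# ﬁle -> file, Ａｐａｃｈｅ -> Apache, ① -> 1
```

The CharFilter's extra job, beyond this, is tracking how each rewrite shifts character offsets so that highlighting still points at the original text.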






[jira] [Commented] (SOLR-5374) Support user configured doc-centric versioning rules

2013-10-30 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809559#comment-13809559
 ] 

Yonik Seeley commented on SOLR-5374:


hmmm, in SolrCloud mode, somewhere in the mix del_version is being dropped.  
Not sure where yet...

> Support user configured doc-centric versioning rules
> 
>
> Key: SOLR-5374
> URL: https://issues.apache.org/jira/browse/SOLR-5374
> Project: Solr
>  Issue Type: Improvement
>Reporter: Hoss Man
>Assignee: Hoss Man
> Attachments: SOLR-5374.patch, SOLR-5374.patch, SOLR-5374.patch
>
>
> The existing optimistic concurrency features of Solr can be very handy for 
> ensuring that you are only updating/replacing the version of the doc you 
> think you are updating/replacing, w/o the risk of someone else 
> adding/removing the doc in the mean time -- but I've recently encountered 
> some situations where I really wanted to be able to let the client specify an 
> arbitrary version, on a per document basis, (ie: generated by an external 
> system, or perhaps a timestamp of when a file was last modified) and ensure 
> that the corresponding document update was processed only if the "new" 
> version is greater than the "old" version -- w/o needing to check exactly 
> which version is currently in Solr.  (ie: If a client wants to index version 
> 101 of a doc, that update should fail if version 102 is already in the index, 
> but succeed if the currently indexed version is 99 -- w/o the client needing 
> to ask Solr what the current version is)
> The idea Yonik brought up in SOLR-5298 (letting the client specify a 
> {{\_new\_version\_}} that would be used by the existing optimistic 
> concurrency code to control the assignment of the {{\_version\_}} field for 
> documents) looked like a good direction to go -- but after digging into the 
> way {{\_version\_}} is used internally I realized it requires a uniqueness 
> constraint across all update commands, that would make it impossible to allow 
> multiple independent documents to have the same {{\_version\_}}.
> So instead I've tackled the problem in a different way, using an 
> UpdateProcessor that is configured with a user-defined field to track a 
> "DocBasedVersion" and uses the RTG logic to figure out if the update is 
> allowed.






[jira] [Updated] (LUCENE-5318) Co-occurrence counts from Concordance

2013-10-30 Thread Tim Allison (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Allison updated LUCENE-5318:


Description: 
This patch calculates co-occurrence statistics on search terms within a window 
of x tokens.  This can help in synonym discovery and anywhere else 
co-occurrence stats have been used.

The attached patch depends on LUCENE-5317.

Again, many thanks to my colleague, Jason Robinson, for advice in developing 
this code and for his modifications to this code to make it more Solr-friendly.

  was:
This patch calculates co-occurrence statistics on search terms within a window 
of x tokens.  This can help in synonym discovery and anywhere else 
co-occurrence stats have been used.

The attached patch depends on LUCENE-5317.

Again, many thanks to my colleague, Jason Robinson, for advice in developing 
this code.


> Co-occurrence counts from Concordance
> -
>
> Key: LUCENE-5318
> URL: https://issues.apache.org/jira/browse/LUCENE-5318
> Project: Lucene - Core
>  Issue Type: New Feature
>  Components: core/search
>Affects Versions: 4.5
>Reporter: Tim Allison
>  Labels: patch
> Fix For: 4.6
>
> Attachments: cooccur_v1.patch.gz
>
>
> This patch calculates co-occurrence statistics on search terms within a 
> window of x tokens.  This can help in synonym discovery and anywhere else 
> co-occurrence stats have been used.
> The attached patch depends on LUCENE-5317.
> Again, many thanks to my colleague, Jason Robinson, for advice in developing 
> this code and for his modifications to this code to make it more 
> Solr-friendly.






[jira] [Updated] (LUCENE-5318) Co-occurrence counts from Concordance

2013-10-30 Thread Tim Allison (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Allison updated LUCENE-5318:


Attachment: cooccur_v1.patch.gz

I'd assess this as an 80% patch.  It works, but more refactoring would make it 
more useful/extensible.  



> Co-occurrence counts from Concordance
> -
>
> Key: LUCENE-5318
> URL: https://issues.apache.org/jira/browse/LUCENE-5318
> Project: Lucene - Core
>  Issue Type: New Feature
>  Components: core/search
>Affects Versions: 4.5
>Reporter: Tim Allison
>  Labels: patch
> Fix For: 4.6
>
> Attachments: cooccur_v1.patch.gz
>
>
> This patch calculates co-occurrence statistics on search terms within a 
> window of x tokens.  This can help in synonym discovery and anywhere else 
> co-occurrence stats have been used.
> The attached patch depends on LUCENE-5317.
> Again, many thanks to my colleague, Jason Robinson, for advice in developing 
> this code.






[jira] [Created] (LUCENE-5318) Co-occurrence counts from Concordance

2013-10-30 Thread Tim Allison (JIRA)
Tim Allison created LUCENE-5318:
---

 Summary: Co-occurrence counts from Concordance
 Key: LUCENE-5318
 URL: https://issues.apache.org/jira/browse/LUCENE-5318
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/search
Affects Versions: 4.5
Reporter: Tim Allison
 Fix For: 4.6
 Attachments: cooccur_v1.patch.gz

This patch calculates co-occurrence statistics on search terms within a window 
of x tokens.  This can help in synonym discovery and anywhere else 
co-occurrence stats have been used.

The attached patch depends on LUCENE-5317.

Again, many thanks to my colleague, Jason Robinson, for advice in developing 
this code.






[jira] [Updated] (LUCENE-5317) [PATCH] Concordance capability

2013-10-30 Thread Tim Allison (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Allison updated LUCENE-5317:


Attachment: concordance_v1.patch.gz

v1 of patch attached

> [PATCH] Concordance capability
> --
>
> Key: LUCENE-5317
> URL: https://issues.apache.org/jira/browse/LUCENE-5317
> Project: Lucene - Core
>  Issue Type: New Feature
>  Components: core/search
>Affects Versions: 4.5
>Reporter: Tim Allison
>  Labels: patch
> Fix For: 4.6
>
> Attachments: concordance_v1.patch.gz
>
>
> This patch enables a Lucene-powered concordance search capability.
> Concordances are extremely useful for linguists, lawyers and other analysts 
> performing analytic search vs. traditional snippeting/document retrieval 
> tasks.  By "analytic search," I mean that the user wants to browse every time 
> a term appears (or at least the topn)  in a subset of documents and see the 
> words before and after.  
> Concordance technology is far simpler and less interesting than IR relevance 
> models/methods, but it can be extremely useful for some use cases.
> Traditional concordance sort orders are available (sort on words before the 
> target, words after, target then words before and target then words after).
> Under the hood, this is running SpanQuery's getSpans() and reanalyzing to 
> obtain character offsets.  There is plenty of room for optimizations and 
> refactoring.
> Many thanks to my colleague, Jason Robinson, for input on the design of this 
> patch.
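A concordance (key-word-in-context) view of the kind described can be sketched in a few lines. This illustrates the windowing and one of the traditional sort orders conceptually; it is unrelated to the patch's actual SpanQuery-based implementation:

```python
# KWIC: for each occurrence of `target`, capture `width` tokens before
# and after, then sort rows on the words following the target (one of
# the traditional concordance sort orders).
def concordance(tokens, target, width=2):
    rows = []
    for i, tok in enumerate(tokens):
        if tok == target:
            rows.append((tokens[max(0, i - width):i],  # left context
                         tok,
                         tokens[i + 1:i + 1 + width]))  # right context
    return sorted(rows, key=lambda row: row[2])  # sort on words after target

text = "the cat sat and the cat ran while a dog sat".split()
for before, hit, after in concordance(text, "cat"):
    print(" ".join(before), "|", hit, "|", " ".join(after))
# and the | cat | ran while
# the | cat | sat and
```

Sorting on the left context, or on the target plus either context, gives the other sort orders mentioned in the description.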






[jira] [Created] (LUCENE-5317) [PATCH] Concordance capability

2013-10-30 Thread Tim Allison (JIRA)
Tim Allison created LUCENE-5317:
---

 Summary: [PATCH] Concordance capability
 Key: LUCENE-5317
 URL: https://issues.apache.org/jira/browse/LUCENE-5317
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/search
Affects Versions: 4.5
Reporter: Tim Allison
 Fix For: 4.6
 Attachments: concordance_v1.patch.gz

This patch enables a Lucene-powered concordance search capability.

Concordances are extremely useful for linguists, lawyers, and other analysts 
performing analytic search, as opposed to traditional snippeting/document 
retrieval tasks.  By "analytic search," I mean that the user wants to browse 
every occurrence of a term (or at least the top n) in a subset of documents 
and see the words before and after.

Concordance technology is far simpler and less interesting than IR relevance 
models/methods, but it can be extremely useful for some use cases.

Traditional concordance sort orders are available (sort on words before the 
target, words after, target then words before and target then words after).

Under the hood, this is running SpanQuery's getSpans() and reanalyzing to 
obtain character offsets.  There is plenty of room for optimizations and 
refactoring.

Many thanks to my colleague, Jason Robinson, for input on the design of this 
patch.
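The sort orders described above are easy to see in a toy keyword-in-context sketch. This is illustrative only: the function, tokenization, and sample text below are hypothetical, not from the attached patch, which works on SpanQuery spans and character offsets.

```python
def concordance(tokens, target, width=3):
    """Toy keyword-in-context: collect every occurrence of `target`
    with up to `width` words of left/right context."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok == target:
            hits.append((tokens[max(0, i - width):i],    # words before
                         tok,
                         tokens[i + 1:i + 1 + width]))   # words after
    # one traditional concordance sort order: by the words after the target
    hits.sort(key=lambda h: h[2])
    return hits

text = "the quick fox saw the lazy fox run".split()
for left, t, right in concordance(text, "fox"):
    print(" ".join(left), "|", t, "|", " ".join(right))
```

The other traditional orders mentioned in the description (sort on words before, target-then-before, target-then-after) are just different sort keys over the same triples.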






[jira] [Updated] (SOLR-5320) Multi level compositeId router

2013-10-30 Thread Anshum Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anshum Gupta updated SOLR-5320:
---

Attachment: SOLR-5320.patch

Some minor changes.

> Multi level compositeId router
> --
>
> Key: SOLR-5320
> URL: https://issues.apache.org/jira/browse/SOLR-5320
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Reporter: Anshum Gupta
> Attachments: SOLR-5320.patch, SOLR-5320.patch, SOLR-5320.patch, 
> SOLR-5320-refactored.patch
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> This would enable multi-level routing, as compared to the two-level routing 
> available as of now. On the usage side, here's an example:
> Document Id: myapp!dummyuser!doc
> myapp!dummyuser! can be used as the shard key for searching content for 
> dummyuser.
> myapp! can be used for searching across all users of myapp.
> I am looking at either 3- or 4-level routing. The 32-bit hash would then 
> be composed of equal slices from each part (8 bits x 4 parts in the 4-level case).
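A rough sketch of the bit-slicing idea described above. Note this is illustrative only: Solr's compositeId router actually uses MurmurHash3, not CRC32, and the real masking scheme may differ.

```python
import zlib

def composite_hash(doc_id, levels=4):
    """Give each '!'-separated part of the id an equal slice of the
    32-bit route hash: 8 bits per part when there are 4 levels."""
    parts = doc_id.split("!")[:levels]
    bits_per = 32 // levels                      # 8 bits for 4 levels
    h = 0
    for i, part in enumerate(parts):
        piece = zlib.crc32(part.encode()) & ((1 << bits_per) - 1)
        h |= piece << (32 - bits_per * (i + 1))  # most-significant slice first
    return h

# ids sharing a prefix share the high-order hash bits, so they route together
a = composite_hash("myapp!dummyuser!doc1!x")
b = composite_hash("myapp!dummyuser!doc2!y")
print(a >> 16 == b >> 16)  # True: same top 16 bits for the shared 2-part prefix
```

Because the high bits come from the leading parts, a hash-range query for `myapp!` or `myapp!dummyuser!` selects a contiguous slice of shards.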






[jira] [Commented] (SOLR-5381) Split Clusterstate and scale

2013-10-30 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809357#comment-13809357
 ] 

Noble Paul commented on SOLR-5381:
--

bq. Is it not possible to have a pool with X threads (X can be configurable) that 
treats external collections?

Makes sense.
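The suggestion amounts to a fixed-size worker pool servicing the external collections. A minimal sketch of the idea, with hypothetical names (this is not Solr's Overseer code):

```python
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 4  # "X threads", configurable

def refresh_external_collection(name):
    # placeholder for fetching/refreshing one external collection's state
    return "refreshed " + name

collections = ["coll_a", "coll_b", "coll_c"]
with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
    # the pool bounds concurrency no matter how many collections exist
    results = list(pool.map(refresh_external_collection, collections))
print(results)  # ['refreshed coll_a', 'refreshed coll_b', 'refreshed coll_c']
```

The point of the fixed pool is that work on thousands of external collections never spawns more than X concurrent watchers/refreshers.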

> Split Clusterstate and scale 
> -
>
> Key: SOLR-5381
> URL: https://issues.apache.org/jira/browse/SOLR-5381
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Reporter: Noble Paul
>Assignee: Noble Paul
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> clusterstate.json is a single point of contention for all components in 
> SolrCloud. It would be hard to scale SolrCloud beyond a few thousand nodes 
> because there are too many updates and too many nodes need to be notified of 
> the changes. As the number of nodes goes up, the size of clusterstate.json 
> keeps growing and will soon exceed the size limit imposed by ZooKeeper.
> The first step is to store the shard information in separate nodes so that each 
> node can just listen to the shard node it belongs to. We may also need to 
> split each collection into its own node, with clusterstate.json holding just 
> the names of the collections.
> This is an umbrella issue






[jira] [Commented] (SOLR-5402) SolrCloud 4.5 bulk add errors in cloud setup

2013-10-30 Thread Michael Tracey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809355#comment-13809355
 ] 

Michael Tracey commented on SOLR-5402:
--

Yes, I've found that 50 docs will trigger the problem.
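For anyone bisecting the failing batch size, a trivial chunking helper shows the idea (illustrative Python; SolrJ users would pass each sublist to a separate `add` call):

```python
def batches(docs, size):
    """Yield successive chunks of at most `size` docs so each update
    request stays small while narrowing down the failing batch size."""
    for i in range(0, len(docs), size):
        yield docs[i:i + size]

docs = list(range(500))              # stand-in for 500 input documents
chunks = list(batches(docs, 50))
print(len(chunks), len(chunks[-1]))  # 10 50
```

Halving the chunk size on each failing run quickly isolates the smallest batch that still reproduces the parse error.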

> SolrCloud 4.5 bulk add errors in cloud setup
> 
>
> Key: SOLR-5402
> URL: https://issues.apache.org/jira/browse/SOLR-5402
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 4.5, 4.5.1
>Reporter: Sai Gadde
> Fix For: 4.6
>
>
> We use out-of-the-box Solr 4.5.1 with no customization done. If we merge 
> documents via SolrJ to a single server, it works perfectly fine.
> But as soon as we add another node to the cloud, we get the following error 
> while merging documents. We merge about 500 at a time using SolrJ. These 500 
> documents are a few MB (1-3) in total size.
> This is the error we are getting on the server where merging happens 
> (10.10.10.116; the IP is irrelevant, shown just for clarity). 10.10.10.119 is 
> the new node here. This server gets a RemoteSolrException:
> shard update error StdNode: 
> http://10.10.10.119:8980/solr/mycore/:org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:
>  Illegal to have multiple roots (start tag in epilog?).
>  at [row,col {unknown-source}]: [1,12468]
>   at 
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:425)
>   at 
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
>   at 
> org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:401)
>   at 
> org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:1)
>   at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
>   at java.util.concurrent.FutureTask.run(Unknown Source)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
>   at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
>   at java.util.concurrent.FutureTask.run(Unknown Source)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown 
> Source)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>   at java.lang.Thread.run(Unknown Source)
> On the other server 10.10.10.119 we get following error
> org.apache.solr.common.SolrException: Illegal to have multiple roots (start 
> tag in epilog?).
>  at [row,col {unknown-source}]: [1,12468]
>   at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:176)
>   at 
> org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
>   at 
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
>   at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:703)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:406)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195)
>   at 
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
>   at 
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
>   at 
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
>   at 
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
>   at 
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
>   at 
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
>   at 
> org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:936)
>   at 
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
>   at 
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
>   at 
> org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1004)
>   at 
> org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
>   at 
> org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> Caused by: com.ctc.wstx.exc.WstxParsingException: Illegal to have multiple 
> roots (start tag in epilog?).
>  at [row,col {unknown-source}]: [1,12369]
>   at 
> com.ctc.wstx.sr.StreamScanner.constru

[jira] [Commented] (SOLR-5402) SolrCloud 4.5 bulk add errors in cloud setup

2013-10-30 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809341#comment-13809341
 ] 

Mark Miller commented on SOLR-5402:
---

Have you tried fewer than 500 docs at a time to see if the problem persists?

> SolrCloud 4.5 bulk add errors in cloud setup
> 
>
> Key: SOLR-5402
> URL: https://issues.apache.org/jira/browse/SOLR-5402
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 4.5, 4.5.1
>Reporter: Sai Gadde
> Fix For: 4.6
>
>
> We use out-of-the-box Solr 4.5.1 with no customization done. If we merge 
> documents via SolrJ to a single server, it works perfectly fine.
> But as soon as we add another node to the cloud, we get the following error 
> while merging documents. We merge about 500 at a time using SolrJ. These 500 
> documents are a few MB (1-3) in total size.
> This is the error we are getting on the server where merging happens 
> (10.10.10.116; the IP is irrelevant, shown just for clarity). 10.10.10.119 is 
> the new node here. This server gets a RemoteSolrException:
> shard update error StdNode: 
> http://10.10.10.119:8980/solr/mycore/:org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:
>  Illegal to have multiple roots (start tag in epilog?).
>  at [row,col {unknown-source}]: [1,12468]
>   at 
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:425)
>   at 
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
>   at 
> org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:401)
>   at 
> org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:1)
>   at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
>   at java.util.concurrent.FutureTask.run(Unknown Source)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
>   at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
>   at java.util.concurrent.FutureTask.run(Unknown Source)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown 
> Source)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>   at java.lang.Thread.run(Unknown Source)
> On the other server 10.10.10.119 we get following error
> org.apache.solr.common.SolrException: Illegal to have multiple roots (start 
> tag in epilog?).
>  at [row,col {unknown-source}]: [1,12468]
>   at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:176)
>   at 
> org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
>   at 
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
>   at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:703)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:406)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195)
>   at 
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
>   at 
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
>   at 
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
>   at 
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
>   at 
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
>   at 
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
>   at 
> org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:936)
>   at 
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
>   at 
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
>   at 
> org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1004)
>   at 
> org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
>   at 
> org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> Caused by: com.ctc.wstx.exc.WstxParsingException: Illegal to have multiple 
> roots (start tag in epilog?).
>  at [row,col {unknown-source}]: [1,12369]
>   at 
> com.ctc.wstx.sr.StreamS

Re: [VOTE] Release PyLucene 4.5.1-1

2013-10-30 Thread Thomas Koch
+1

I could build JCC 2.18 and pylucene-4.5.1-1 on Mac OS X (10.8.5) with Python 2.7 
and Java 1.6.0_65; all tests pass.

Note (minor): I always get the following error upon the first call of "make 
install" - that's not a real issue, since the second call succeeds - just 
annoying ;-)

I'm using a virtualenv for the build, but that shouldn't really matter, I guess...

Reading http://pypi.python.org/simple/
No local packages or download links found for lucene==4.5.1
Best match: None
Traceback (most recent call last):
  File 
"/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py",
 line 162, in _run_module_as_main
"__main__", fname, loader, pkg_name)
  File 
"/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py",
 line 72, in _run_code
exec code in run_globals
  File 
"/Users/koch/.virtualenvs/pylucene/lib/python2.7/site-packages/JCC-2.18-py2.7-macosx-10.8-intel.egg/jcc/__main__.py",
 line 107, in 
cpp.jcc(sys.argv)
  File 
"/Users/koch/.virtualenvs/pylucene/lib/python2.7/site-packages/JCC-2.18-py2.7-macosx-10.8-intel.egg/jcc/cpp.py",
 line 541, in jcc
egg_info, extra_setup_args)
  File 
"/Users/koch/.virtualenvs/pylucene/lib/python2.7/site-packages/JCC-2.18-py2.7-macosx-10.8-intel.egg/jcc/python.py",
 line 1894, in compile
setup(**args)
  File 
"/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/core.py",
 line 152, in setup
dist.run_commands()
  File 
"/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/dist.py",
 line 953, in run_commands
self.run_command(cmd)
  File 
"/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/dist.py",
 line 972, in run_command
cmd_obj.run()
  File 
"/Users/koch/.virtualenvs/pylucene/lib/python2.7/site-packages/setuptools-0.6c11-py2.7.egg/setuptools/command/install.py",
 line 76, in run
  File 
"/Users/koch/.virtualenvs/pylucene/lib/python2.7/site-packages/setuptools-0.6c11-py2.7.egg/setuptools/command/install.py",
 line 104, in do_egg_install
  File 
"/Users/koch/.virtualenvs/pylucene/lib/python2.7/site-packages/setuptools-0.6c11-py2.7.egg/setuptools/command/easy_install.py",
 line 211, in run
  File 
"/Users/koch/.virtualenvs/pylucene/lib/python2.7/site-packages/setuptools-0.6c11-py2.7.egg/setuptools/command/easy_install.py",
 line 427, in easy_install
  File 
"/Users/koch/.virtualenvs/pylucene/lib/python2.7/site-packages/setuptools-0.6c11-py2.7.egg/setuptools/command/easy_install.py",
 line 478, in install_item
  File 
"/Users/koch/.virtualenvs/pylucene/lib/python2.7/site-packages/setuptools-0.6c11-py2.7.egg/setuptools/command/easy_install.py",
 line 519, in process_distribution
  File 
"/Users/koch/.virtualenvs/pylucene/lib/python2.7/site-packages/setuptools-0.6c11-py2.7.egg/pkg_resources.py",
 line 563, in resolve
  File 
"/Users/koch/.virtualenvs/pylucene/lib/python2.7/site-packages/setuptools-0.6c11-py2.7.egg/pkg_resources.py",
 line 799, in best_match
  File 
"/Users/koch/.virtualenvs/pylucene/lib/python2.7/site-packages/setuptools-0.6c11-py2.7.egg/pkg_resources.py",
 line 811, in obtain
  File 
"/Users/koch/.virtualenvs/pylucene/lib/python2.7/site-packages/setuptools-0.6c11-py2.7.egg/setuptools/command/easy_install.py",
 line 434, in easy_install
  File 
"/Users/koch/.virtualenvs/pylucene/lib/python2.7/site-packages/setuptools-0.6c11-py2.7.egg/setuptools/package_index.py",
 line 475, in fetch_distribution
AttributeError: 'NoneType' object has no attribute 'clone'
make: *** [install] Error 255
(pylucene)ios:pylucene-4.5.1-1 koch$ 


I'm using setuptools.__version__ '0.6c11'. From what I've read so far, I 
guess an update of setuptools should fix this, right?

The current version is setuptools 1.1.7 - it seems I've missed a decade of updates...
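For what it's worth, '0.6c11' does sort before modern releases even under a naive numeric comparison. The helper below is a toy check of that claim only; real tooling should use pkg_resources' version parsing, not this.

```python
def version_tuple(v):
    """Crude version key: treat the pre-release marker 'c' as a dot
    and compare numerically (toy comparison, not PEP 440 semantics)."""
    parts = []
    for p in v.replace("c", ".").split("."):
        parts.append(int(p) if p.isdigit() else 0)
    return tuple(parts)

print(version_tuple("0.6c11") < version_tuple("1.1.7"))  # True: 0.6c11 is older
```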


regards,
Thomas
--
Am 29.10.2013 um 01:46 schrieb Andi Vajda :

> 
> The PyLucene 4.5.1-1 release tracking the recent release of Apache Lucene 
> 4.5.1 is ready.
> 
> A release candidate is available from:
> http://people.apache.org/~vajda/staging_area/
> 
> A list of changes in this release can be seen at:
> http://svn.apache.org/repos/asf/lucene/pylucene/branches/pylucene_4_5/CHANGES
> 
> PyLucene 4.5.1 is built with JCC 2.18 included in these release artifacts:
> http://svn.apache.org/repos/asf/lucene/pylucene/trunk/jcc/CHANGES
> 
> A list of Lucene Java changes can be seen at:
> http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_4_5_1/lucene/CHANGES.txt
> 
> Please vote to release these artifacts as PyLucene 4.5.1-1.
> 
> Thanks !
> 
> Andi..
> 
> ps: the KEYS file for PyLucene release signing is at:
> http://svn.apache.org/repos/asf/lucene/pylucene/dist/KEYS
> http://people.apache.org/~vajda/staging_area/KEYS
> 
> pps: here is my +1



[jira] [Commented] (LUCENE-5205) [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to classic QueryParser

2013-10-30 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809285#comment-13809285
 ] 

Tim Allison commented on LUCENE-5205:
-

If the community thinks this parser is worthwhile to add to Lucene, and if 
there's someone with the time and willingness to help me shape this into 
something worthy of the project, I'm more than happy to make modifications 
(e.g., using JavaCC and other refactoring).  Or, better yet, if there are mods 
that can be made to the current code to include the above functionality, I'd be 
happy to work on those.

I'll post updates as I add more tests and make bug fixes to my local version.

Thank you!
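For readers new to span semantics, here is a self-contained sketch of the "near within slop" check that queries like "jakarta apache"~3 and the ~> in-order operator express. This is illustrative position arithmetic only, not Lucene's Spans implementation.

```python
def near(pos_a, pos_b, slop, in_order=False):
    """Do the two terms ever occur within `slop` intervening positions?
    With in_order=True, term b must follow term a (the ~> operator)."""
    for a in pos_a:
        for b in pos_b:
            if in_order and b <= a:
                continue
            if abs(b - a) - 1 <= slop:
                return True
    return False

jakarta = [4, 17]   # token positions of "jakarta" in one document
apache = [7, 40]    # token positions of "apache"
print(near(jakarta, apache, 3))                 # True: positions 4 and 7 qualify
print(near(jakarta, apache, 3, in_order=True))  # True as well: 7 follows 4
```

The parser's recursive bracket syntax then just nests this check: an inner near-match produces a span whose positions feed the outer near-check.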


> [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to 
> classic QueryParser
> ---
>
> Key: LUCENE-5205
> URL: https://issues.apache.org/jira/browse/LUCENE-5205
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/queryparser
>Reporter: Tim Allison
>  Labels: patch
> Fix For: 4.6
>
> Attachments: SpanQueryParser_v1.patch.gz
>
>
> This parser includes functionality from:
> * Classic QueryParser: most of its syntax
> * SurroundQueryParser: recursive parsing for "near" and "not" clauses.
> * ComplexPhraseQueryParser: can handle "near" queries that include multiterms 
> (wildcard, fuzzy, regex, prefix),
> * AnalyzingQueryParser: has an option to analyze multiterms.
> Same as classic syntax:
> * term: test 
> * fuzzy: roam~0.8, roam~2
> * wildcard: te?t, test*, t*st
> * regex: /\[mb\]oat/
> * phrase: "jakarta apache"
> * phrase with slop: "jakarta apache"~3
> * default "or" clause: jakarta apache
> * grouping "or" clause: (jakarta apache)
>  
> Main additions in SpanQueryParser syntax vs. classic syntax:
> * Can require "in order" for phrases with slop with the \~> operator: 
> "jakarta apache"\~>3
> * Can specify "not near": "fever bieber"!\~3,10 ::
> find "fever" but not if "bieber" appears within 3 words before or 10 
> words after it.
> * Fully recursive phrasal queries with \[ and \]; as in: \[\[jakarta 
> apache\]~3 lucene\]\~>4 :: 
> find "jakarta" within 3 words of "apache", and that hit has to be within 
> four words before "lucene"
> * Can also use \[\] for single level phrasal queries instead of " as in: 
> \[jakarta apache\]
> * Can use "or grouping" clauses in phrasal queries: "apache (lucene solr)"\~3 
> :: find "apache" and then either "lucene" or "solr" within three words.
> * Can use multiterms in phrasal queries: "jakarta\~1 ap*che"\~2
> * Did I mention full recursion: \[\[jakarta\~1 ap*che\]\~2 (solr~ 
> /l\[ou\]\+\[cs\]\[en\]\+/)]\~10 :: Find something like "jakarta" within two 
> words of "ap*che" and that hit has to be within ten words of something like 
> "solr" or that "lucene" regex.
> In combination with a QueryFilter, has been very useful for concordance tasks 
> and for analytical search.  SpanQueries, of course, can also be used as a 
> Query for regular search via IndexSearcher.
> Until LUCENE-2878 is closed, this might have a use for fans of SpanQuery.
> Most of the documentation is in the javadoc for SpanQueryParser.
> I'm happy to throw this in the Sandbox, if desired.
> Any and all feedback is welcome.  Thank you.






[jira] [Updated] (LUCENE-5205) [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to classic QueryParser

2013-10-30 Thread Tim Allison (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Allison updated LUCENE-5205:


Description: 
This parser includes functionality from:

* Classic QueryParser: most of its syntax
* SurroundQueryParser: recursive parsing for "near" and "not" clauses.
* ComplexPhraseQueryParser: can handle "near" queries that include multiterms 
(wildcard, fuzzy, regex, prefix),
* AnalyzingQueryParser: has an option to analyze multiterms.


Same as classic syntax:
* term: test 
* fuzzy: roam~0.8, roam~2
* wildcard: te?t, test*, t*st
* regex: /\[mb\]oat/
* phrase: "jakarta apache"
* phrase with slop: "jakarta apache"~3
* default "or" clause: jakarta apache
* grouping "or" clause: (jakarta apache)
 
Main additions in SpanQueryParser syntax vs. classic syntax:
* Can require "in order" for phrases with slop with the \~> operator: "jakarta 
apache"\~>3
* Can specify "not near": "fever bieber"!\~3,10 ::
find "fever" but not if "bieber" appears within 3 words before or 10 words 
after it.
* Fully recursive phrasal queries with \[ and \]; as in: \[\[jakarta apache\]~3 
lucene\]\~>4 :: 
find "jakarta" within 3 words of "apache", and that hit has to be within 
four words before "lucene"
* Can also use \[\] for single level phrasal queries instead of " as in: 
\[jakarta apache\]
* Can use "or grouping" clauses in phrasal queries: "apache (lucene solr)"\~3 
:: find "apache" and then either "lucene" or "solr" within three words.
* Can use multiterms in phrasal queries: "jakarta\~1 ap*che"\~2
* Did I mention full recursion: \[\[jakarta\~1 ap*che\]\~2 (solr~ 
/l\[ou\]\+\[cs\]\[en\]\+/)]\~10 :: Find something like "jakarta" within two 
words of "ap*che" and that hit has to be within ten words of something like 
"solr" or that "lucene" regex.

In combination with a QueryFilter, has been very useful for concordance tasks 
and for analytical search.  SpanQueries, of course, can also be used as a Query 
for regular search via IndexSearcher.

Until LUCENE-2878 is closed, this might have a use for fans of SpanQuery.

Most of the documentation is in the javadoc for SpanQueryParser.

I'm happy to throw this in the Sandbox, if desired.

Any and all feedback is welcome.  Thank you.

  was:
This parser includes functionality from:

*Classic QueryParser: most of its syntax
*SurroundQueryParser: recursive parsing for "near" and "not" clauses.
*ComplexPhraseQueryParser: can handle "near" queries that include multiterms 
(wildcard, fuzzy, regex, prefix),
*AnalyzingQueryParser: has an option to analyze multiterms.

In combination with a QueryFilter, has been very useful for concordance tasks 
and for analytical search.  SpanQueries, of course, can also be used as a Query 
for regular search via IndexSearcher.

Until LUCENE-2878 is closed, this might have a use for fans of SpanQuery.

Most of the documentation is in the javadoc for SpanQueryParser.

I'm happy to throw this in the Sandbox, if desired.

Any and all feedback is welcome.  Thank you.


> [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to 
> classic QueryParser
> ---
>
> Key: LUCENE-5205
> URL: https://issues.apache.org/jira/browse/LUCENE-5205
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/queryparser
>Reporter: Tim Allison
>  Labels: patch
> Fix For: 4.6
>
> Attachments: SpanQueryParser_v1.patch.gz
>
>
> This parser includes functionality from:
> * Classic QueryParser: most of its syntax
> * SurroundQueryParser: recursive parsing for "near" and "not" clauses.
> * ComplexPhraseQueryParser: can handle "near" queries that include multiterms 
> (wildcard, fuzzy, regex, prefix),
> * AnalyzingQueryParser: has an option to analyze multiterms.
> Same as classic syntax:
> * term: test 
> * fuzzy: roam~0.8, roam~2
> * wildcard: te?t, test*, t*st
> * regex: /\[mb\]oat/
> * phrase: "jakarta apache"
> * phrase with slop: "jakarta apache"~3
> * default "or" clause: jakarta apache
> * grouping "or" clause: (jakarta apache)
>  
> Main additions in SpanQueryParser syntax vs. classic syntax:
> * Can require "in order" for phrases with slop with the \~> operator: 
> "jakarta apache"\~>3
> * Can specify "not near": "fever bieber"!\~3,10 ::
> find "fever" but not if "bieber" appears within 3 words before or 10 
> words after it.
> * Fully recursive phrasal queries with \[ and \]; as in: \[\[jakarta 
> apache\]~3 lucene\]\~>4 :: 
> find "jakarta" within 3 words of "apache", and that hit has to be within 
> four words before "lucene"
> * Can also use \[\] for single level phrasal queries instead of " as in: 
> \[jakarta apache\]
> * Can use "or grouping" clauses in phrasal queries: "apache (lucene solr)"\~3 
> :: find "apache" and then either "

[jira] [Commented] (LUCENE-5296) Add DirectDocValuesFormat

2013-10-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809255#comment-13809255
 ] 

ASF subversion and git services commented on LUCENE-5296:
-

Commit 1537141 from [~mikemccand] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1537141 ]

LUCENE-5296: clarify the 2.1B value count limit for sorted set field

> Add DirectDocValuesFormat
> -
>
> Key: LUCENE-5296
> URL: https://issues.apache.org/jira/browse/LUCENE-5296
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 4.6, 5.0
>
> Attachments: LUCENE-5296.patch
>
>
> Indexes values to disk but at search time it loads/accesses the values via 
> simple java arrays (i.e. no compression).






[jira] [Commented] (LUCENE-5296) Add DirectDocValuesFormat

2013-10-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809254#comment-13809254
 ] 

ASF subversion and git services commented on LUCENE-5296:
-

Commit 1537140 from [~mikemccand] in branch 'dev/trunk'
[ https://svn.apache.org/r1537140 ]

LUCENE-5296: clarify the 2.1B value count limit for sorted set field

> Add DirectDocValuesFormat
> -
>
> Key: LUCENE-5296
> URL: https://issues.apache.org/jira/browse/LUCENE-5296
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 4.6, 5.0
>
> Attachments: LUCENE-5296.patch
>
>
> Indexes values to disk but at search time it loads/accesses the values via 
> simple java arrays (i.e. no compression).






[jira] [Commented] (LUCENE-5316) Taxonomy tree traversing improvement

2013-10-30 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809247#comment-13809247
 ] 

Michael McCandless commented on LUCENE-5316:


Thanks Gilad. I ran a performance test on the full English Wikipedia.  I
indexed all dims as NO_PARENTS so we exercise visiting all children
(rollup).  Also, two flat dims (username, categories) have high
cardinality ... it looks like there is a high fixed cost, because the
relatively easy queries seem to lose the most performance:

{noformat}
                Task    QPS base  StdDev    QPS comp  StdDev              Pct diff
          AndHighLow       59.93  (2.6%)      44.81  (4.9%)   -25.2% ( -31% - -18%)
           MedPhrase       42.82  (2.1%)      34.59  (4.1%)   -19.2% ( -24% - -13%)
             LowTerm       40.13  (1.8%)      32.68  (3.3%)   -18.6% ( -23% - -13%)
        OrNotHighLow       33.20  (3.8%)      27.64  (4.2%)   -16.8% ( -23% -  -9%)
              Fuzzy1       30.74  (1.6%)      26.15  (3.0%)   -14.9% ( -19% - -10%)
              Fuzzy2       24.95  (1.8%)      21.78  (2.9%)   -12.7% ( -17% -  -8%)
     LowSloppyPhrase       24.11  (1.2%)      21.22  (2.8%)   -12.0% ( -15% -  -8%)
        OrNotHighMed       19.85  (2.8%)      17.68  (3.4%)   -10.9% ( -16% -  -4%)
         MedSpanNear       17.94  (2.3%)      16.34  (3.4%)    -8.9% ( -14% -  -3%)
          AndHighMed       16.26  (1.3%)      14.88  (2.1%)    -8.4% ( -11% -  -5%)
         AndHighHigh       13.95  (0.9%)      12.91  (1.9%)    -7.4% ( -10% -  -4%)
             Prefix3       13.24  (1.2%)      12.34  (1.9%)    -6.8% (  -9% -  -3%)
             MedTerm       12.85  (0.8%)      12.01  (1.8%)    -6.6% (  -9% -  -3%)
       OrNotHighHigh        9.79  (1.5%)       9.29  (2.2%)    -5.1% (  -8% -  -1%)
           LowPhrase        9.50  (5.5%)       9.05  (4.9%)    -4.7% ( -14% -   6%)
            HighTerm        8.65  (1.2%)       8.28  (1.7%)    -4.2% (  -7% -  -1%)
         LowSpanNear        7.40  (3.8%)       7.13  (4.6%)    -3.7% ( -11% -   4%)
        OrHighNotMed        7.19  (1.4%)       6.96  (1.7%)    -3.2% (  -6% -   0%)
           OrHighMed        5.64  (1.3%)       5.51  (1.4%)    -2.3% (  -4% -   0%)
       OrHighNotHigh        4.80  (1.3%)       4.71  (2.1%)    -1.9% (  -5% -   1%)
            Wildcard        4.60  (1.8%)       4.51  (1.5%)    -1.8% (  -5% -   1%)
        OrHighNotLow        4.12  (1.1%)       4.06  (1.5%)    -1.5% (  -3% -   1%)
           OrHighLow        2.81  (1.1%)       2.78  (1.2%)    -1.0% (  -3% -   1%)
        HighSpanNear        3.24  (2.9%)       3.22  (3.0%)    -0.8% (  -6% -   5%)
          OrHighHigh        2.09  (1.1%)       2.08  (1.4%)    -0.6% (  -3% -   1%)
              IntNRQ        1.50  (1.1%)       1.50  (0.8%)    -0.6% (  -2% -   1%)
     MedSloppyPhrase        3.20  (5.2%)       3.19  (7.0%)    -0.5% ( -11% -  12%)
          HighPhrase        2.73  (6.3%)       2.72  (5.7%)    -0.3% ( -11% -  12%)
             Respell       52.93  (2.4%)      52.76  (3.2%)    -0.3% (  -5% -   5%)
    HighSloppyPhrase        3.28  (6.9%)       3.32  (9.3%)     1.2% ( -14% -  18%)
{noformat}

Maybe ... we should allow .getChildren to return null when that ord
has no children?  I.e., I think the cost might be because are visiting
a great many ords for the "flat" fields.

Separately, perhaps taxo index could tell us when a dim is flat and
then we can avoid doing rollup.  Really for such dims, I should index
them as ALL_BUT_DIM, but it'd be nice if I do NO_PARENTS if the facets
code detected that the dim is actually flat and optimized
accordingly...
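The optimization hinted at above can be sketched as follows; ChildrenSource, the null-for-no-children convention, and the rollup shape are hypothetical stand-ins for the facets API, not Lucene's actual code:

```java
public class FlatDimRollup {

  /** Hypothetical children accessor; may return null when the ordinal is flat. */
  interface ChildrenSource {
    int[] getChildren(int ord);
  }

  /** Sum descendant counts into the parent, skipping all work for flat dims. */
  static int rollup(int dimOrd, ChildrenSource tree, int[] counts) {
    int[] kids = tree.getChildren(dimOrd);
    if (kids == null) {
      return counts[dimOrd]; // flat dim: nothing to roll up, avoid visiting ords
    }
    int sum = counts[dimOrd];
    for (int child : kids) {
      sum += rollup(child, tree, counts);
    }
    counts[dimOrd] = sum;
    return sum;
  }

  public static void main(String[] args) {
    // Dim ord 0 with three flat children carrying counts 2, 3, 5.
    int[] counts = {0, 2, 3, 5};
    ChildrenSource tree = ord -> ord == 0 ? new int[] {1, 2, 3} : null;
    System.out.println(rollup(0, tree, counts)); // prints 10
  }
}
```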


> Taxonomy tree traversing improvement
> 
>
> Key: LUCENE-5316
> URL: https://issues.apache.org/jira/browse/LUCENE-5316
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Reporter: Gilad Barkai
>Priority: Minor
> Attachments: LUCENE-5316.patch
>
>
> The taxonomy traversing is done today utilizing the 
> {{ParallelTaxonomyArrays}}. In particular, two taxonomy-size {{int}} arrays 
> which hold for each ordinal its (array #1) youngest child and (array #2) 
> older sibling.
> This is a compact way of holding the tree information in memory, but it's not 
> perfect:
> * Large (8 bytes per ordinal in memory)
> * Exposes internal implementation
> * Utilizing these arrays for tree traversing is not straightforward
> * L

Re: [VOTE] Release PyLucene 4.5.1-1

2013-10-30 Thread Michael McCandless
+1 to release.

However, when I ran the last step ("sudo make install") it ended with
this error:

Installed 
/Library/Python/2.7/site-packages/lucene-4.5.1-py2.7-macosx-10.8-x86_64.egg
Processing dependencies for lucene==4.5.1
Searching for lucene==4.5.1
Reading http://pypi.python.org/simple/lucene/
Couldn't find index page for 'lucene' (maybe misspelled?)
Scanning index of all packages (this may take a while)
Reading http://pypi.python.org/simple/
No local packages or download links found for lucene==4.5.1
error: Could not find suitable distribution for
Requirement.parse('lucene==4.5.1')
make: *** [install] Error 1

Yet this seemed not to matter because I was able to run my usual smoke
test (index first 100K wikipedia docs & run a couple searches), and
the lucene.VERSION said 4.5.1.

Mike McCandless

http://blog.mikemccandless.com


On Mon, Oct 28, 2013 at 8:46 PM, Andi Vajda  wrote:
>
> The PyLucene 4.5.1-1 release tracking the recent release of Apache Lucene
> 4.5.1 is ready.
>
> A release candidate is available from:
> http://people.apache.org/~vajda/staging_area/
>
> A list of changes in this release can be seen at:
> http://svn.apache.org/repos/asf/lucene/pylucene/branches/pylucene_4_5/CHANGES
>
> PyLucene 4.5.1 is built with JCC 2.18 included in these release artifacts:
> http://svn.apache.org/repos/asf/lucene/pylucene/trunk/jcc/CHANGES
>
> A list of Lucene Java changes can be seen at:
> http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_4_5_1/lucene/CHANGES.txt
>
> Please vote to release these artifacts as PyLucene 4.5.1-1.
>
> Thanks !
>
> Andi..
>
> ps: the KEYS file for PyLucene release signing is at:
> http://svn.apache.org/repos/asf/lucene/pylucene/dist/KEYS
> http://people.apache.org/~vajda/staging_area/KEYS
>
> pps: here is my +1


[jira] [Updated] (SOLR-5404) Fix solr example config to no longer use deprecated stuff

2013-10-30 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated SOLR-5404:


Summary: Fix solr example config to no longer use deprecated stuff  (was: 
Fix solr config to no longer use deprecated stuff)

> Fix solr example config to no longer use deprecated stuff
> -
>
> Key: SOLR-5404
> URL: https://issues.apache.org/jira/browse/SOLR-5404
> Project: Solr
>  Issue Type: Improvement
>Reporter: Uwe Schindler
>
> After committing SOLR-5401 to branch_4x, I noticed that the example prints 
> the following warnings on startup:
> {noformat}
> 16:09:39 WARN SolrResourceLoader
> Solr loaded a deprecated plugin/analysis class 
> [solr.JsonUpdateRequestHandler]. Please consult documentation how to replace 
> it accordingly.
> 16:09:39 WARN SolrResourceLoader
> Solr loaded a deprecated plugin/analysis class [solr.CSVRequestHandler]. 
> Please consult documentation how to replace it accordingly.
> {noformat}
> We should fix this in the example config.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-5404) Fix solr config to no longer use deprecated stuff

2013-10-30 Thread Uwe Schindler (JIRA)
Uwe Schindler created SOLR-5404:
---

 Summary: Fix solr config to no longer use deprecated stuff
 Key: SOLR-5404
 URL: https://issues.apache.org/jira/browse/SOLR-5404
 Project: Solr
  Issue Type: Improvement
Reporter: Uwe Schindler


After committing SOLR-5401 to branch_4x, I noticed that the example prints the 
following warnings on startup:

{noformat}
16:09:39 WARN SolrResourceLoader
Solr loaded a deprecated plugin/analysis class [solr.JsonUpdateRequestHandler]. 
Please consult documentation how to replace it accordingly.
16:09:39 WARN SolrResourceLoader
Solr loaded a deprecated plugin/analysis class [solr.CSVRequestHandler]. Please 
consult documentation how to replace it accordingly.
{noformat}

We should fix this in the example config.






[jira] [Updated] (SOLR-5308) Split all documents of a route key into another collection

2013-10-30 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-5308:


Attachment: SOLR-5308.patch

Changes:
# Added add request forwarding to target collection. The incoming request on a 
target collection is handled according to the state of the node. If the update 
log is in buffering mode then the request is buffered otherwise the version set 
by the source leader is stripped and leader logic is invoked.
# Added a test with request forwarding

I'm still working on forwarding delete requests, remove routing rules after 
expiry and adding more/better tests.

> Split all documents of a route key into another collection
> --
>
> Key: SOLR-5308
> URL: https://issues.apache.org/jira/browse/SOLR-5308
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Reporter: Shalin Shekhar Mangar
>Assignee: Shalin Shekhar Mangar
> Fix For: 4.6, 5.0
>
> Attachments: SOLR-5308.patch, SOLR-5308.patch, SOLR-5308.patch, 
> SOLR-5308.patch, SOLR-5308.patch
>
>
> Enable SolrCloud users to split out a set of documents from a source 
> collection into another collection.
> This will be useful in multi-tenant environments. This feature will make it 
> possible to split a tenant out of a collection and put them into their own 
> collection which can be scaled separately.






[jira] [Resolved] (SOLR-5401) In Solr's ResourceLoader, add a check for @Deprecated annotation in the plugin/analysis/... class loading code, so we print a warning in the log if a deprecated factory c

2013-10-30 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler resolved SOLR-5401.
-

Resolution: Fixed

> In Solr's ResourceLoader, add a check for @Deprecated annotation in the 
> plugin/analysis/... class loading code, so we print a warning in the log if a 
> deprecated factory class is used
> --
>
> Key: SOLR-5401
> URL: https://issues.apache.org/jira/browse/SOLR-5401
> Project: Solr
>  Issue Type: Improvement
>  Components: Schema and Analysis
>Affects Versions: 3.6, 4.5
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Fix For: 4.6, 5.0
>
> Attachments: SOLR-5401.patch
>
>
> While changing an antique 3.6 schema.xml to Solr 4.5, I noticed that some 
> factories were deprecated in 3.x and were no longer available in 4.x (e.g. 
> "solr._Language_PorterStemFilterFactory"). If the user had been notified 
> earlier, this could have been prevented and the user would have upgraded 
> in time.
> In fact the factories were @Deprecated in 3.6, but the Solr loader does not 
> print any warning. My proposal is to add some simple code to 
> SolrResourceLoader so that it prints a warning about the deprecated class if 
> any configuration setting loads a class carrying the @Deprecated annotation. 
> This way we can prevent the problem in the future.
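For reference, the check described here can be sketched with plain JDK reflection, since @Deprecated is retained at runtime; the nested class and log message below are illustrative stand-ins, not Solr's actual code:

```java
import java.util.logging.Logger;

public class DeprecationCheck {
  private static final Logger log = Logger.getLogger("SolrResourceLoader");

  // Hypothetical stand-in for a factory class that was deprecated in an old release.
  @Deprecated
  static class JsonUpdateRequestHandler {}

  /** Logs a warning and returns true if the freshly loaded plugin class carries @Deprecated. */
  static boolean warnIfDeprecated(Class<?> clazz) {
    if (clazz.isAnnotationPresent(Deprecated.class)) {
      log.warning("Solr loaded a deprecated plugin/analysis class [" + clazz.getName()
          + "]. Please consult documentation how to replace it accordingly.");
      return true;
    }
    return false;
  }

  public static void main(String[] args) {
    System.out.println(warnIfDeprecated(JsonUpdateRequestHandler.class)); // prints true
    System.out.println(warnIfDeprecated(String.class));                   // prints false
  }
}
```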






[jira] [Commented] (SOLR-5381) Split Clusterstate and scale

2013-10-30 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809184#comment-13809184
 ] 

Yago Riveiro commented on SOLR-5381:


{quote}
There will be a separate thread for each external collection
{quote}

If we have 100K collections, does that mean we need 100K threads?

They would be spread across all the machines of the cluster, but that is still too much.

I could be wrong, but if we have 100K collections and only 10% are active at a time, 
we would still need to allocate resources for all 100K threads.

Is it not possible to have a pool with X threads (X can be configurable) that 
handles external collections?
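A pool-based design like the one suggested here can be sketched with plain JDK executors; the class and method names are illustrative, not Solr's actual code:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class CollectionWatcherPool {

  /**
   * Process {@code events} collection state-change events with a shared pool
   * of {@code poolSize} worker threads, instead of one dedicated thread per
   * collection; idle collections never submit work and so cost nothing.
   */
  static int runEvents(int poolSize, int events) throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(poolSize);
    AtomicInteger processed = new AtomicInteger();
    for (int i = 0; i < events; i++) {
      // Stand-in for handling one overseer queue event of an active collection.
      pool.submit(processed::incrementAndGet);
    }
    pool.shutdown();
    pool.awaitTermination(10, TimeUnit.SECONDS);
    return processed.get();
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println(runEvents(4, 100)); // prints 100
  }
}
```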

> Split Clusterstate and scale 
> -
>
> Key: SOLR-5381
> URL: https://issues.apache.org/jira/browse/SOLR-5381
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Reporter: Noble Paul
>Assignee: Noble Paul
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> clusterstate.json is a single point of contention for all components in 
> SolrCloud. It would be hard to scale SolrCloud beyond a few thousand nodes 
> because there are too many updates and too many nodes need to be notified of 
> the changes. As the number of nodes goes up, the size of clusterstate.json keeps 
> growing and it will soon exceed the limit imposed by ZK.
> The first step is to store the shards information in separate nodes and each 
> node can just listen to the shard node it belongs to. We may also need to 
> split each collection into its own node, with clusterstate.json just 
> holding the names of the collections.
> This is an umbrella issue
> This is an umbrella issue






[jira] [Commented] (SOLR-5401) In Solr's ResourceLoader, add a check for @Deprecated annotation in the plugin/analysis/... class loading code, so we print a warning in the log if a deprecated factory

2013-10-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809183#comment-13809183
 ] 

ASF subversion and git services commented on SOLR-5401:
---

Commit 1537122 from [~thetaphi] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1537122 ]

Merged revision(s) 1537119 from lucene/dev/trunk:
SOLR-5401: SolrResourceLoader logs a warning if a deprecated (factory) class is 
used in schema or config

> In Solr's ResourceLoader, add a check for @Deprecated annotation in the 
> plugin/analysis/... class loading code, so we print a warning in the log if a 
> deprecated factory class is used
> --
>
> Key: SOLR-5401
> URL: https://issues.apache.org/jira/browse/SOLR-5401
> Project: Solr
>  Issue Type: Improvement
>  Components: Schema and Analysis
>Affects Versions: 3.6, 4.5
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Fix For: 4.6, 5.0
>
> Attachments: SOLR-5401.patch
>
>
> While changing an antique 3.6 schema.xml to Solr 4.5, I noticed that some 
> factories were deprecated in 3.x and were no longer available in 4.x (e.g. 
> "solr._Language_PorterStemFilterFactory"). If the user had been notified 
> earlier, this could have been prevented and the user would have upgraded 
> in time.
> In fact the factories were @Deprecated in 3.6, but the Solr loader does not 
> print any warning. My proposal is to add some simple code to 
> SolrResourceLoader so that it prints a warning about the deprecated class if 
> any configuration setting loads a class carrying the @Deprecated annotation. 
> This way we can prevent the problem in the future.






[jira] [Commented] (SOLR-5401) In Solr's ResourceLoader, add a check for @Deprecated annotation in the plugin/analysis/... class loading code, so we print a warning in the log if a deprecated factory

2013-10-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809177#comment-13809177
 ] 

ASF subversion and git services commented on SOLR-5401:
---

Commit 1537119 from [~thetaphi] in branch 'dev/trunk'
[ https://svn.apache.org/r1537119 ]

SOLR-5401: SolrResourceLoader logs a warning if a deprecated (factory) class is 
used in schema or config

> In Solr's ResourceLoader, add a check for @Deprecated annotation in the 
> plugin/analysis/... class loading code, so we print a warning in the log if a 
> deprecated factory class is used
> --
>
> Key: SOLR-5401
> URL: https://issues.apache.org/jira/browse/SOLR-5401
> Project: Solr
>  Issue Type: Improvement
>  Components: Schema and Analysis
>Affects Versions: 3.6, 4.5
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Fix For: 4.6, 5.0
>
> Attachments: SOLR-5401.patch
>
>
> While changing an antique 3.6 schema.xml to Solr 4.5, I noticed that some 
> factories were deprecated in 3.x and were no longer available in 4.x (e.g. 
> "solr._Language_PorterStemFilterFactory"). If the user had been notified 
> earlier, this could have been prevented and the user would have upgraded 
> in time.
> In fact the factories were @Deprecated in 3.6, but the Solr loader does not 
> print any warning. My proposal is to add some simple code to 
> SolrResourceLoader so that it prints a warning about the deprecated class if 
> any configuration setting loads a class carrying the @Deprecated annotation. 
> This way we can prevent the problem in the future.






[jira] [Commented] (LUCENE-5296) Add DirectDocValuesFormat

2013-10-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809150#comment-13809150
 ] 

ASF subversion and git services commented on LUCENE-5296:
-

Commit 1537108 from [~mikemccand] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1537108 ]

LUCENE-5296: add DirectDocValuesFormat

> Add DirectDocValuesFormat
> -
>
> Key: LUCENE-5296
> URL: https://issues.apache.org/jira/browse/LUCENE-5296
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 4.6, 5.0
>
> Attachments: LUCENE-5296.patch
>
>
> Indexes values to disk but at search time it loads/accesses the values via 
> simple java arrays (i.e. no compression).






[jira] [Resolved] (LUCENE-5296) Add DirectDocValuesFormat

2013-10-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-5296.


Resolution: Fixed

Thanks Adrien!

> Add DirectDocValuesFormat
> -
>
> Key: LUCENE-5296
> URL: https://issues.apache.org/jira/browse/LUCENE-5296
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 4.6, 5.0
>
> Attachments: LUCENE-5296.patch
>
>
> Indexes values to disk but at search time it loads/accesses the values via 
> simple java arrays (i.e. no compression).






[jira] [Commented] (LUCENE-5296) Add DirectDocValuesFormat

2013-10-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809140#comment-13809140
 ] 

ASF subversion and git services commented on LUCENE-5296:
-

Commit 1537105 from [~mikemccand] in branch 'dev/trunk'
[ https://svn.apache.org/r1537105 ]

LUCENE-5296: add DirectDocValuesFormat

> Add DirectDocValuesFormat
> -
>
> Key: LUCENE-5296
> URL: https://issues.apache.org/jira/browse/LUCENE-5296
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 4.6, 5.0
>
> Attachments: LUCENE-5296.patch
>
>
> Indexes values to disk but at search time it loads/accesses the values via 
> simple java arrays (i.e. no compression).






[jira] [Commented] (LUCENE-5313) Add "preservePositionIncrements" to AnalyzingSuggester and FuzzySuggester constructors

2013-10-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809138#comment-13809138
 ] 

ASF subversion and git services commented on LUCENE-5313:
-

Commit 1537104 from [~mikemccand] in branch 'dev/trunk'
[ https://svn.apache.org/r1537104 ]

LUCENE-5313: leave the default to true for enablePositionIncrements

> Add "preservePositionIncrements" to AnalyzingSuggester and FuzzySuggester 
> constructors
> --
>
> Key: LUCENE-5313
> URL: https://issues.apache.org/jira/browse/LUCENE-5313
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Areek Zillur
> Fix For: 4.6, 5.0
>
> Attachments: LUCENE-5313.patch, LUCENE-5313.patch, LUCENE-5313.patch
>
>
> It would be convenient to have "preservePositionIncrements" in the suggesters 
> constructor, rather than having a setPreservePositionIncrements method. That 
> way it could be nicely used with the factory model already used by Solr.






[jira] [Commented] (LUCENE-5313) Add "preservePositionIncrements" to AnalyzingSuggester and FuzzySuggester constructors

2013-10-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809135#comment-13809135
 ] 

ASF subversion and git services commented on LUCENE-5313:
-

Commit 1537102 from [~mikemccand] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1537102 ]

LUCENE-5313: leave the default to true for enablePositionIncrements

> Add "preservePositionIncrements" to AnalyzingSuggester and FuzzySuggester 
> constructors
> --
>
> Key: LUCENE-5313
> URL: https://issues.apache.org/jira/browse/LUCENE-5313
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Areek Zillur
> Fix For: 4.6, 5.0
>
> Attachments: LUCENE-5313.patch, LUCENE-5313.patch, LUCENE-5313.patch
>
>
> It would be convenient to have "preservePositionIncrements" in the suggesters 
> constructor, rather than having a setPreservePositionIncrements method. That 
> way it could be nicely used with the factory model already used by Solr.






[jira] [Updated] (LUCENE-5316) Taxonomy tree traversing improvement

2013-10-30 Thread Gilad Barkai (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gilad Barkai updated LUCENE-5316:
-

Attachment: LUCENE-5316.patch

{{TaxonomyReader.getParallelTaxonomyArrays}} is now protected; the 
implementation is only on {{DirectoryTaxonomyReader}}, in which it is protected 
as well.

Parallel arrays are only used in tests ATM - all tree traversing is done using 
{{TaxonomyReader.getChildren(int ordinal)}}, which is now abstract and 
implemented in DirTaxoReader.

Mike, if you could please run this patch against the benchmarking machine, it 
would be awesome - as the direct array access is now replaced with a method 
call (the iterator's {{.next()}}).

I hope we will not see any significant degradation.

> Taxonomy tree traversing improvement
> 
>
> Key: LUCENE-5316
> URL: https://issues.apache.org/jira/browse/LUCENE-5316
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Reporter: Gilad Barkai
>Priority: Minor
> Attachments: LUCENE-5316.patch
>
>
> The taxonomy traversing is done today utilizing the 
> {{ParallelTaxonomyArrays}}. In particular, two taxonomy-size {{int}} arrays 
> which hold for each ordinal its (array #1) youngest child and (array #2) 
> older sibling.
> This is a compact way of holding the tree information in memory, but it's not 
> perfect:
> * Large (8 bytes per ordinal in memory)
> * Exposes internal implementation
> * Utilizing these arrays for tree traversing is not straightforward
> * Loses reference locality while traversing (the array is accessed at 
> increasing-only entries, but they may be distant from one another)
> * In NRT, a reopen is always (not just worst case) done at O(taxonomy-size)
> This issue is about making the traversing easier, the code more readable, 
> and opening it up for future improvements (i.e. memory footprint and NRT cost) 
> without changing any of the internals. 
> A later issue(s?) could be opened to address the gaps once this one is done.
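To make the parallel-array layout concrete, here is a minimal, self-contained sketch (names assumed, not Lucene's actual API) of enumerating an ordinal's children from the youngest-child and older-sibling arrays:

```java
import java.util.ArrayList;
import java.util.List;

public class TaxonomyArraysDemo {
  static final int INVALID = -1; // sentinel for "no child" / "no sibling"

  /** Enumerate the children of {@code ord} from the two parallel arrays. */
  static List<Integer> children(int ord, int[] youngestChild, int[] olderSibling) {
    List<Integer> out = new ArrayList<>();
    // Start at the youngest child, then walk the sibling chain toward older children.
    for (int child = youngestChild[ord]; child != INVALID; child = olderSibling[child]) {
      out.add(child);
    }
    return out;
  }

  public static void main(String[] args) {
    // Tiny taxonomy: root(0) has children 1 and 2 (2 added last); node 1 has child 3.
    int[] youngestChild = {2, 3, INVALID, INVALID};
    int[] olderSibling  = {INVALID, INVALID, 1, INVALID};
    System.out.println(children(0, youngestChild, olderSibling)); // prints [2, 1]
    System.out.println(children(1, youngestChild, olderSibling)); // prints [3]
  }
}
```

Note that the traversal touches entries in increasing-ordinal order only, but possibly far apart, which is the locality concern listed above.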






Re: [JENKINS] Lucene-Solr-4.x-MacOSX (64bit/jdk1.6.0) - Build # 938 - Still Failing!

2013-10-30 Thread Michael McCandless
I'll dig.

Mike McCandless

http://blog.mikemccandless.com


On Wed, Oct 30, 2013 at 8:39 AM, Policeman Jenkins Server
 wrote:
> Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-MacOSX/938/
> Java: 64bit/jdk1.6.0 -XX:-UseCompressedOops -XX:+UseConcMarkSweepGC
>
> 1 tests failed.
> REGRESSION:  
> org.apache.lucene.search.suggest.analyzing.AnalyzingSuggesterTest.testRandom
>
> Error Message:
> expected:<1> but was:<4>
>
> Stack Trace:
> java.lang.AssertionError: expected:<1> but was:<4>
> at 
> __randomizedtesting.SeedInfo.seed([DBDDDABE3818D9BD:A991FFB189786FCE]:0)
> at org.junit.Assert.fail(Assert.java:93)
> at org.junit.Assert.failNotEquals(Assert.java:647)
> at org.junit.Assert.assertEquals(Assert.java:128)
> at org.junit.Assert.assertEquals(Assert.java:472)
> at org.junit.Assert.assertEquals(Assert.java:456)
> at 
> org.apache.lucene.search.suggest.analyzing.AnalyzingSuggesterTest.testRandom(AnalyzingSuggesterTest.java:862)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
> at 
> org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
> at 
> org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
> at 
> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
> at 
> com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
> at 
> org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
> at 
> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
> at 
> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
> at 
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at 
> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
> at 
> com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
> at 
> com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
> at 
> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
> at 
> org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
> at 
> com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
> at 
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
> at 
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
> at 
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at 
> org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
> at 
> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
> at 
> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
> at 
> org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
> at 
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at 
> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
> at java.lang.Thread.run(Thr

[jira] [Commented] (SOLR-5402) SolrCloud 4.5 bulk add errors in cloud setup

2013-10-30 Thread Michael Tracey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809099#comment-13809099
 ] 

Michael Tracey commented on SOLR-5402:
--

I can confirm this issue on Jetty with Solr 4.5.1, both on the bundled 8.x 
version from the example directory and on an upgraded Jetty 9.0.6. I believe 
the issue is independent of the container.

> SolrCloud 4.5 bulk add errors in cloud setup
> 
>
> Key: SOLR-5402
> URL: https://issues.apache.org/jira/browse/SOLR-5402
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 4.5, 4.5.1
>Reporter: Sai Gadde
> Fix For: 4.6
>
>
> We use out-of-the-box Solr 4.5.1 with no customization. If we merge documents 
> via SolrJ to a single server, it works perfectly fine.
> But as soon as we add another node to the cloud, we get the following error 
> while merging documents. We merge about 500 at a time using SolrJ. These 500 
> documents are a few MB (1-3) in total size.
> This is the error we are getting on the server where merging happens 
> (10.10.10.116 - the IP is irrelevant, shown just for clarity). 10.10.10.119 is 
> the new node here. This server gets a RemoteSolrException:
> shard update error StdNode: 
> http://10.10.10.119:8980/solr/mycore/:org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:
>  Illegal to have multiple roots (start tag in epilog?).
>  at [row,col {unknown-source}]: [1,12468]
>   at 
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:425)
>   at 
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
>   at 
> org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:401)
>   at 
> org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:1)
>   at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
>   at java.util.concurrent.FutureTask.run(Unknown Source)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
>   at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
>   at java.util.concurrent.FutureTask.run(Unknown Source)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown 
> Source)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>   at java.lang.Thread.run(Unknown Source)
> On the other server 10.10.10.119 we get following error
> org.apache.solr.common.SolrException: Illegal to have multiple roots (start 
> tag in epilog?).
>  at [row,col {unknown-source}]: [1,12468]
>   at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:176)
>   at 
> org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
>   at 
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
>   at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:703)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:406)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195)
>   at 
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
>   at 
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
>   at 
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
>   at 
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
>   at 
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
>   at 
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
>   at 
> org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:936)
>   at 
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
>   at 
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
>   at 
> org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1004)
>   at 
> org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
>   at 
> org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> Caused by: com.ctc.wstx.exc.WstxParsingException: Illegal 

[jira] [Commented] (SOLR-5381) Split Clusterstate and scale

2013-10-30 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809096#comment-13809096
 ] 

Noble Paul commented on SOLR-5381:
--

OK ,
here is the plan to split clusterstate on a per collection basis

h2. How to use this feature?
Introduce a new option while creating a collection (external=true). This will 
keep the state of the collection in a separate ZK node.
Example:

http://localhost:8983/solr/admin/collections?action=CREATE&name=xcoll&numShards=5&replicationFactor=2&external=true

This will result in the following entry in clusterstate.json:
{code:JavaScript}
{
 "xcoll" : {"ex":true}
}
{code}
There will be another ZK entry which carries the actual collection information:
*  /collections
** /xcoll
*** /state.json
{code:JavaScript}
{"xcoll":{
  "shards":{"shard1":{
    "range":"8000-b332",
    "state":"active",
    "replicas":{
      "core_node1":{
        "state":"active",
        "base_url":"http://192.168.1.5:8983/solr",
        "core":"xcoll_shard1_replica1",
        "node_name":"192.168.1.5:8983_solr",
        "leader":"true"}}}},
  "router":{"name":"compositeId"}}}
{code}

The main Overseer thread is responsible for creating collections and managing 
all the events for all the collections in clusterstate.json. 
clusterstate.json is modified only when a collection is created/deleted or when 
state updates happen to "non-external" collections.

Each external collection will have its own Overseer queue, as follows. There 
will be a separate thread for each external collection.

* /collections
** /xcoll
*** /overseer
**** /collection-queue-work
**** /queue
**** /queue-work


h2. SolrJ enhancements
SolrJ would only listen to clusterstate.json. When a request comes for a 
collection 'xcoll':
* it would first check if such a collection exists
* If yes, it first looks up the details in the local cache for that collection
* If not found in the cache, it fetches the node /collections/xcoll/state.json 
and caches the information
* Any query/update will be sent with an extra query param specifying the 
collection name, shard name and range (example: 
\_target_=xcoll:shard1:8000-b332). A node would throw an error 
(INVALID_NODE) if it does not serve the collection/shard/range combo.
* If SolrJ gets an INVALID_NODE error, it would invalidate the cache and fetch 
fresh state information for that collection (and cache it again).
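The lookup/invalidate cycle above can be sketched as follows. This is a minimal, hypothetical illustration: {{CollectionStateCache}}, {{Fetcher}}, and the string state blob are stand-ins, not actual SolrJ classes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of the per-collection state cache described above. */
public class CollectionStateCache {

    /** Stand-in for a ZK read of /collections/{coll}/state.json. */
    public interface Fetcher {
        String fetch(String collection);
    }

    // collection name -> cached state (stand-in for the parsed state.json)
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Fetcher fetcher;

    public CollectionStateCache(Fetcher fetcher) {
        this.fetcher = fetcher;
    }

    /** Check the local cache first; on a miss, fetch state.json and cache it. */
    public String getState(String collection) {
        return cache.computeIfAbsent(collection, fetcher::fetch);
    }

    /** On an INVALID_NODE error: drop the stale entry and fetch fresh state. */
    public String onInvalidNode(String collection) {
        cache.remove(collection);
        return getState(collection);
    }

    /** Build the extra query param value, e.g. xcoll:shard1:8000-b332. */
    public static String targetParam(String coll, String shard, String range) {
        return coll + ":" + shard + ":" + range;
    }
}
```

The key property is that a stale cache never causes a wrong answer: the serving node rejects a mismatched \_target_ param, and the client recovers by re-fetching.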

h2. Changes to each Solr Node
Each node would only listen to clusterstate.json and the states of the 
collections of which it is a member. If a request comes for a collection it 
does not serve, it first checks for the \_target_ param. All collections 
present in clusterstate.json will be deemed collections the node serves.
* If the param is present and the node does not serve that 
collection/shard/range combo, an INVALID_NODE error is thrown
** If the validation succeeds, the request is served
* If the param is not present and the node is a member of the collection, the 
request is served
** If the node is not a member of the collection, it uses SolrJ to proxy the 
request to the appropriate location

Internally, the node really does not care about the state of external 
collections. If/when it is required, the information is fetched in real time 
from ZK, used, and thrown away.
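The routing rules above can be condensed into a small decision function. This is a hypothetical sketch; the enum and parameter names are illustrative, not actual Solr code.

```java
/** Hypothetical sketch of the node-side routing decision described above. */
public class RequestRouter {

    public enum Action { SERVE, INVALID_NODE, PROXY }

    /**
     * @param hasTargetParam whether the request carries the _target_ param
     * @param servesCombo    whether this node serves that collection/shard/range combo
     * @param memberOfColl   whether this node hosts a replica of the collection
     */
    public static Action route(boolean hasTargetParam, boolean servesCombo,
                               boolean memberOfColl) {
        if (hasTargetParam) {
            // Param present: validate it; reject with INVALID_NODE on a mismatch.
            return servesCombo ? Action.SERVE : Action.INVALID_NODE;
        }
        // Param absent: serve locally when a member, otherwise proxy via SolrJ.
        return memberOfColl ? Action.SERVE : Action.PROXY;
    }
}
```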

h2. Changes to admin GUI
External collections are not shown graphically in the admin UI.



> Split Clusterstate and scale 
> -
>
> Key: SOLR-5381
> URL: https://issues.apache.org/jira/browse/SOLR-5381
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Reporter: Noble Paul
>Assignee: Noble Paul
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> clusterstate.json is a single point of contention for all components in 
> SolrCloud. It would be hard to scale SolrCloud beyond a few thousand nodes 
> because there are too many updates and too many nodes need to be notified of 
> the changes. As the number of nodes goes up, the size of clusterstate.json 
> keeps growing and it will soon exceed the limit imposed by ZK.
> The first step is to store the shards information in separate nodes, and each 
> node can just listen to the shard node it belongs to. We may also need to 
> split each collection into its own node, with clusterstate.json just 
> holding the names of the collections.
> This is an umbrella issue



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-5403) Deduplicate multi-valued fields during atomic updates

2013-10-30 Thread Anshum Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anshum Gupta resolved SOLR-5403.


Resolution: Invalid

Thanks to Jack for pointing this out.
It seems this can already be done using the UniqFieldsUpdateProcessorFactory:

http://lucene.eu.apache.org/solr/4_4_0/solr-core/org/apache/solr/update/processor/UniqFieldsUpdateProcessorFactory.html

Marking this as resolved.
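For reference, a hypothetical solrconfig.xml chain using this factory might look like the following. The `fields` parameter reflects the 4.4-era factory linked above; the exact parameter syntax varies across 4.x versions, so treat this as illustrative only, and the field name `tags` is made up.

```xml
<updateRequestProcessorChain name="uniq-fields">
  <processor class="solr.UniqFieldsUpdateProcessorFactory">
    <!-- hypothetical field list: the multi-valued fields to de-duplicate -->
    <lst name="fields">
      <str>tags</str>
    </lst>
  </processor>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```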

> Deduplicate multi-valued fields during atomic updates
> -
>
> Key: SOLR-5403
> URL: https://issues.apache.org/jira/browse/SOLR-5403
> Project: Solr
>  Issue Type: New Feature
>Reporter: Anshum Gupta
>
> I think it'll be good to have a processor which de-duplicates multi-valued 
> fields during atomic updates.
> It might make sense to just have it in the current flow, actually.
> More context: 
> http://lucene.472066.n3.nabble.com/Atomic-Updates-in-SOLR-td4098399.html






[JENKINS] Lucene-Solr-4.x-MacOSX (64bit/jdk1.6.0) - Build # 938 - Still Failing!

2013-10-30 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-MacOSX/938/
Java: 64bit/jdk1.6.0 -XX:-UseCompressedOops -XX:+UseConcMarkSweepGC

1 tests failed.
REGRESSION:  
org.apache.lucene.search.suggest.analyzing.AnalyzingSuggesterTest.testRandom

Error Message:
expected:<1> but was:<4>

Stack Trace:
java.lang.AssertionError: expected:<1> but was:<4>
at 
__randomizedtesting.SeedInfo.seed([DBDDDABE3818D9BD:A991FFB189786FCE]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at org.junit.Assert.assertEquals(Assert.java:472)
at org.junit.Assert.assertEquals(Assert.java:456)
at 
org.apache.lucene.search.suggest.analyzing.AnalyzingSuggesterTest.testRandom(AnalyzingSuggesterTest.java:862)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at java.lang.Thread.run(Thread.java:695)




Build Log:
[...truncated 8459 lines...]
   [junit4] Suite: 
org.apache.lucene.search.suggest.analyzing.AnalyzingSuggesterTest
   [junit4]   2> NOTE: reproduce with: ant test  
-Dtestcase=AnalyzingSuggesterTest -Dtests.method=testRandom 
-Dtests.seed=DBDDDABE3818D9BD -Dtests.slow=true -Dtes

[jira] [Commented] (SOLR-5311) Avoid registering replicas which are removed

2013-10-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809016#comment-13809016
 ] 

ASF subversion and git services commented on SOLR-5311:
---

Commit 1537061 from [~noble.paul] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1537061 ]

SOLR-5311 tests were failing intermittently

> Avoid registering replicas which are removed 
> -
>
> Key: SOLR-5311
> URL: https://issues.apache.org/jira/browse/SOLR-5311
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Reporter: Noble Paul
>Assignee: Noble Paul
> Fix For: 4.6, 5.0
>
> Attachments: SOLR-5311.patch, SOLR-5311.patch, SOLR-5311.patch, 
> SOLR-5311.patch, SOLR-5311.patch, SOLR-5311.patch, SOLR-5311.patch
>
>
> If a replica is removed from the clusterstate and if it comes back up it 
> should not be allowed to register. 
> Each core, when it comes up, checks whether it was already registered and, if 
> yes, whether it is still there. If not, it throws an error and unregisters. If 
> such a request comes to the overseer, it should ignore such a core.






[jira] [Commented] (SOLR-5084) new field type - EnumField

2013-10-30 Thread Elran Dvir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809013#comment-13809013
 ] 

Elran Dvir commented on SOLR-5084:
--

Hi Erick,

I have added the necessary JavaDocs. Now 'ant precommit' finishes successfully.
I have attached an updated patch.
You can ignore the 4.x version. It's only here because it's the version I 
started with.

Thank you very much!

> new field type - EnumField
> --
>
> Key: SOLR-5084
> URL: https://issues.apache.org/jira/browse/SOLR-5084
> Project: Solr
>  Issue Type: New Feature
>Reporter: Elran Dvir
>Assignee: Erick Erickson
> Attachments: enumsConfig.xml, schema_example.xml, Solr-5084.patch, 
> Solr-5084.patch, Solr-5084.patch, Solr-5084.patch, Solr-5084.trunk.patch, 
> Solr-5084.trunk.patch, Solr-5084.trunk.patch, Solr-5084.trunk.patch, 
> Solr-5084.trunk.patch
>
>
> We have encountered a use case in our system where we have a few fields 
> (Severity, Risk, etc.) with a closed set of values, where the sort order for 
> these values is pre-determined but not lexicographic (Critical is higher than 
> High). Generically, this is very close to how enums work.
> To implement, I have prototyped a new type of field: EnumField, where the 
> inputs are a closed, predefined set of strings in a special configuration 
> file (similar to currency.xml).
> The code is based on 4.2.1.






[jira] [Commented] (SOLR-5311) Avoid registering replicas which are removed

2013-10-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809015#comment-13809015
 ] 

ASF subversion and git services commented on SOLR-5311:
---

Commit 1537060 from [~noble.paul] in branch 'dev/trunk'
[ https://svn.apache.org/r1537060 ]

SOLR-5311 tests were failing intermittently

> Avoid registering replicas which are removed 
> -
>
> Key: SOLR-5311
> URL: https://issues.apache.org/jira/browse/SOLR-5311
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Reporter: Noble Paul
>Assignee: Noble Paul
> Fix For: 4.6, 5.0
>
> Attachments: SOLR-5311.patch, SOLR-5311.patch, SOLR-5311.patch, 
> SOLR-5311.patch, SOLR-5311.patch, SOLR-5311.patch, SOLR-5311.patch
>
>
> If a replica is removed from the clusterstate and if it comes back up it 
> should not be allowed to register. 
> Each core, when it comes up, checks whether it was already registered and, if 
> yes, whether it is still there. If not, it throws an error and unregisters. If 
> such a request comes to the overseer, it should ignore such a core.






[jira] [Updated] (SOLR-5084) new field type - EnumField

2013-10-30 Thread Elran Dvir (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elran Dvir updated SOLR-5084:
-

Attachment: Solr-5084.trunk.patch

> new field type - EnumField
> --
>
> Key: SOLR-5084
> URL: https://issues.apache.org/jira/browse/SOLR-5084
> Project: Solr
>  Issue Type: New Feature
>Reporter: Elran Dvir
>Assignee: Erick Erickson
> Attachments: enumsConfig.xml, schema_example.xml, Solr-5084.patch, 
> Solr-5084.patch, Solr-5084.patch, Solr-5084.patch, Solr-5084.trunk.patch, 
> Solr-5084.trunk.patch, Solr-5084.trunk.patch, Solr-5084.trunk.patch, 
> Solr-5084.trunk.patch
>
>
> We have encountered a use case in our system where we have a few fields 
> (Severity, Risk, etc.) with a closed set of values, where the sort order for 
> these values is pre-determined but not lexicographic (Critical is higher than 
> High). Generically, this is very close to how enums work.
> To implement, I have prototyped a new type of field: EnumField, where the 
> inputs are a closed, predefined set of strings in a special configuration 
> file (similar to currency.xml).
> The code is based on 4.2.1.






RE: Compilation of JCC fails: cannot find -lpython2.7

2013-10-30 Thread Filip Nollet
Hello Andi


Added
  '-L/tools/general/app/python-2.7.5-rhel6/lib',
to the LFLAGS param for linux2/x86_64. Now it compiled.

Thanks for your support!


Best regards,

Filip Nollet


-Original Message-
From: Andi Vajda [mailto:va...@apache.org] 
Sent: 29 October 2013 20:00
To: pylucene-...@lucene.apache.org
Subject: Re: Compilation of JCC fails: cannot find -lpython2.7


> On Oct 29, 2013, at 15:09, Filip Nollet  wrote:
> 
> Hi all
> 
> I have a problem compiling JCC for pylucene.
> It does not find the shared Python library, while this seems to be available.
> 
> The error is printed below.
> 
> My environment set the LD_LIBRARY_PATH for Python 2.7:
> #  set | grep LD_
> LD_LIBRARY_PATH=/tools/general/app/python-2.7.5-rhel6/lib:/tools/general/app/sqlite-3.8.0.2/lib
> 
> Also, the contents of my Python setup seems fine:
> # ll /tools/general/app/python-2.7.5-rhel6/lib
> total 16512
> -r-xr-xr-x   1 root root 10355780 Oct 16 12:57 libpython2.7.a
> lrwxrwxrwx   1 root root   19 Oct 24 17:27 libpython2.7.so -> 
> libpython2.7.so.1.0
> -rwxr-xr-x+  1 root root  6053659 Oct 24 17:26 libpython2.7.so.1.0
> drwxr-xr-x+  2 root root65536 Oct 16 12:57 pkgconfig
> drwxr-xr-x+ 27 root root65536 Oct 16 12:57 python2.7
> 
> 
> Any ideas for this? Thanks in advance

Did you edit jcc's setup.py to reflect your environment?
Once done, you shouldn't need LD_LIBRARY_PATH either.

Andi..

> 
> 
> Regards,
> 
> 
> Filip
> 
> 
> -
> 
> 
> Applied shared mode monkey patch to:  '/tools/general/app/python-2.7.5-rhel6/lib/python2.7/site-packages/setuptools-1.1.6-py2.7.egg/setuptools/__init__.pyc'>
> Loading source files for package org.apache.jcc...
> Constructing Javadoc information...
> Standard Doclet version 1.6.0_39
> Building tree for all the packages and classes...
> Generating javadoc/org/apache/jcc//PythonException.html...
> Generating javadoc/org/apache/jcc//PythonVM.html...
> Generating javadoc/org/apache/jcc//package-frame.html...
> Generating javadoc/org/apache/jcc//package-summary.html...
> Generating javadoc/org/apache/jcc//package-tree.html...
> Generating javadoc/constant-values.html...
> Generating javadoc/serialized-form.html...
> Building index for all the packages and classes...
> Generating javadoc/overview-tree.html...
> Generating javadoc/index-all.html...
> Generating javadoc/deprecated-list.html...
> Building index for all classes...
> Generating javadoc/allclasses-frame.html...
> Generating javadoc/allclasses-noframe.html...
> Generating javadoc/index.html...
> Generating javadoc/help-doc.html...
> Generating javadoc/stylesheet.css...
> running build
> running build_py
> writing /root/tmp/pylucene-4.4.0-1/jcc/jcc/config.py
> copying jcc/config.py -> build/lib.linux-x86_64-2.7/jcc
> copying jcc/classes/org/apache/jcc/PythonVM.class -> 
> build/lib.linux-x86_64-2.7/jcc/classes/org/apache/jcc
> copying jcc/classes/org/apache/jcc/PythonException.class -> 
> build/lib.linux-x86_64-2.7/jcc/classes/org/apache/jcc
> running build_ext
> building 'jcc' extension
> gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall 
> -Wstrict-prototypes -fPIC -D_jcc_lib -DJCC_VER="2.17" 
> -I/usr/java/jdk1.6.0_39//include -I/usr/java/jdk1.6.0_39//include/linux 
> -I_jcc -Ijcc/sources 
> -I/tools/general/app/python-2.7.5-rhel6/include/python2.7 -c 
> jcc/sources/jcc.cpp -o build/temp.linux-x86_64-2.7/jcc/sources/jcc.o -DPYTHON 
> -fno-strict-aliasing -Wno-write-strings
> cc1plus: warning: command line option "-Wstrict-prototypes" is valid for 
> Ada/C/ObjC but not for C++
> In file included from 
> /tools/general/app/python-2.7.5-rhel6/include/python2.7/Python.h:8,
> from jcc/sources/jcc.cpp:24:
> /tools/general/app/python-2.7.5-rhel6/include/python2.7/pyconfig.h:1173:1: 
> warning: "_POSIX_C_SOURCE" redefined
> In file included from /usr/include/stdio.h:28,
> from jcc/sources/jcc.cpp:15:
> /usr/include/features.h:162:1: warning: this is the location of the previous 
> definition
> In file included from 
> /tools/general/app/python-2.7.5-rhel6/include/python2.7/Python.h:8,
> from jcc/sources/jcc.cpp:24:
> /tools/general/app/python-2.7.5-rhel6/include/python2.7/pyconfig.h:1195:1: 
> warning: "_XOPEN_SOURCE" redefined
> In file included from /usr/include/stdio.h:28,
> from jcc/sources/jcc.cpp:15:
> /usr/include/features.h:164:1: warning: this is the location of the previous 
> definition
> gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall 
> -Wstrict-prototypes -fPIC -D_jcc_lib -DJCC_VER="2.17" 
> -I/usr/java/jdk1.6.0_39//include -I/usr/java/jdk1.6.0_39//include/linux 
> -I_jcc -Ijcc/sources 
> -I/tools/general/app/python-2.7.5-rhel6/include/python2.7 -c 
> jcc/sources/JCCEnv.cpp -o build/temp.linux-x86_64-2.7/jcc/sources/JCCEnv.o 
> -DPYTHON -fno-strict-aliasing -Wno-write-strings
> cc1plus: warning: command line o

[jira] [Commented] (LUCENE-5310) Merge Threads unnecessarily block on SerialMergeScheduler

2013-10-30 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13808981#comment-13808981
 ] 

Michael McCandless commented on LUCENE-5310:


{quote}
bq. CMS, also, needs to see all threads, because it (by design) stalls incoming 
index threads when there are too many merges running.

that is another bug IMO. It's not the concern of the MS to block threads since 
it might not even see all of them? 
{quote}

I think that's an important feature, not a bug: it prevents merge
starvation.

Ie, if the indexing threads are producing segments faster than merging
can merge them then those indexing threads should be stalled so
merging can catch up.

It's true it won't see all threads that enter IW, but it will see the
threads that are responsible for making new segments, and those are
the ones that need to be paused when merging is falling behind.

{quote}
Really, I mean this makes no sense at all. Ie. you have a merge running and you 
index that means you will block all threads that might trigger a merge until 
all pending merges are done? I think this is a bug! Really this interface says 
only that is will do no merges concurrently that's it. You wanna use calling 
threads for merging use SMS you don't wanna do that use CMS.
{quote}

Maybe we could improve SMS so that it wouldn't block the
segment-producing threads until a 2nd merge is queued?  (Today, it
blocks the producing threads as soon as another segment is written and
a merge is still running).
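The tryLock idea discussed in this issue could look roughly like the sketch below. This is hypothetical: {{MergeSource}} stands in for IndexWriter's pending-merge queue, and this is not the actual SerialMergeScheduler code. A producing thread that loses the tryLock returns immediately instead of blocking; the thread holding the lock drains the queue, so merges still run one at a time and none are skipped.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical non-blocking variant of a serial merge scheduler. */
public class NonBlockingSerialScheduler {

    private final ReentrantLock mergeLock = new ReentrantLock();

    /** Stand-in for the pending-merge queue; returns null when nothing is pending. */
    public interface MergeSource {
        Runnable nextMerge();
    }

    public void maybeMerge(MergeSource source) {
        if (!mergeLock.tryLock()) {
            return; // another thread is already draining the merge queue
        }
        try {
            Runnable merge;
            while ((merge = source.nextMerge()) != null) {
                merge.run(); // merges still execute strictly one at a time
            }
        } finally {
            mergeLock.unlock();
        }
    }
}
```

Note this sketch deliberately gives up back-pressure: unlike the blocking version, it never stalls segment-producing threads, which is exactly the starvation trade-off debated above.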


> Merge Threads unnecessarily block on SerialMergeScheduler
> -
>
> Key: LUCENE-5310
> URL: https://issues.apache.org/jira/browse/LUCENE-5310
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/index
>Affects Versions: 4.5, 5.0
>Reporter: Simon Willnauer
>Priority: Minor
> Fix For: 4.6, 5.0
>
> Attachments: LUCENE-5310.patch, LUCENE-5310.patch
>
>
> I have been working on a high-level merge multiplexer that shares threads 
> across different IW instances, and I came across the fact that 
> SerialMergeScheduler actually blocks incoming threads if a merge is going on. 
> Yet this blocks threads unnecessarily, since we pull the merges in a loop 
> anyway. We should use a tryLock operation instead of syncing the entire 
> method?






[jira] [Commented] (LUCENE-5313) Add "preservePositionIncrements" to AnalyzingSuggester and FuzzySuggester constructors

2013-10-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13808971#comment-13808971
 ] 

ASF subversion and git services commented on LUCENE-5313:
-

Commit 1537039 from [~mikemccand] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1537039 ]

LUCENE-5313: move preservePositionIncrements from setter to ctor

> Add "preservePositionIncrements" to AnalyzingSuggester and FuzzySuggester 
> constructors
> --
>
> Key: LUCENE-5313
> URL: https://issues.apache.org/jira/browse/LUCENE-5313
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Areek Zillur
> Fix For: 4.6, 5.0
>
> Attachments: LUCENE-5313.patch, LUCENE-5313.patch, LUCENE-5313.patch
>
>
> It would be convenient to have "preservePositionIncrements" in the suggesters 
> constructor, rather than having a setPreservePositionIncrements method. That 
> way it could be nicely used with the factory model already used by Solr.






[jira] [Resolved] (LUCENE-5313) Add "preservePositionIncrements" to AnalyzingSuggester and FuzzySuggester constructors

2013-10-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-5313.


   Resolution: Fixed
Fix Version/s: 5.0
   4.6

Thanks Areek!

> Add "preservePositionIncrements" to AnalyzingSuggester and FuzzySuggester 
> constructors
> --
>
> Key: LUCENE-5313
> URL: https://issues.apache.org/jira/browse/LUCENE-5313
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Areek Zillur
> Fix For: 4.6, 5.0
>
> Attachments: LUCENE-5313.patch, LUCENE-5313.patch, LUCENE-5313.patch
>
>
> It would be convenient to have "preservePositionIncrements" in the suggesters 
> constructor, rather than having a setPreservePositionIncrements method. That 
> way it could be nicely used with the factory model already used by Solr.






[jira] [Commented] (LUCENE-5313) Add "preservePositionIncrements" to AnalyzingSuggester and FuzzySuggester constructors

2013-10-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13808970#comment-13808970
 ] 

ASF subversion and git services commented on LUCENE-5313:
-

Commit 1537038 from [~mikemccand] in branch 'dev/trunk'
[ https://svn.apache.org/r1537038 ]

LUCENE-5313: move preservePositionIncrements from setter to ctor

> Add "preservePositionIncrements" to AnalyzingSuggester and FuzzySuggester 
> constructors
> --
>
> Key: LUCENE-5313
> URL: https://issues.apache.org/jira/browse/LUCENE-5313
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Areek Zillur
> Fix For: 4.6, 5.0
>
> Attachments: LUCENE-5313.patch, LUCENE-5313.patch, LUCENE-5313.patch
>
>
> It would be convenient to have "preservePositionIncrements" in the suggesters 
> constructor, rather than having a setPreservePositionIncrements method. That 
> way it could be nicely used with the factory model already used by Solr.






[jira] [Commented] (LUCENE-5316) Taxonomy tree traversing improvement

2013-10-30 Thread Gilad Barkai (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13808968#comment-13808968
 ] 

Gilad Barkai commented on LUCENE-5316:
--

Thanks Mike, you're right on the money.

The RAM consumption is indeed an issue here.
I'm not sure that the parent array is used during search at all... and perhaps 
could be removed (looking into that one as well).
The youngestChild/olderSibling arrays could and should be replaced with either 
a more compact RAM representation or, in extreme cases, even an on-disk one.

For a better RAM representation, the idea of a map from ord -> int[] of its 
children is a start.
In such a case, we benefit from not holding a 'youngestChild' int for each 
ordinal in a flat dimension - those ordinals have no children.
Second, we benefit from locality of reference, as all the children are nearby 
and not spread over an array of millions. The non-huge flat dimensions will no 
longer suffer because of the other dimensions.
Also, it would make the worst case of an NRT reopen the same as the current 
cost (O(taxonomy-size)), but the cost might be very small if only a few 
ordinals were added, as only their 'family' would be reallocated and managed, 
rather than the entire array.

At a further phase, compression could be applied to that int[] of children: we 
know the children are in ascending order, so we could encode only the DGaps, 
find the largest DGap, and use packed ints instead of the int[]. This would add 
some (I hope) minor CPU cost to the loop, but would benefit greatly when it 
comes to RAM consumption.
I hope that all the logic could be encapsulated in the {{ChildrenIterator}} and 
the user will benefit from a clean API and better RAM utilization.

I'll post a patch shortly, which covers the very first part - hiding the 
implementation detail of children arrays (making 
TaxoReader.getParallelTaxoArrays protected to begin with), and moving 
{{TopKFacetResultHandler}} to use {{ChildrenIterator}}. 

Currently debugging some nasty loop-as-a-recursion related bug :)
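The DGap encoding described above can be sketched in a few lines. This is a hypothetical illustration, not the eventual patch: the class and method names are made up, and a real implementation would use Lucene's packed-ints machinery rather than a plain int[].

```java
/** Hypothetical sketch of delta-gap encoding for a sorted children array. */
public class ChildrenGaps {

    /** Encode ascending child ordinals as gaps from the previous value. */
    public static int[] toGaps(int[] children) {
        int[] gaps = new int[children.length];
        int prev = 0;
        for (int i = 0; i < children.length; i++) {
            gaps[i] = children[i] - prev;
            prev = children[i];
        }
        return gaps;
    }

    /** Decode gaps back to absolute ordinals (what an iterator would do on the fly). */
    public static int[] fromGaps(int[] gaps) {
        int[] children = new int[gaps.length];
        int prev = 0;
        for (int i = 0; i < gaps.length; i++) {
            prev += gaps[i];
            children[i] = prev;
        }
        return children;
    }

    /** Bits per value a fixed-width packed encoding would need for these values. */
    public static int bitsRequired(int[] values) {
        int max = 0;
        for (int v : values) max = Math.max(max, v);
        return Math.max(1, 32 - Integer.numberOfLeadingZeros(max));
    }
}
```

Since sibling ordinals tend to be close together, the gaps are usually much smaller than the absolute ordinals, so fewer bits per value are needed.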

> Taxonomy tree traversing improvement
> 
>
> Key: LUCENE-5316
> URL: https://issues.apache.org/jira/browse/LUCENE-5316
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Reporter: Gilad Barkai
>Priority: Minor
>
> The taxonomy traversing is done today utilizing the 
> {{ParallelTaxonomyArrays}}. In particular, two taxonomy-size {{int}} arrays 
> which hold, for each ordinal, its (array #1) youngest child and (array #2) 
> older sibling.
> This is a compact way of holding the tree information in memory, but it's not 
> perfect:
> * Large (8 bytes per ordinal in memory)
> * Exposes internal implementation
> * Utilizing these arrays for tree traversing is not straight forward
> * Loses reference locality while traversing (the array is accessed at 
> increasing-only entries, but they may be distant from one another)
> * In NRT, a reopen is always (not worst case) done at O(Taxonomy-size)
> This issue is about making the traversing more easy, the code more readable, 
> and open it for future improvements (i.e memory footprint and NRT cost) - 
> without changing any of the internals. 
> A later issue(s?) could be opened to address the gaps once this one is done.






[jira] [Updated] (SOLR-5403) Deduplicate multi-valued fields during atomic updates

2013-10-30 Thread Anshum Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anshum Gupta updated SOLR-5403:
---

Description: 
I think it'll be good to have a processor which de-duplicates multi-valued 
fields during atomic updates.

It might make sense to just have it in the current flow actually.

More context: 
http://lucene.472066.n3.nabble.com/Atomic-Updates-in-SOLR-td4098399.html

  was:
I think it'll be good to have a processor which de-duplicates multi-valued 
fields during atomic updates.

It might make sense to just have it in the current flow actually.




> Deduplicate multi-valued fields during atomic updates
> -
>
> Key: SOLR-5403
> URL: https://issues.apache.org/jira/browse/SOLR-5403
> Project: Solr
>  Issue Type: New Feature
>Reporter: Anshum Gupta
>
> I think it'll be good to have a processor which de-duplicates multi-valued 
> fields during atomic updates.
> It might make sense to just have it in the current flow actually.
> More context: 
> http://lucene.472066.n3.nabble.com/Atomic-Updates-in-SOLR-td4098399.html






[jira] [Created] (SOLR-5403) Deduplicate multi-valued fields during atomic updates

2013-10-30 Thread Anshum Gupta (JIRA)
Anshum Gupta created SOLR-5403:
--

 Summary: Deduplicate multi-valued fields during atomic updates
 Key: SOLR-5403
 URL: https://issues.apache.org/jira/browse/SOLR-5403
 Project: Solr
  Issue Type: New Feature
Reporter: Anshum Gupta


I think it'll be good to have a processor which de-duplicates multi-valued 
fields during atomic updates.

It might make sense to just have it in the current flow actually.
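As an illustrative sketch only (hypothetical class and method names, not Solr's actual UpdateRequestProcessor API or the eventual patch): such a processor would drop repeated values of a multi-valued field while preserving first-seen order, which an order-preserving set does directly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical illustration (not Solr's actual processor API): de-duplicate a
// multi-valued field's values while preserving first-seen order.
public class DedupFieldValuesSketch {
    static <T> List<T> dedup(List<T> values) {
        // LinkedHashSet drops repeats but keeps insertion order.
        return new ArrayList<>(new LinkedHashSet<>(values));
    }

    public static void main(String[] args) {
        System.out.println(dedup(Arrays.asList("red", "blue", "red", "green")));
        // prints [red, blue, green]
    }
}
```

A real processor would apply this per multi-valued field during the atomic-update merge, before the merged document is indexed.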








[jira] [Commented] (LUCENE-5313) Add "preservePositionIncrements" to AnalyzingSuggester and FuzzySuggester constructors

2013-10-30 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13808959#comment-13808959
 ] 

Michael McCandless commented on LUCENE-5313:


Thanks Areek, new patch looks great; I'll commit shortly.

> Add "preservePositionIncrements" to AnalyzingSuggester and FuzzySuggester 
> constructors
> --
>
> Key: LUCENE-5313
> URL: https://issues.apache.org/jira/browse/LUCENE-5313
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Areek Zillur
> Attachments: LUCENE-5313.patch, LUCENE-5313.patch, LUCENE-5313.patch
>
>
> It would be convenient to have "preservePositionIncrements" in the suggesters 
> constructor, rather than having a setPreservePositionIncrements method. That 
> way it could be nicely used with the factory model already used by Solr.






[jira] [Commented] (LUCENE-5316) Taxonomy tree traversing improvement

2013-10-30 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13808951#comment-13808951
 ] 

Michael McCandless commented on LUCENE-5316:


+1 to make the API more "abstract".

It would be nice if the new API could somehow allow for efficiently 
accommodating "flat" fields, where the only hierarchy is the dim's root (e.g., 
"Country/"), and then all values are immediately under that root.

Another use-case is private ords for certain dimensions (needed for LUCENE-5308).

Doesn't the current impl consume 3 ints (12 bytes) per ord?  (parents, 
children, siblings)

> Taxonomy tree traversing improvement
> 
>
> Key: LUCENE-5316
> URL: https://issues.apache.org/jira/browse/LUCENE-5316
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Reporter: Gilad Barkai
>Priority: Minor
>
> The taxonomy traversing is done today utilizing the 
> {{ParallelTaxonomyArrays}}. In particular, two taxonomy-size {{int}} arrays 
> which hold, for each ordinal, its (array #1) youngest child and (array #2) 
> older sibling.
> This is a compact way of holding the tree information in memory, but it's not 
> perfect:
> * Large (8 bytes per ordinal in memory)
> * Exposes internal implementation
> * Utilizing these arrays for tree traversing is not straightforward
> * Loses reference locality while traversing (the arrays are accessed at 
> increasing-only entries, but those entries may be distant from one another)
> * In NRT, a reopen is always (not just worst case) done at O(taxonomy-size)
> This issue is about making the traversing easier, the code more readable, 
> and opening it up for future improvements (i.e. memory footprint and NRT cost) - 
> without changing any of the internals. 
> A later issue(s?) could be opened to address the gaps once this one is done.






[jira] [Created] (LUCENE-5316) Taxonomy tree traversing improvement

2013-10-30 Thread Gilad Barkai (JIRA)
Gilad Barkai created LUCENE-5316:


 Summary: Taxonomy tree traversing improvement
 Key: LUCENE-5316
 URL: https://issues.apache.org/jira/browse/LUCENE-5316
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Gilad Barkai
Priority: Minor


The taxonomy traversing is done today utilizing the {{ParallelTaxonomyArrays}}. 
In particular, two taxonomy-size {{int}} arrays which hold, for each ordinal, 
its (array #1) youngest child and (array #2) older sibling.

This is a compact way of holding the tree information in memory, but it's not 
perfect:
* Large (8 bytes per ordinal in memory)
* Exposes internal implementation
* Utilizing these arrays for tree traversing is not straightforward
* Loses reference locality while traversing (the arrays are accessed at 
increasing-only entries, but those entries may be distant from one another)
* In NRT, a reopen is always (not just worst case) done at O(taxonomy-size)

This issue is about making the traversing easier, the code more readable, 
and opening it up for future improvements (i.e. memory footprint and NRT cost) - 
without changing any of the internals. 
A later issue(s?) could be opened to address the gaps once this one is done.
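As an illustrative sketch only (hypothetical names, not Lucene's actual {{ParallelTaxonomyArrays}} API): with a youngest-child array and an older-sibling array, enumerating an ordinal's children means starting at its youngest child and following the sibling chain until it ends, which is the jumping-around access pattern described above.

```java
import java.util.Arrays;

// Hypothetical sketch of traversal over the two parallel arrays described
// above: youngestChild[ord] holds ord's youngest child (or -1 if none), and
// olderSibling[ord] holds ord's next-older sibling (or -1 at chain's end).
public class ChildrenSketch {
    static final int INVALID = -1;

    // Collect all children of 'parent': start at its youngest child, then
    // follow the older-sibling chain until it terminates.
    static int[] childrenOf(int[] youngestChild, int[] olderSibling, int parent) {
        int count = 0;
        for (int c = youngestChild[parent]; c != INVALID; c = olderSibling[c]) count++;
        int[] result = new int[count];
        int i = 0;
        for (int c = youngestChild[parent]; c != INVALID; c = olderSibling[c]) result[i++] = c;
        return result;
    }

    public static void main(String[] args) {
        // Tiny taxonomy: root=0 has children 1 and 2; ordinal 2 has child 3.
        int[] youngestChild = {2, INVALID, 3, INVALID};
        int[] olderSibling  = {INVALID, INVALID, 1, INVALID};
        System.out.println(Arrays.toString(childrenOf(youngestChild, olderSibling, 0)));
        // prints [2, 1] -- youngest child first, then its older sibling
    }
}
```

An iterator abstraction over this chain (like the {{ChildrenIterator}} mentioned in the comments) would hide both arrays behind a simple next()-style API.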






[jira] [Updated] (LUCENE-5217) disable transitive dependencies in maven config

2013-10-30 Thread Steve Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe updated LUCENE-5217:
---

Attachment: LUCENE-5217.patch

New patch, almost there.

Changes:

* {{ivy.xml}} files and {{ivy-.xml}} files from the Ivy cache are now 
parsed with DOM+XPath instead of SAX.
* All tests pass under the maven build using the filtered POMs.
* More javadocs added.
* The filtered grandparent POM is now over 5,000 lines long (previously about 
900 lines), since, faithful to the Ant build, each Solr module depends on all 
Solr core, solrj, and example dependencies, and each one of those has to be 
excluded to thwart transitive dependency resolution.  Blech.

Left to do: verify that {{generate-maven-artifacts}} works - I haven't tried it 
yet.


> disable transitive dependencies in maven config
> ---
>
> Key: LUCENE-5217
> URL: https://issues.apache.org/jira/browse/LUCENE-5217
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
> Attachments: LUCENE-5217.patch, LUCENE-5217.patch
>
>
> Our ivy configuration does this: each dependency is specified and so we know 
> what will happen. Unfortunately the maven setup is not configured the same 
> way.
> Instead the maven setup is configured to download the internet: and it 
> excludes certain things specifically.
> This is really hard to configure and maintain: we added a 
> 'validate-maven-dependencies' target that tries to fail on any extra jars, but all 
> it really does is run a license check after maven "runs". It wouldn't find 
> unnecessary dependencies being dragged in if something else in lucene was 
> using them and thus they had a license file.
> Since maven supports wildcard exclusions (MNG-3832), we can disable this 
> transitive shit completely.
> We should do this, so its configuration is the exact parallel of ivy.
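For illustration, a wildcard exclusion of the kind MNG-3832 enables looks roughly like the fragment below (a sketch only, with a made-up dependency, not the actual contents of the attached patch): every transitive dependency of the declared artifact is excluded, so only jars listed explicitly in the POM are resolved, mirroring ivy's transitive="false" behavior.

```xml
<!-- Illustrative sketch, not the LUCENE-5217 patch: a wildcard exclusion
     (MNG-3832) suppresses all transitive dependencies of this artifact. -->
<dependency>
  <groupId>org.apache.solr</groupId>
  <artifactId>solr-core</artifactId>
  <version>${project.version}</version>
  <exclusions>
    <exclusion>
      <groupId>*</groupId>
      <artifactId>*</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

Before wildcard support, each unwanted transitive artifact had to be excluded by name, which is what inflates the filtered grandparent POM described in the comment above.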


