Re: subtle Solr classloader problem

2013-05-27 Thread Robert Muir
This is not a bug. It's a broken config.
On May 26, 2013 11:51 AM, Shawn Heisey s...@elyograg.org wrote:

 While looking into SOLR-4852 and testing every conceivable lib
 permutation, I ran across a second problem. I'd like to know whether it
 should be considered a bug.


https://issues.apache.org/jira/browse/SOLR-4852?focusedCommentId=13667025#comment-13667025

 What I was trying to do here was split my required jars between
 ${solr.solr.home}/lib and ${solr.solr.home}/foo ... the former directory
 is automatically used for libraries, the latter was added by
 sharedLib="foo" in my solr.xml.  Should this be a valid configuration?
 If not, perhaps we need to stop automatically including
 ${solr.solr.home}/lib.

 I run into the same problem (unable to find the ICUTokenizer class)
 whenever I split my jars, even though the icu analysis jar was not the
 jar that I moved.  When I first tried it, I moved the icu4j jar, but the
 exact same problem occurs when I move the mysql jar, which has
 nothing at all to do with ICU.

 Here's a Solr log (on an unpatched branch_4x) from when I moved the
 mysql jar from lib to foo.  You can see the jars that get loaded, so
 this should not be happening:

 http://apaste.info/6aK5

 If all the jars are in either lib or foo, everything works.

 Is this behavior a bug?  I am starting to think that this problem and
 the original SOLR-4852 issue are actually the same problem, and that it
 may not be a duplicate jar problem, but rather something specific and
 subtle with the ICU analysis components that happens when the
 classloader is replaced.

 Thanks,
 Shawn

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4470) Support for basic http auth in internal solr requests

2013-05-27 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13667673#comment-13667673
 ] 

Per Steffensen commented on SOLR-4470:
--

bq. Could make an implementation which reads a system-basic-auth.properties 
from ZK (given your ZK is encrypted). That would be useful once the other issue 
about securing ZK is solved.

Guess you are talking about SOLR-4580. It is not about encryption (neither 
storage- nor transport-level). It is about authentication and authorization at 
application/API-level. But you are right, it is an option to build on top of 
this issue and allow for those internal credentials to live in ZK - just make 
sure the security issues involved in doing that are dealt with.

bq. Q: Would it make sense to make CurrentInternalRequestFactory or 
InterSolrNodeAuthCredentialsFactory pluggable through solr.xml? Currently you 
need to patch and build Solr to change it, right?

Yes, you need to patch and rebuild. But that is because I did not include as 
much code in the patch as I wanted to, and as I did for SOLR-4580. In the patch 
for SOLR-4580 I've included code so that you, through JVM params, can specify 
the name of a class which will be used as credentials/ACL-provider. The same 
should have been done in the patch for this SOLR-4470: I ought to have included 
code so that you, through JVM params, can point out the classes you want to be 
used as credentials-providers. Basically, JVM params that control 
which implementations of InterSolrNodeAuthCredentialsFactory.SubRequestFactory 
and InterSolrNodeAuthCredentialsFactory.InternalRequestFactory are to be used by 
default for InterSolrNodeAuthCredentialsFactory.CurrentSubRequestFactory and 
InterSolrNodeAuthCredentialsFactory.CurrentInternalRequestFactory.

JVM params are the simplest way to control which implementations are used 
behind the interfaces. That is, in my opinion, what should have been included 
here. Going from control through JVM params to adding support for control 
through solr.xml or something else should be another issue, but it is certainly 
a good and valid idea.
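
For illustration, here is a minimal sketch of the kind of JVM-param-driven wiring 
being described; the property name and the default class below are hypothetical 
placeholders, not names taken from the patch:

{code}
// Illustrative sketch only: resolve a pluggable factory implementation from a
// JVM param. The property name and default class are hypothetical placeholders.
public class FactoryResolver {
  public static Object resolveInternalRequestFactory() throws Exception {
    String className = System.getProperty(
        "solr.internalRequestFactory",                 // hypothetical JVM param
        "org.example.DefaultInternalRequestFactory");  // hypothetical default impl
    // Load the named class and instantiate it via its no-arg constructor.
    return Class.forName(className).newInstance();
  }
}
{code}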

{quote}
bq. Solr is not enforcing security - the webcontainer or something else is. 

A comment right in the patch explains this is not the case - it says the web 
container authorizes any role and then new Solr code is responsible for dealing 
with role authorization. This is Solr code that can introduce security bugs. 
This is the slippery slope, this is the fuzzy line, this is the creep.
{quote}

Well, you did not point out the exact comment as I asked you to, so I will have 
to guess a little. The code going into the real non-test part of Solr does not 
do anything to enforce security. It only enables a Solr admin to configure 
Solr-nodes so that their inter-communication will still work if the 
Solr admin chooses to set up e.g. container-managed security.

In order to claim that my solution solves the problem, I want to test that it 
does. Test strategy: set up container-managed security and verify that all 
inter-Solr-node communication works if you use my solution. So the test-code 
sets up container-managed security, and in there, there is a comment about just 
letting the container manage authentication and handle authorization in a 
filter. But this is all a simulation of what the Solr admin decided to do to 
set up security. This is test only!

{quote}
bq. Personally I do not understand why a serious project would stay out of 
security

It's simply the current stance of the project
{quote}

Well, I haven't been in the meetings or whatever where this stance was 
established, but I would imagine that this stance is about Solr not going down 
the path of enforcing or controlling or ... security. I couldn't imagine that 
this stance means we would not want a SolrCloud cluster to work if a 
Solr admin chooses to activate third-party security in a very common way.

 Support for basic http auth in internal solr requests
 -

 Key: SOLR-4470
 URL: https://issues.apache.org/jira/browse/SOLR-4470
 Project: Solr
  Issue Type: New Feature
  Components: clients - java, multicore, replication (java), SolrCloud
Affects Versions: 4.0
Reporter: Per Steffensen
Assignee: Jan Høydahl
  Labels: authentication, https, solrclient, solrcloud, ssl
 Fix For: 4.4

 Attachments: SOLR-4470_branch_4x_r1452629.patch, 
 SOLR-4470_branch_4x_r1452629.patch, SOLR-4470_branch_4x_r145.patch, 
 SOLR-4470.patch


 We want to protect any HTTP-resource (url). We want to require credentials no 
 matter what kind of HTTP-request you make to a Solr-node.
 It can fairly easily be achieved as described on 
 http://wiki.apache.org/solr/SolrSecurity. The problem is that Solr-nodes 
 also make 

[jira] [Comment Edited] (SOLR-4470) Support for basic http auth in internal solr requests

2013-05-27 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13667673#comment-13667673
 ] 

Per Steffensen edited comment on SOLR-4470 at 5/27/13 10:05 AM:


bq. Could make an implementation which reads a system-basic-auth.properties 
from ZK (given your ZK is encrypted). That would be useful once the other issue 
about securing ZK is solved.

Guess you are talking about SOLR-4580. It is not about encryption (neither 
storage- nor transport-level). It is about authentication and authorization at 
application/API-level. But you are right, it is an option to build on top of 
this issue and allow for those internal credentials to live in ZK - just make 
sure the security issues involved in doing that are dealt with.

bq. Q: Would it make sense to make CurrentInternalRequestFactory or 
InterSolrNodeAuthCredentialsFactory pluggable through solr.xml? Currently you 
need to patch and build Solr to change it, right?

Yes, you need to patch and rebuild. But that is because I did not include as 
much code in the patch as I wanted to, and as I did for SOLR-4580. In the patch 
for SOLR-4580 I've included code so that you, through JVM params, can specify 
the name of a class which will be used as credentials/ACL-provider. The same 
should have been done in the patch for this SOLR-4470: I ought to have included 
code so that you, through JVM params, can point out the classes you want to be 
used as credentials-providers. Basically, JVM params that control 
which implementations of InterSolrNodeAuthCredentialsFactory.SubRequestFactory 
and InterSolrNodeAuthCredentialsFactory.InternalRequestFactory are to be used by 
default for InterSolrNodeAuthCredentialsFactory.CurrentSubRequestFactory and 
InterSolrNodeAuthCredentialsFactory.CurrentInternalRequestFactory.

JVM params are the simplest way to control which implementations are used 
behind the interfaces. That is, in my opinion, what should have been included 
here. Going from control through JVM params to adding support for control 
through solr.xml or something else should be another issue, but it is certainly 
a good and valid idea.

{quote}
bq. Solr is not enforcing security - the webcontainer or something else is. 

A comment right in the patch explains this is not the case - it says the web 
container authorizes any role and then new Solr code is responsible for dealing 
with role authorization. This is Solr code that can introduce security bugs. 
This is the slippery slope, this is the fuzzy line, this is the creep.
{quote}

Well, you did not point out the exact comment as I asked you to, so I will have 
to guess a little. The code going into the real non-test part of Solr does not 
do anything to enforce security. It only enables a Solr admin to configure 
Solr-nodes so that their inter-communication will still work if the 
Solr admin chooses to set up e.g. container-managed security.

In order to claim that my solution solves the problem, I want to test that it 
does. Test strategy: set up container-managed security and verify that all 
inter-Solr-node communication works if you use my solution. So the test-code 
sets up container-managed security, and in there, there is a comment about just 
letting the container manage authentication and handle authorization in a 
filter. But this is all a simulation of what the Solr admin decided to do to 
set up security. This is test only!

{quote}
bq. Personally I do not understand why a serious project would stay out of 
security

It's simply the current stance of the project
{quote}

Well, I haven't been in the meetings or whatever where this stance was 
established, but I would imagine that this stance is about Solr not going down 
the path of enforcing or controlling or ... security. I couldn't imagine that 
this stance means we would not want a SolrCloud cluster to work if a 
Solr admin chooses to activate third-party security in a very common way.

  was (Author: steff1193):
bq. Could make an implementation which reads a system-basic-auth.properties 
from ZK (given your ZK is encrypted). That would be useful once the other issue 
about securing ZK is solved.

Guess you are talking about SOLR-4580. It is not about encryption (neither 
storage- nor transport-level). It is about authentication and authorization at 
application/API-level. But you are right, it is an option to build on top of 
this issue and allow for those internal credentials to live in ZK - just make 
sure the security issues doing that is dealt with.

bq. Q: Would it make sense to make CurrentInternalRequestFactory or 
InterSolrNodeAuthCredentialsFactory pluggable through solr.xml? Currently you 
need to patch and build Solr to change it, right?

Yes you need to patch and rebuild. But that is because it did not include as 
much code in the patch as I wanted to, and as I did for 

[JENKINS] Lucene-Solr-4.x-MacOSX (64bit/jdk1.6.0) - Build # 496 - Still Failing!

2013-05-27 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-MacOSX/496/
Java: 64bit/jdk1.6.0 -XX:-UseCompressedOops -XX:+UseSerialGC

1 tests failed.
FAILED:  org.apache.lucene.replicator.http.HttpReplicatorTest.testBasic

Error Message:
Connection to http://localhost:51513 refused

Stack Trace:
org.apache.http.conn.HttpHostConnectException: Connection to 
http://localhost:51513 refused
at 
__randomizedtesting.SeedInfo.seed([D61BF819C5EE930E:7DE1E50C1A321520]:0)
at 
org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:190)
at 
org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:294)
at 
org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:645)
at 
org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:480)
at 
org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
at 
org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
at 
org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
at 
org.apache.lucene.replicator.http.HttpClientBase.executeGET(HttpClientBase.java:178)
at 
org.apache.lucene.replicator.http.HttpReplicator.checkForUpdate(HttpReplicator.java:51)
at 
org.apache.lucene.replicator.ReplicationClient.doUpdate(ReplicationClient.java:196)
at 
org.apache.lucene.replicator.ReplicationClient.updateNow(ReplicationClient.java:402)
at 
org.apache.lucene.replicator.http.HttpReplicatorTest.testBasic(HttpReplicatorTest.java:112)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
at 

[jira] [Commented] (SOLR-4744) Version conflict error during shard split test

2013-05-27 Thread Anshum Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13667683#comment-13667683
 ] 

Anshum Gupta commented on SOLR-4744:


Looks fine to me, other than one small change which I don't think is part of 
your patch but would be good to fix.

DistributedUpdateProcessor.updateAdd(): Line 404

{quote}
 if (isLeader) {
   params.set(distrib.from, ZkCoreNodeProps.getCoreUrl(
   zkController.getBaseUrl(), req.getCore().getName()));
   }

   params.set(distrib.from, ZkCoreNodeProps.getCoreUrl(
   zkController.getBaseUrl(), req.getCore().getName()));
{quote}
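
One way to read the suggestion above is that the call inside the if-block duplicates 
the unconditional call that follows, so a single call would suffice. A minimal sketch 
of that reading, using the names from the quoted snippet (the parameter name is written 
as a plain string here; the actual class may use a constant for it):

{code}
import org.apache.solr.common.cloud.ZkCoreNodeProps;
import org.apache.solr.common.params.ModifiableSolrParams;

// Hedged sketch only, not a verified excerpt of DistributedUpdateProcessor:
// distrib.from is set exactly once instead of in both places.
class DistribFromSketch {
  static void setDistribFrom(ModifiableSolrParams params,
                             String baseUrl, String coreName) {
    params.set("distrib.from", ZkCoreNodeProps.getCoreUrl(baseUrl, coreName));
  }
}
{code}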

 Version conflict error during shard split test
 --

 Key: SOLR-4744
 URL: https://issues.apache.org/jira/browse/SOLR-4744
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.3
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
Priority: Minor
 Fix For: 4.4

 Attachments: SOLR-4744.patch


 ShardSplitTest fails sometimes with the following error:
 {code}
 [junit4:junit4]   1 INFO  - 2013-04-14 19:05:26.861; 
 org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update shard state 
 invoked for collection: collection1
 [junit4:junit4]   1 INFO  - 2013-04-14 19:05:26.861; 
 org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update shard state shard1 
 to inactive
 [junit4:junit4]   1 INFO  - 2013-04-14 19:05:26.861; 
 org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update shard state 
 shard1_0 to active
 [junit4:junit4]   1 INFO  - 2013-04-14 19:05:26.861; 
 org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update shard state 
 shard1_1 to active
 [junit4:junit4]   1 INFO  - 2013-04-14 19:05:26.873; 
 org.apache.solr.update.processor.LogUpdateProcessor; [collection1] webapp= 
 path=/update params={wt=javabinversion=2} {add=[169 (1432319507166134272)]} 
 0 2
 [junit4:junit4]   1 INFO  - 2013-04-14 19:05:26.877; 
 org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: 
 WatchedEvent state:SyncConnected type:NodeDataChanged 
 path:/clusterstate.json, has occurred - updating... (live nodes size: 5)
 [junit4:junit4]   1 INFO  - 2013-04-14 19:05:26.877; 
 org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: 
 WatchedEvent state:SyncConnected type:NodeDataChanged 
 path:/clusterstate.json, has occurred - updating... (live nodes size: 5)
 [junit4:junit4]   1 INFO  - 2013-04-14 19:05:26.877; 
 org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: 
 WatchedEvent state:SyncConnected type:NodeDataChanged 
 path:/clusterstate.json, has occurred - updating... (live nodes size: 5)
 [junit4:junit4]   1 INFO  - 2013-04-14 19:05:26.877; 
 org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: 
 WatchedEvent state:SyncConnected type:NodeDataChanged 
 path:/clusterstate.json, has occurred - updating... (live nodes size: 5)
 [junit4:junit4]   1 INFO  - 2013-04-14 19:05:26.877; 
 org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: 
 WatchedEvent state:SyncConnected type:NodeDataChanged 
 path:/clusterstate.json, has occurred - updating... (live nodes size: 5)
 [junit4:junit4]   1 INFO  - 2013-04-14 19:05:26.877; 
 org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: 
 WatchedEvent state:SyncConnected type:NodeDataChanged 
 path:/clusterstate.json, has occurred - updating... (live nodes size: 5)
 [junit4:junit4]   1 INFO  - 2013-04-14 19:05:26.884; 
 org.apache.solr.update.processor.LogUpdateProcessor; 
 [collection1_shard1_1_replica1] webapp= path=/update 
 params={distrib.from=http://127.0.0.1:41028/collection1/update.distrib=FROMLEADERwt=javabindistrib.from.parent=shard1version=2}
  {} 0 1
 [junit4:junit4]   1 INFO  - 2013-04-14 19:05:26.885; 
 org.apache.solr.update.processor.LogUpdateProcessor; [collection1] webapp= 
 path=/update 
 params={distrib.from=http://127.0.0.1:41028/collection1/update.distrib=FROMLEADERwt=javabindistrib.from.parent=shard1version=2}
  {add=[169 (1432319507173474304)]} 0 2
 [junit4:junit4]   1 ERROR - 2013-04-14 19:05:26.885; 
 org.apache.solr.common.SolrException; shard update error StdNode: 
 http://127.0.0.1:41028/collection1_shard1_1_replica1/:org.apache.solr.common.SolrException:
  version conflict for 169 expected=1432319507173474304 actual=-1
 [junit4:junit4]   1  at 
 org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:404)
 [junit4:junit4]   1  at 
 org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
 [junit4:junit4]   1  at 
 org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:332)
 [junit4:junit4]   1  at 
 org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:306)
 [junit4:junit4]   1  at 
 

A strange RemoteSolrException

2013-05-27 Thread Hans-Peter Stricker
Hello,

I'm writing my first little Solrj program, but I don't get it running because of 
a RemoteSolrException: Server at http://localhost:8983/solr returned non ok 
status:404

The server is definitely running and the url works in the browser.

I am working with Solr 4.3.0.

This is my source code:

public static void main(String[] args) {

    String url = "http://localhost:8983/solr";
    SolrServer server;

    try {
        server = new HttpSolrServer(url);
        server.ping();
    } catch (Exception ex) {
        ex.printStackTrace();
    }
}

with the stack trace:

org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Server at 
http://localhost:8983/solr returned non ok status:404, message:Not Found
 at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:372)
 at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
 at org.apache.solr.client.solrj.request.SolrPing.process(SolrPing.java:62)
 at org.apache.solr.client.solrj.SolrServer.ping(SolrServer.java:293)
 at de.epublius.blogindexer.App.main(App.java:47)

If I call server.shutdown(), there is no such exception, but it occurs for almost 
all other SolrServer methods.

What am I doing wrong?

Thanks in advance

Hans-Peter

[jira] [Updated] (LUCENE-5013) ScandinavianInterintelligableASCIIFoldingFilter

2013-05-27 Thread Karl Wettin (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wettin updated LUCENE-5013:


Attachment: LUCENE-5013-6.txt

It's all good now.

Thanks for the help and input, everybody. Have fun, and I hope someone else but 
me finds this useful.

 ScandinavianInterintelligableASCIIFoldingFilter
 ---

 Key: LUCENE-5013
 URL: https://issues.apache.org/jira/browse/LUCENE-5013
 Project: Lucene - Core
  Issue Type: New Feature
  Components: modules/analysis
Affects Versions: 4.3
Reporter: Karl Wettin
Priority: Trivial
 Attachments: LUCENE-5013-2.txt, LUCENE-5013-3.txt, LUCENE-5013-4.txt, 
 LUCENE-5013-5.txt, LUCENE-5013-6.txt, LUCENE-5013.txt


 This filter is an augmentation of the output from ASCIIFoldingFilter;
 it discriminates against the double vowels aa, ae, ao, oe and oo, leaving just the 
 first one.
 blåbærsyltetøj == blåbärsyltetöj == blaabaarsyltetoej == blabarsyltetoj
 räksmörgås == ræksmørgås == ræksmörgaos == raeksmoergaas == raksmorgas
 Caveats:
 Since this is a filtering on top of ASCIIFoldingFilter, äöåøæ has already been 
 folded down to a, o, a, o, ae by the time this filter handles it, which will cause 
 effects such as:
 bøen - boen - bon
 åene - aene - ane
 I find this to be a trivial problem compared to not finding anything at all.
 Background:
 Swedish åäö are in fact the same letters as Norwegian and Danish åæø and thus 
 interchangeable when used between these languages. They are however folded 
 differently when people type them on a keyboard lacking these characters, and 
 ASCIIFoldingFilter handles ä and æ differently.
 When a Swedish person is lacking umlauted characters on the keyboard they 
 consistently type a, a, o instead of å, ä, ö. Foreigners also tend to use a, 
 a, o.
 In Norway people tend to type aa, ae and oe instead of å, æ and ø. Some use 
 a, a, o. I've also seen oo, ao, etc. And permutations. Not sure about Denmark, 
 but the pattern is probably the same.
 This filter solves that problem, but might also cause new ones.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (LUCENE-5013) ScandinavianInterintelligableASCIIFoldingFilter

2013-05-27 Thread Karl Wettin (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13667708#comment-13667708
 ] 

Karl Wettin edited comment on LUCENE-5013 at 5/27/13 11:45 AM:
---

It's all good now.

Thanks for the help and input, everybody. Have fun, and I hope someone else but 
me finds this useful.

  was (Author: karl.wettin):
It's all good now.

Thanks for the help and input, everybody. Have fun, and I hope someone else buy 
me finds this useful.
  
 ScandinavianInterintelligableASCIIFoldingFilter
 ---

 Key: LUCENE-5013
 URL: https://issues.apache.org/jira/browse/LUCENE-5013
 Project: Lucene - Core
  Issue Type: New Feature
  Components: modules/analysis
Affects Versions: 4.3
Reporter: Karl Wettin
Priority: Trivial
 Attachments: LUCENE-5013-2.txt, LUCENE-5013-3.txt, LUCENE-5013-4.txt, 
 LUCENE-5013-5.txt, LUCENE-5013-6.txt, LUCENE-5013.txt


 This filter is an augmentation of the output from ASCIIFoldingFilter;
 it discriminates against the double vowels aa, ae, ao, oe and oo, leaving just the 
 first one.
 blåbærsyltetøj == blåbärsyltetöj == blaabaarsyltetoej == blabarsyltetoj
 räksmörgås == ræksmørgås == ræksmörgaos == raeksmoergaas == raksmorgas
 Caveats:
 Since this is a filtering on top of ASCIIFoldingFilter, äöåøæ has already been 
 folded down to a, o, a, o, ae by the time this filter handles it, which will cause 
 effects such as:
 bøen - boen - bon
 åene - aene - ane
 I find this to be a trivial problem compared to not finding anything at all.
 Background:
 Swedish åäö are in fact the same letters as Norwegian and Danish åæø and thus 
 interchangeable when used between these languages. They are however folded 
 differently when people type them on a keyboard lacking these characters, and 
 ASCIIFoldingFilter handles ä and æ differently.
 When a Swedish person is lacking umlauted characters on the keyboard they 
 consistently type a, a, o instead of å, ä, ö. Foreigners also tend to use a, 
 a, o.
 In Norway people tend to type aa, ae and oe instead of å, æ and ø. Some use 
 a, a, o. I've also seen oo, ao, etc. And permutations. Not sure about Denmark, 
 but the pattern is probably the same.
 This filter solves that problem, but might also cause new ones.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-4470) Support for basic http auth in internal solr requests

2013-05-27 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13667712#comment-13667712
 ] 

Jan Høydahl edited comment on SOLR-4470 at 5/27/13 12:02 PM:
-

bq. JVM params is the simplest way to control which implementations to be used 
behind the interfaces. That is, in my opinion, what should have been included 
here. Going from control through JVM params and adding support for control 
through solr.xml or something else should be another issue, but it is certainly 
a good and valid idea.

The way we normally set up {{solr.xml}} configs is with 
<mytag>$\{my.jvm.param\}</mytag> style, so the admin can choose whether to pass the 
option as a JVM param or include it in solr.xml directly. Something like

{code:xml}
<solr>
  ...
  <subRequestFactory class="${solr.subRequestFactory}" />
  <internalRequestFactory class="${solr.internalRequestFactory}" />
</solr>
{code}


Regarding [~markrmil...@gmail.com]'s concern with authorization creep, I agree to 
some extent. But since, as you say, this is test-code only, let's move 
the class {{RegExpAuthorizationFilter}} from the runtime codebase into the test 
framework. That way, it is clear that it is only used for realistic test 
coverage. And if anyone wishes to set up something similar in their production 
they may borrow code from the test class, but it will be a manual step 
reinforcing that this is not a supported feature of the project as such.

  was (Author: janhoy):
bq. JVM params is the simplest way to control which implementations to be 
used behind the interfaces. That is, in my opinion, what should have been 
included here. Going from control through JVM params and adding support for 
control through solr.xml or something else should be another issue, but it is 
certainly a good and valid idea.

The way we normally set up {{solr.xml}} configs is with 
<mytag>$\{my.jvm.param\}</mytag> style, so the admin can choose whether to pass the 
option as a JVM param or include it in solr.xml directly. Something like

{code:xml}
<solr>
  ...
  <subRequestFactory class="${solr.subRequestFactory}" />
  <internalRequestFactory class="${solr.internalRequestFactory}" />
</solr>
{code}


Regarding [~markrmil...@gmail.com]'s concern with authorization creep, I to 
some extent agree. But since, as you say, this is test-code only, let's move 
the class {{RegExpAuthorizationFilter}} from runtime codebase and into the test 
framework. In that way, it is clear that it is only used for realistic test 
coverage. And if anyone wishes to setup a similar setup in their production 
they may borrow code from the test class, but it will be a manual step 
reinforcing that this is not something that the project *supports* - but *if* 
someone sets up this in the container, 
  
 Support for basic http auth in internal solr requests
 -

 Key: SOLR-4470
 URL: https://issues.apache.org/jira/browse/SOLR-4470
 Project: Solr
  Issue Type: New Feature
  Components: clients - java, multicore, replication (java), SolrCloud
Affects Versions: 4.0
Reporter: Per Steffensen
Assignee: Jan Høydahl
  Labels: authentication, https, solrclient, solrcloud, ssl
 Fix For: 4.4

 Attachments: SOLR-4470_branch_4x_r1452629.patch, 
 SOLR-4470_branch_4x_r1452629.patch, SOLR-4470_branch_4x_r145.patch, 
 SOLR-4470.patch


 We want to protect any HTTP-resource (url). We want to require credentials no 
 matter what kind of HTTP-request you make to a Solr-node.
 It can fairly easily be achieved as described on 
 http://wiki.apache.org/solr/SolrSecurity. The problem is that Solr-nodes 
 also make internal requests to other Solr-nodes, and for those to work 
 credentials need to be provided as well.
 Ideally we would like to forward credentials from a particular request to 
 all the internal sub-requests it triggers, e.g. for search and update 
 requests.
 But there are also internal requests
 * that are only indirectly/asynchronously triggered from outside requests (e.g. 
 shard creation/deletion/etc based on calls to the Collection API)
 * that do not in any way have a relation to an outside super-request (e.g. 
 replica synching stuff)
 We would like to aim at a solution where the original credentials are 
 forwarded when a request directly/synchronously triggers a subrequest, with a 
 fallback to configured internal credentials for the 
 asynchronous/non-rooted requests.
 In our solution we aim at only supporting basic http auth, but we would 
 like to make a framework around it, so that not too much refactoring is 
 needed if you later want to add support for other kinds of auth (e.g. digest).
 We will work on a solution but create this JIRA issue early in order to get 
 input/comments from the community as early as 

[jira] [Commented] (SOLR-4470) Support for basic http auth in internal solr requests

2013-05-27 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13667712#comment-13667712
 ] 

Jan Høydahl commented on SOLR-4470:
---

bq. JVM params is the simplest way to control which implementations to be used 
behind the interfaces. That is, in my opinion, what should have been included 
here. Going from control through JVM params and adding support for control 
through solr.xml or something else should be another issue, but it is certainly 
a good and valid idea.

The way we normally set up {{solr.xml}} configs is with 
<mytag>$\{my.jvm.param\}</mytag> style, so the admin can choose whether to pass the 
option as a JVM param or include it in solr.xml directly. Something like

{code:xml}
<solr>
  ...
  <subRequestFactory class="${solr.subRequestFactory}" />
  <internalRequestFactory class="${solr.internalRequestFactory}" />
</solr>
{code}


Regarding [~markrmil...@gmail.com]'s concern with authorization creep, I agree to 
some extent. But since, as you say, this is test-code only, let's move 
the class {{RegExpAuthorizationFilter}} from the runtime codebase into the test 
framework. That way, it is clear that it is only used for realistic test 
coverage. And if anyone wishes to set up something similar in their production 
they may borrow code from the test class, but it will be a manual step 
reinforcing that this is not something that the project *supports* - but *if* 
someone sets up this in the container, 

 Support for basic http auth in internal solr requests
 -

 Key: SOLR-4470
 URL: https://issues.apache.org/jira/browse/SOLR-4470
 Project: Solr
  Issue Type: New Feature
  Components: clients - java, multicore, replication (java), SolrCloud
Affects Versions: 4.0
Reporter: Per Steffensen
Assignee: Jan Høydahl
  Labels: authentication, https, solrclient, solrcloud, ssl
 Fix For: 4.4

 Attachments: SOLR-4470_branch_4x_r1452629.patch, 
 SOLR-4470_branch_4x_r1452629.patch, SOLR-4470_branch_4x_r145.patch, 
 SOLR-4470.patch


 We want to protect any HTTP-resource (url). We want to require credentials no 
 matter what kind of HTTP-request you make to a Solr-node.
 It can fairly easily be achieved as described on 
 http://wiki.apache.org/solr/SolrSecurity. The problem is that Solr-nodes 
 also make internal requests to other Solr-nodes, and for those to work 
 credentials need to be provided as well.
 Ideally we would like to forward credentials from a particular request to 
 all the internal sub-requests it triggers, e.g. for search and update 
 requests.
 But there are also internal requests
 * that are only indirectly/asynchronously triggered from outside requests (e.g. 
 shard creation/deletion/etc based on calls to the Collection API)
 * that do not in any way have a relation to an outside super-request (e.g. 
 replica synching stuff)
 We would like to aim at a solution where the original credentials are 
 forwarded when a request directly/synchronously triggers a subrequest, with a 
 fallback to configured internal credentials for the 
 asynchronous/non-rooted requests.
 In our solution we aim at only supporting basic http auth, but we would 
 like to make a framework around it, so that not too much refactoring is 
 needed if you later want to add support for other kinds of auth (e.g. digest).
 We will work on a solution but create this JIRA issue early in order to get 
 input/comments from the community as early as possible.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5013) ScandinavianInterintelligableASCIIFoldingFilter

2013-05-27 Thread JIRA

[ 
https://issues.apache.org/jira/browse/LUCENE-5013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13667713#comment-13667713
 ] 

Jan Høydahl commented on LUCENE-5013:
-

Can you upload the patch as LUCENE-5013.patch? That's the standard naming 
convention around here :)

 ScandinavianInterintelligableASCIIFoldingFilter
 ---

 Key: LUCENE-5013
 URL: https://issues.apache.org/jira/browse/LUCENE-5013
 Project: Lucene - Core
  Issue Type: New Feature
  Components: modules/analysis
Affects Versions: 4.3
Reporter: Karl Wettin
Priority: Trivial
 Attachments: LUCENE-5013-2.txt, LUCENE-5013-3.txt, LUCENE-5013-4.txt, 
 LUCENE-5013-5.txt, LUCENE-5013-6.txt, LUCENE-5013.txt


 This filter is an augmentation of the output from ASCIIFoldingFilter;
 it discriminates against the double vowels aa, ae, ao, oe and oo, leaving just the 
 first one.
 blåbærsyltetøj == blåbärsyltetöj == blaabaarsyltetoej == blabarsyltetoj
 räksmörgås == ræksmørgås == ræksmörgaos == raeksmoergaas == raksmorgas
 Caveats:
 Since this is a filtering on top of ASCIIFoldingFilter, äöåøæ has already been 
 folded down to a, o, a, o, ae by the time this filter handles it, which will cause 
 effects such as:
 bøen - boen - bon
 åene - aene - ane
 I find this to be a trivial problem compared to not finding anything at all.
 Background:
 Swedish åäö are in fact the same letters as Norwegian and Danish åæø and thus 
 interchangeable when used between these languages. They are however folded 
 differently when people type them on a keyboard lacking these characters, and 
 ASCIIFoldingFilter handles ä and æ differently.
 When a Swedish person is lacking umlauted characters on the keyboard they 
 consistently type a, a, o instead of å, ä, ö. Foreigners also tend to use a, 
 a, o.
 In Norway people tend to type aa, ae and oe instead of å, æ and ø. Some use 
 a, a, o. I've also seen oo, ao, etc. And permutations. Not sure about Denmark, 
 but the pattern is probably the same.
 This filter solves that problem, but might also cause new ones.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5013) ScandinavianInterintelligableASCIIFoldingFilter

2013-05-27 Thread Karl Wettin (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wettin updated LUCENE-5013:


Attachment: LUCENE-5013.patch

Patch blessed with ASL2

 ScandinavianInterintelligableASCIIFoldingFilter
 ---

 Key: LUCENE-5013
 URL: https://issues.apache.org/jira/browse/LUCENE-5013
 Project: Lucene - Core
  Issue Type: New Feature
  Components: modules/analysis
Affects Versions: 4.3
Reporter: Karl Wettin
Priority: Trivial
 Attachments: LUCENE-5013-2.txt, LUCENE-5013-3.txt, LUCENE-5013-4.txt, 
 LUCENE-5013-5.txt, LUCENE-5013-6.txt, LUCENE-5013.patch, LUCENE-5013.txt


 This filter is an augmentation of the output from ASCIIFoldingFilter;
 it discriminates against the double vowels aa, ae, ao, oe and oo, leaving just the 
 first one.
 blåbærsyltetøj == blåbärsyltetöj == blaabaarsyltetoej == blabarsyltetoj
 räksmörgås == ræksmørgås == ræksmörgaos == raeksmoergaas == raksmorgas
 Caveats:
 Since this is a filtering on top of ASCIIFoldingFilter, äöåøæ has already been 
 folded down to a, o, a, o, ae by the time this filter handles it, which will cause 
 effects such as:
 bøen - boen - bon
 åene - aene - ane
 I find this to be a trivial problem compared to not finding anything at all.
 Background:
 Swedish åäö are in fact the same letters as Norwegian and Danish åæø and thus 
 interchangeable when used between these languages. They are however folded 
 differently when people type them on a keyboard lacking these characters, and 
 ASCIIFoldingFilter handles ä and æ differently.
 When a Swedish person is lacking umlauted characters on the keyboard they 
 consistently type a, a, o instead of å, ä, ö. Foreigners also tend to use a, 
 a, o.
 In Norway people tend to type aa, ae and oe instead of å, æ and ø. Some use 
 a, a, o. I've also seen oo, ao, etc. And permutations. Not sure about Denmark, 
 but the pattern is probably the same.
 This filter solves that problem, but might also cause new ones.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5013) ScandinavianInterintelligableASCIIFoldingFilter

2013-05-27 Thread Karl Wettin (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wettin updated LUCENE-5013:


Attachment: (was: LUCENE-5013.patch)

 ScandinavianInterintelligableASCIIFoldingFilter
 ---

 Key: LUCENE-5013
 URL: https://issues.apache.org/jira/browse/LUCENE-5013
 Project: Lucene - Core
  Issue Type: New Feature
  Components: modules/analysis
Affects Versions: 4.3
Reporter: Karl Wettin
Priority: Trivial
 Attachments: LUCENE-5013-2.txt, LUCENE-5013-3.txt, LUCENE-5013-4.txt, 
 LUCENE-5013-5.txt, LUCENE-5013-6.txt, LUCENE-5013.patch, LUCENE-5013.txt


 This filter is an augmentation of the output from ASCIIFoldingFilter;
 it discriminates against the double vowels aa, ae, ao, oe and oo, leaving just the 
 first one.
 blåbærsyltetøj == blåbärsyltetöj == blaabaarsyltetoej == blabarsyltetoj
 räksmörgås == ræksmørgås == ræksmörgaos == raeksmoergaas == raksmorgas
 Caveats:
 Since this is a filtering on top of ASCIIFoldingFilter, äöåøæ has already been 
 folded down to a, o, a, o, ae by the time this filter handles it, which will cause 
 effects such as:
 bøen - boen - bon
 åene - aene - ane
 I find this to be a trivial problem compared to not finding anything at all.
 Background:
 Swedish åäö are in fact the same letters as Norwegian and Danish åæø and thus 
 interchangeable when used between these languages. They are however folded 
 differently when people type them on a keyboard lacking these characters, and 
 ASCIIFoldingFilter handles ä and æ differently.
 When a Swedish person is lacking umlauted characters on the keyboard they 
 consistently type a, a, o instead of å, ä, ö. Foreigners also tend to use a, 
 a, o.
 In Norway people tend to type aa, ae and oe instead of å, æ and ø. Some use 
 a, a, o. I've also seen oo, ao, etc. And permutations. Not sure about Denmark, 
 but the pattern is probably the same.
 This filter solves that problem, but might also cause new ones.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5013) ScandinavianInterintelligableASCIIFoldingFilter

2013-05-27 Thread Karl Wettin (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wettin updated LUCENE-5013:


Attachment: LUCENE-5013.patch

Patch blessed with ASL2.

 ScandinavianInterintelligableASCIIFoldingFilter
 ---

 Key: LUCENE-5013
 URL: https://issues.apache.org/jira/browse/LUCENE-5013
 Project: Lucene - Core
  Issue Type: New Feature
  Components: modules/analysis
Affects Versions: 4.3
Reporter: Karl Wettin
Priority: Trivial
 Attachments: LUCENE-5013-2.txt, LUCENE-5013-3.txt, LUCENE-5013-4.txt, 
 LUCENE-5013-5.txt, LUCENE-5013-6.txt, LUCENE-5013.patch, LUCENE-5013.txt


 This filter is an augmentation of the output from ASCIIFoldingFilter;
 it discriminates against the double vowels aa, ae, ao, oe and oo, leaving just the 
 first one.
 blåbærsyltetøj == blåbärsyltetöj == blaabaarsyltetoej == blabarsyltetoj
 räksmörgås == ræksmørgås == ræksmörgaos == raeksmoergaas == raksmorgas
 Caveats:
 Since this is a filtering on top of ASCIIFoldingFilter, äöåøæ has already been 
 folded down to a, o, a, o, ae by the time this filter handles it, which will cause 
 effects such as:
 bøen - boen - bon
 åene - aene - ane
 I find this to be a trivial problem compared to not finding anything at all.
 Background:
 Swedish åäö are in fact the same letters as Norwegian and Danish åæø and thus 
 interchangeable when used between these languages. They are however folded 
 differently when people type them on a keyboard lacking these characters, and 
 ASCIIFoldingFilter handles ä and æ differently.
 When a Swedish person is lacking umlauted characters on the keyboard they 
 consistently type a, a, o instead of å, ä, ö. Foreigners also tend to use a, 
 a, o.
 In Norway people tend to type aa, ae and oe instead of å, æ and ø. Some use 
 a, a, o. I've also seen oo, ao, etc. And permutations. Not sure about Denmark, 
 but the pattern is probably the same.
 This filter solves that problem, but might also cause new ones.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4470) Support for basic http auth in internal solr requests

2013-05-27 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13667754#comment-13667754
 ] 

Jan Høydahl commented on SOLR-4470:
---

I tried to move AuthCredentialsSource to test scope, but there is a 
compile-time dependency in the JettySolrRunner method lifeCycleStarted(). Can we 
refactor this piece of code into test-scope as well, e.g. by exposing a 
Filter setter on JettySolrRunner?
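
For illustration, a rough sketch of the kind of hook being proposed; the method and 
field names below are hypothetical and do not exist on the current JettySolrRunner:

{code}
import javax.servlet.Filter;

// Hypothetical sketch of a Filter setter on JettySolrRunner; it only illustrates
// the proposal and is not existing API.
public class JettySolrRunnerSketch {
  private Filter extraFilter; // would be registered during lifeCycleStarted()

  // Tests could inject e.g. a RegExpAuthorizationFilter here, keeping the
  // filter class itself inside the test framework.
  public void setExtraFilter(Filter filter) {
    this.extraFilter = filter;
  }
}
{code}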

 Support for basic http auth in internal solr requests
 -

 Key: SOLR-4470
 URL: https://issues.apache.org/jira/browse/SOLR-4470
 Project: Solr
  Issue Type: New Feature
  Components: clients - java, multicore, replication (java), SolrCloud
Affects Versions: 4.0
Reporter: Per Steffensen
Assignee: Jan Høydahl
  Labels: authentication, https, solrclient, solrcloud, ssl
 Fix For: 4.4

 Attachments: SOLR-4470_branch_4x_r1452629.patch, 
 SOLR-4470_branch_4x_r1452629.patch, SOLR-4470_branch_4x_r145.patch, 
 SOLR-4470.patch


 We want to protect any HTTP-resource (url). We want to require credentials no 
 matter what kind of HTTP-request you make to a Solr-node.
 It can fairly easily be achieved as described on 
 http://wiki.apache.org/solr/SolrSecurity. The problem is that Solr-nodes 
 also make internal requests to other Solr-nodes, and for those to work 
 credentials need to be provided as well.
 Ideally we would like to forward credentials from a particular request to 
 all the internal sub-requests it triggers, e.g. for search and update 
 requests.
 But there are also internal requests
 * that are only indirectly/asynchronously triggered from outside requests (e.g. 
 shard creation/deletion/etc based on calls to the Collection API)
 * that do not in any way have a relation to an outside super-request (e.g. 
 replica synching stuff)
 We would like to aim at a solution where the original credentials are 
 forwarded when a request directly/synchronously triggers a subrequest, with a 
 fallback to configured internal credentials for the 
 asynchronous/non-rooted requests.
 In our solution we aim at only supporting basic http auth, but we would 
 like to make a framework around it, so that not too much refactoring is 
 needed if you later want to add support for other kinds of auth (e.g. digest).
 We will work on a solution but create this JIRA issue early in order to get 
 input/comments from the community as early as possible.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-4470) Support for basic http auth in internal solr requests

2013-05-27 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13667754#comment-13667754
 ] 

Jan Høydahl edited comment on SOLR-4470 at 5/27/13 1:59 PM:


I tried to move {{RegExpAuthorizationFilter}} to test scope, but there is a 
compile-time dependency in the JettySolrRunner method lifeCycleStarted(). Can we 
refactor this piece of code into test-scope as well, e.g. by exposing a 
Filter setter on JettySolrRunner?

  was (Author: janhoy):
I tried to move AuthCredentialsSource to test scope, but there is a 
compile-time dependency in JettySolrRunner method lifeCycleStarted(). Can we 
refactor this piece of code into test-scope as well, e.g. by exposing some a 
Filter setter on JettySolrRunner?
  
 Support for basic http auth in internal solr requests
 -

 Key: SOLR-4470
 URL: https://issues.apache.org/jira/browse/SOLR-4470
 Project: Solr
  Issue Type: New Feature
  Components: clients - java, multicore, replication (java), SolrCloud
Affects Versions: 4.0
Reporter: Per Steffensen
Assignee: Jan Høydahl
  Labels: authentication, https, solrclient, solrcloud, ssl
 Fix For: 4.4

 Attachments: SOLR-4470_branch_4x_r1452629.patch, 
 SOLR-4470_branch_4x_r1452629.patch, SOLR-4470_branch_4x_r145.patch, 
 SOLR-4470.patch


 We want to protect any HTTP-resource (url). We want to require credentials no 
 matter what kind of HTTP-request you make to a Solr-node.
 It can fairly easily be achieved as described on 
 http://wiki.apache.org/solr/SolrSecurity. The problem is that Solr-nodes 
 also make internal requests to other Solr-nodes, and for those to work 
 credentials need to be provided as well.
 Ideally we would like to forward credentials from a particular request to 
 all the internal sub-requests it triggers, e.g. for search and update 
 requests.
 But there are also internal requests
 * that are only indirectly/asynchronously triggered from outside requests (e.g. 
 shard creation/deletion/etc based on calls to the Collection API)
 * that do not in any way have a relation to an outside super-request (e.g. 
 replica synching stuff)
 We would like to aim at a solution where the original credentials are 
 forwarded when a request directly/synchronously triggers a subrequest, with a 
 fallback to configured internal credentials for the 
 asynchronous/non-rooted requests.
 In our solution we aim at only supporting basic http auth, but we would 
 like to make a framework around it, so that not too much refactoring is 
 needed if you later want to add support for other kinds of auth (e.g. digest).
 We will work on a solution but create this JIRA issue early in order to get 
 input/comments from the community as early as possible.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-4862) Core admin action CREATE fails to persist some settings in solr.xml

2013-05-27 Thread JIRA
André Widhani created SOLR-4862:
---

 Summary: Core admin action CREATE fails to persist some settings 
in solr.xml
 Key: SOLR-4862
 URL: https://issues.apache.org/jira/browse/SOLR-4862
 Project: Solr
  Issue Type: Bug
  Components: multicore
Affects Versions: 4.3
Reporter: André Widhani
Priority: Minor


When I create a core with Core admin handler using these request parameters:

action=CREATE
name=core-tex69bbum21ctk1kq6lmkir-index3
schema=/etc/opt/dcx/solr/conf/schema.xml
instanceDir=/etc/opt/dcx/solr/
config=/etc/opt/dcx/solr/conf/solrconfig.xml
dataDir=/var/opt/dcx/solr/core-tex69bbum21ctk1kq6lmkir-index3

in Solr 4.1, solr.xml would have the following entry:

<core schema="/etc/opt/dcx/solr/conf/schema.xml" loadOnStartup="true" 
instanceDir="/etc/opt/dcx/solr/" transient="false" 
name="core-tex69bbum21ctk1kq6lmkir-index3" 
config="/etc/opt/dcx/solr/conf/solrconfig.xml" 
dataDir="/var/opt/dcx/solr/core-tex69bbum21ctk1kq6lmkir-index3/" 
collection="core-tex69bbum21ctk1kq6lmkir-index3"/>

while in Solr 4.3, schema, config and dataDir will be missing:

<core loadOnStartup="true" instanceDir="/etc/opt/dcx/solr/" 
transient="false" name="core-tex69bbum21ctk1kq6lmkir-index3" 
collection="core-tex69bbum21ctk1kq6lmkir-index3"/>

The new core would use the settings specified during CREATE, but after a Solr 
restart they are lost (falling back to some defaults), as they are not persisted 
in solr.xml. I should add that solr.xml has persistent="true" in the root 
element.

http://lucene.472066.n3.nabble.com/Core-admin-action-quot-CREATE-quot-fails-to-persist-some-settings-in-solr-xml-with-Solr-4-3-td4065786.html
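
For reference, a minimal sketch of issuing the CREATE call described above over HTTP; 
the host/port and the use of plain HttpURLConnection are illustrative assumptions, not 
part of the report:

{code}
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

// Sketch of the CoreAdmin CREATE request with the parameters listed above.
public class CreateCoreSketch {
  public static void main(String[] args) throws Exception {
    String base = "http://localhost:8983/solr/admin/cores"; // assumed host/port
    String query = "action=CREATE"
        + "&name=" + enc("core-tex69bbum21ctk1kq6lmkir-index3")
        + "&schema=" + enc("/etc/opt/dcx/solr/conf/schema.xml")
        + "&instanceDir=" + enc("/etc/opt/dcx/solr/")
        + "&config=" + enc("/etc/opt/dcx/solr/conf/solrconfig.xml")
        + "&dataDir=" + enc("/var/opt/dcx/solr/core-tex69bbum21ctk1kq6lmkir-index3");
    HttpURLConnection conn =
        (HttpURLConnection) new URL(base + "?" + query).openConnection();
    // A 200 response means the core was created; whether the settings end up
    // persisted in solr.xml afterwards is the issue reported here.
    System.out.println("HTTP " + conn.getResponseCode());
    conn.disconnect();
  }

  private static String enc(String s) throws Exception {
    return URLEncoder.encode(s, "UTF-8");
  }
}
{code}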

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Scoring prefix query matches by matching word position

2013-05-27 Thread Han Wang
I have found a mail sent to you long ago about scoring prefix query matches
by matching word position:
http://mail-archives.apache.org/mod_mbox/lucene-dev/200612.mbox/%3CF43898E1E0300149BEB43F3A66423D6C0C141AEE@ex8.hostedexchange.local%3E

May I know the official solution for this question?

-- 
Tom Wang
EECS of Peking University
Wang Han
Department of Computer Science, School of EECS, Peking University


[jira] [Created] (SOLR-4863) SolrDynamicMBean still uses sourceId in dynamic stats

2013-05-27 Thread Shalin Shekhar Mangar (JIRA)
Shalin Shekhar Mangar created SOLR-4863:
---

 Summary: SolrDynamicMBean still uses sourceId in dynamic stats
 Key: SOLR-4863
 URL: https://issues.apache.org/jira/browse/SOLR-4863
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
 Fix For: 4.4


As noted in solr-user:

http://www.mail-archive.com/solr-user@lucene.apache.org/msg82650.html

SOLR-3329 removed the sourceId from SolrInfoMBean but it wasn't removed from 
the dynamic stats. This leads to exceptions on access.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Assigned] (SOLR-4862) Core admin action CREATE fails to persist some settings in solr.xml

2013-05-27 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar reassigned SOLR-4862:
---

Assignee: Erick Erickson

 Core admin action CREATE fails to persist some settings in solr.xml
 -

 Key: SOLR-4862
 URL: https://issues.apache.org/jira/browse/SOLR-4862
 Project: Solr
  Issue Type: Bug
  Components: multicore
Affects Versions: 4.3
Reporter: André Widhani
Assignee: Erick Erickson
Priority: Minor

 When I create a core with the Core admin handler using these request parameters:
 action=CREATE
 name=core-tex69bbum21ctk1kq6lmkir-index3
 schema=/etc/opt/dcx/solr/conf/schema.xml
 instanceDir=/etc/opt/dcx/solr/
 config=/etc/opt/dcx/solr/conf/solrconfig.xml
 dataDir=/var/opt/dcx/solr/core-tex69bbum21ctk1kq6lmkir-index3
 in Solr 4.1, solr.xml would have the following entry:
 <core schema="/etc/opt/dcx/solr/conf/schema.xml" loadOnStartup="true" 
 instanceDir="/etc/opt/dcx/solr/" transient="false" 
 name="core-tex69bbum21ctk1kq6lmkir-index3" 
 config="/etc/opt/dcx/solr/conf/solrconfig.xml" 
 dataDir="/var/opt/dcx/solr/core-tex69bbum21ctk1kq6lmkir-index3/" 
 collection="core-tex69bbum21ctk1kq6lmkir-index3"/>
 while in Solr 4.3, schema, config and dataDir will be missing:
 <core loadOnStartup="true" instanceDir="/etc/opt/dcx/solr/" 
 transient="false" name="core-tex69bbum21ctk1kq6lmkir-index3" 
 collection="core-tex69bbum21ctk1kq6lmkir-index3"/>
 The new core would use the settings specified during CREATE, but after a Solr 
 restart they are lost (falling back to some defaults), as they are not persisted 
 in solr.xml. I should add that solr.xml has persistent="true" in the root 
 element.
 http://lucene.472066.n3.nabble.com/Core-admin-action-quot-CREATE-quot-fails-to-persist-some-settings-in-solr-xml-with-Solr-4-3-td4065786.html
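
For reference, the same CREATE can be issued from SolrJ. A minimal sketch (assuming 
the 4.x CoreAdminRequest.Create setters shown below exist in your SolrJ version; the 
paths are the ones from the report):

{code}
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.CoreAdminRequest;

public class CreateCoreSketch {
  public static void main(String[] args) throws Exception {
    // Core admin requests go against the base Solr URL, not a specific core.
    SolrServer server = new HttpSolrServer("http://localhost:8983/solr");

    CoreAdminRequest.Create create = new CoreAdminRequest.Create();
    create.setCoreName("core-tex69bbum21ctk1kq6lmkir-index3");
    create.setInstanceDir("/etc/opt/dcx/solr/");
    create.setSchemaName("/etc/opt/dcx/solr/conf/schema.xml");     // assumed setter name
    create.setConfigName("/etc/opt/dcx/solr/conf/solrconfig.xml"); // assumed setter name
    create.setDataDir("/var/opt/dcx/solr/core-tex69bbum21ctk1kq6lmkir-index3"); // assumed setter name

    // With the bug described above, schema/config/dataDir are honored now,
    // but they are not written back to solr.xml and are lost on restart.
    create.process(server);
  }
}
{code}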

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5017) SpatialOpRecursivePrefixTreeTest is failing

2013-05-27 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13667773#comment-13667773
 ] 

David Smiley commented on LUCENE-5017:
--

Thanks for bringing this to my attention, Mike.  I'll look into it.  I wish I 
could subscribe to test failures in the spatial module, and that failures which 
still reproduce for a given seed could be tracked somewhere, so that we could 
see the outstanding problems that haven't been fixed yet.

 SpatialOpRecursivePrefixTreeTest is failing
 ---

 Key: LUCENE-5017
 URL: https://issues.apache.org/jira/browse/LUCENE-5017
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/spatial
Reporter: Michael McCandless
 Fix For: 5.0, 4.4


 This has been failing lately on trunk (e.g. on rev 1486339):
 {noformat}
 ant test  -Dtestcase=SpatialOpRecursivePrefixTreeTest 
 -Dtestmethod=testContains -Dtests.seed=456022665217DADF:2C2A2816BD2BA1C5 
 -Dtests.slow=true -Dtests.locale=nl_BE -Dtests.timezone=Poland 
 -Dtests.file.encoding=ISO-8859-1
 {noformat}
 Not sure what's up ...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4858) updateLog + core reload + deleteByQuery = leaked directory

2013-05-27 Thread Anshum Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13667817#comment-13667817
 ] 

Anshum Gupta commented on SOLR-4858:


Flipping the core reload and the delete (i.e. doing the delete followed by a 
core reload) also makes it pass.

 updateLog + core reload + deleteByQuery = leaked directory
 --

 Key: SOLR-4858
 URL: https://issues.apache.org/jira/browse/SOLR-4858
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.2.1
Reporter: Hoss Man
 Attachments: SOLR-4858.patch


 I haven't been able to make sense of this yet, but trying to track down 
 another bug led me to discover that the following combination leads to 
 problems...
 * updateLog enabled
 * do a core reload
 * do a delete by query \*:\*
 ...leave out any one of the three, and everything works fine.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4655) The Overseer should assign node names by default.

2013-05-27 Thread Anshum Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13667862#comment-13667862
 ] 

Anshum Gupta commented on SOLR-4655:


I'll just start working on this. The email for this completely escaped my notice.

 The Overseer should assign node names by default.
 -

 Key: SOLR-4655
 URL: https://issues.apache.org/jira/browse/SOLR-4655
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller
 Fix For: 4.4

 Attachments: SOLR-4655.patch, SOLR-4655.patch, SOLR-4655.patch, 
 SOLR-4655.patch, SOLR-4655.patch, SOLR-4655.patch, SOLR-4655.patch


 Currently we make a unique node name by using the host address as part of the 
 name. This means that if you want a node with a new address to take over, the 
 node name is misleading. It's best if you set custom names for each node 
 before starting your cluster. This is cumbersome though, and cannot currently 
 be done with the collections API. Instead, the overseer could assign a more 
 generic name such as nodeN by default. Then you can easily swap in another 
 node with no pre planning and no confusion in the name.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-5018) Never update offsets in CompoundWordTokenFilterBase

2013-05-27 Thread Adrien Grand (JIRA)
Adrien Grand created LUCENE-5018:


 Summary: Never update offsets in CompoundWordTokenFilterBase
 Key: LUCENE-5018
 URL: https://issues.apache.org/jira/browse/LUCENE-5018
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Adrien Grand
Assignee: Adrien Grand


CompoundWordTokenFilterBase and its children DictionaryCompoundWordTokenFilter 
and HyphenationCompoundWordTokenFilter update offsets. This can make 
OffsetAttributeImpl trip an exception when chained with other filters that 
group tokens together such as ShingleFilter, see 
http://www.gossamer-threads.com/lists/lucene/java-dev/196376?page=last.
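
A minimal, self-contained sketch (my own illustration, not from the issue) of the 
kind of chain described - a decompounding filter feeding a ShingleFilter; whether 
the offset check actually trips depends on the input and the dictionary:

{code}
import java.io.StringReader;
import java.util.Arrays;

import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.compound.DictionaryCompoundWordTokenFilter;
import org.apache.lucene.analysis.core.WhitespaceTokenizer;
import org.apache.lucene.analysis.shingle.ShingleFilter;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.tokenattributes.OffsetAttribute;
import org.apache.lucene.analysis.util.CharArraySet;
import org.apache.lucene.util.Version;

public class CompoundShingleSketch {
  public static void main(String[] args) throws Exception {
    CharArraySet dict = new CharArraySet(Version.LUCENE_43,
        Arrays.asList("rind", "fleisch", "ueberwachung"), true);

    TokenStream ts = new WhitespaceTokenizer(Version.LUCENE_43,
        new StringReader("rindfleischueberwachung gesetz"));
    // The decompounder emits sub-tokens; before this fix it also rewrote their offsets.
    ts = new DictionaryCompoundWordTokenFilter(Version.LUCENE_43, ts, dict);
    // ShingleFilter groups adjacent tokens and combines their offsets,
    // which is where inconsistent offsets can trigger the exception.
    ts = new ShingleFilter(ts);

    CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
    OffsetAttribute offsets = ts.addAttribute(OffsetAttribute.class);
    ts.reset();
    while (ts.incrementToken()) {
      System.out.println(term.toString() + " [" + offsets.startOffset()
          + "," + offsets.endOffset() + "]");
    }
    ts.end();
    ts.close();
  }
}
{code}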

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [JENKINS] Lucene-Solr-4.x-Linux (64bit/jdk1.8.0-ea-b89) - Build # 5762 - Failure!

2013-05-27 Thread Adrien Grand
The culprit is HyphenationCompoundWordTokenFilter, I opened
https://issues.apache.org/jira/browse/LUCENE-5018.

--
Adrien

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-4228) SolrPing - add methods for enable/disable

2013-05-27 Thread Shawn Heisey (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Heisey updated SOLR-4228:
---

Attachment: SOLR-4228.patch

New patch with some cleanups and a CHANGES.txt entry that lists it as a new 
feature in version 4.4.  Like the previous patch versions, it doesn't change 
default behavior, it just adds new capability.  On trunk, precommit passes and 
the Solr tests are underway.  If that works out OK, I will commit soon.  Before 
committing to 4x, I will also give it a try in my dev environment.

 SolrPing - add methods for enable/disable
 -

 Key: SOLR-4228
 URL: https://issues.apache.org/jira/browse/SOLR-4228
 Project: Solr
  Issue Type: Improvement
  Components: clients - java
Affects Versions: 4.0
Reporter: Shawn Heisey
 Fix For: 4.4

 Attachments: SOLR-4228.patch, SOLR-4228.patch, SOLR-4228.patch, 
 SOLR-4228.patch, SOLR-4228.patch, SOLR-4228.patch, SOLR-4228.patch, 
 SOLR-4228.patch


 The new PingRequestHandler in Solr 4.0 takes over what actions.jsp used to do 
 in older versions.  Create methods in the SolrPing request object to access 
 this capability.
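
As a rough illustration of how the new capability might be used from client code - 
the action-setter name below is a hypothetical placeholder; the actual method names 
are defined by the attached patch:

{code}
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.SolrPing;
import org.apache.solr.client.solrj.response.SolrPingResponse;

public class PingActionSketch {
  public static void main(String[] args) throws Exception {
    SolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");

    SolrPing ping = new SolrPing();
    // Hypothetical method name: sets action=enable on the ping request so that
    // PingRequestHandler creates its health-check file (what actions.jsp used to do).
    ping.setActionEnable();
    SolrPingResponse rsp = ping.process(server);
    System.out.println("enable status: " + rsp.getStatus());
  }
}
{code}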

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4655) The Overseer should assign node names by default.

2013-05-27 Thread Anshum Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13667889#comment-13667889
 ] 

Anshum Gupta commented on SOLR-4655:


I integrated the above patch and have the following tests failing on the trunk. 
[~markrmil...@gmail.com] Can you confirm that all of these tests fail for you 
as well?

[junit4:junit4]   - org.apache.solr.cloud.ShardSplitTest.testDistribSearch
[junit4:junit4]   - 
org.apache.solr.cloud.ChaosMonkeyShardSplitTest.testDistribSearch
[junit4:junit4]   - 
org.apache.solr.cloud.ClusterStateUpdateTest.testCoreRegistration
[junit4:junit4]   - 
org.apache.solr.cloud.BasicDistributedZkTest.testDistribSearch
[junit4:junit4]   - org.apache.solr.cloud.BasicDistributedZkTest (suite)

 The Overseer should assign node names by default.
 -

 Key: SOLR-4655
 URL: https://issues.apache.org/jira/browse/SOLR-4655
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller
 Fix For: 4.4

 Attachments: SOLR-4655.patch, SOLR-4655.patch, SOLR-4655.patch, 
 SOLR-4655.patch, SOLR-4655.patch, SOLR-4655.patch, SOLR-4655.patch


 Currently we make a unique node name by using the host address as part of the 
 name. This means that if you want a node with a new address to take over, the 
 node name is misleading. It's best if you set custom names for each node 
 before starting your cluster. This is cumbersome though, and cannot currently 
 be done with the collections API. Instead, the overseer could assign a more 
 generic name such as nodeN by default. Then you can easily swap in another 
 node with no pre planning and no confusion in the name.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5018) Never update offsets in CompoundWordTokenFilterBase

2013-05-27 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-5018:
-

Attachment: LUCENE-5018.patch

Here is a patch.

 Never update offsets in CompoundWordTokenFilterBase
 ---

 Key: LUCENE-5018
 URL: https://issues.apache.org/jira/browse/LUCENE-5018
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Adrien Grand
Assignee: Adrien Grand
 Attachments: LUCENE-5018.patch


 CompoundWordTokenFilterBase and its children 
 DictionaryCompoundWordTokenFilter and HyphenationCompoundWordTokenFilter 
 update offsets. This can make OffsetAttributeImpl trip an exception when 
 chained with other filters that group tokens together such as ShingleFilter, 
 see http://www.gossamer-threads.com/lists/lucene/java-dev/196376?page=last.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5014) ANTLR Lucene query parser

2013-05-27 Thread Roman Chyla (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13667908#comment-13667908
 ] 

Roman Chyla commented on LUCENE-5014:
-

Hi David,
In practical terms ANTLR can do exactly the same things as PEG (i.e. lookahead, 
backtracking, memoization) - see 
http://stackoverflow.com/questions/8816759/ll-versus-peg-parsers-what-is-the-difference

But it is also capable of doing more than PEG (e.g. better error recovery - a 
PEG parser needs to parse the whole tree before it discovers an error, and then 
the error recovery is not the same thing).

PEGs can be easier *especially* because of the first-choice operator; in fact 
at times I wished that ANTLR just chose the first available option (well, it 
does, but it reports an error and I didn't want to have a grammar with errors). 
So, in the CFG/ANTLR world, ambiguity is solved using syntactic predicates 
(lookahead) -- so far this has been theoretical, here are a few more points:

Clarity
===

I looked at the presentation, and the parser does handle operator precedence, 
but there it is spread across several screens of Java code; I find the 
following much more readable:

{code}
mainQ : 
  clauseOr+ EOF
  ;
  
clauseOr
  : clauseAnd (or clauseAnd )*
  ;

clauseAnd
  : clauseNot  (and clauseNot)*
  ; 
{code}
  
It is essentially the same thing, but it is independent of the Java code and I 
can see it in a few lines - and extend it by adding a few more lines. The patch 
I wrote makes the handling of the separate grammar and the generated code 
seamless. So 2 of the 3 advantages of PEG over ANTLR disappear.


Syntax vs semantics (business logic)


The example from the presentation needs to be much more involved if it is to be 
used in real life. Consider this query:

{noformat}
dog NEAR cat
{noformat}

This is going to work only in the simplest case, where each term is a single 
TermQuery. But if there were a synonym expansion (where it would go inside the 
PEG parser is one question), the parser would need to *rewrite* the query into 
something like:
{noformat}
(dog|canin) NEAR cat --> (dog NEAR cat) OR (canin NEAR cat)
{noformat}

So, there you get the 'spaghetti problem' - in the example presented, the logic 
that rewrites the query must reside in the same place as the query parsing. 
That is not an improvement IMO; it is the same thing as the old Lucene parsers 
written in JavaCC, which are very difficult to extend or debug.

I think I'll add a new grammar with the proximity operators so that you can see 
how easy it is to solve the same situation with ANTLR (but you will need to 
read the patch this time ;)). By the way, the patch is big because I included 
the HTML with SVG charts of the generated parse trees and one Excel file (that 
one helps in writing unit tests for the grammar).

Developer vs user experience


I think PEG definitely looks simpler (in the presented example) and its main 
advantage is the first-choice operator. But since ANTLR can do the same, and it 
has a programming-language-independent grammar, it can do the same job. The 
difference may be in the maturity of the projects, the tools available (i.e. 
debuggers) - and of course the implementation (see the link above for details).

I can imagine that for PEG you can use your IDE of choice, while with ANTLR 
there is this 'pesky' level of abstraction - but there are tools that make life 
bearable, such as ANTLRWorks or the Eclipse ANTLR debugger (though I have not 
liked that one); there are grammar unit tests, and I added ways to debug/view 
the grammar. Again, I recommend trying it, e.g. 

{code}
ant -f aqp-build.xml gunit
# edit StandardLuceneGrammar and save as 'mytestgrammar'
ant -f aqp-build.xml try-view -Dquery="foo NEAR bar" -Dgrammar=mytestgrammar
{code}


There may of course be more things to consider, but I believe the 3 issues 
above present some interesting vantage points.

 ANTLR Lucene query parser
 -

 Key: LUCENE-5014
 URL: https://issues.apache.org/jira/browse/LUCENE-5014
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/queryparser, modules/queryparser
Affects Versions: 4.3
 Environment: all
Reporter: Roman Chyla
  Labels: antlr, query, queryparser
 Attachments: LUCENE-5014.txt, LUCENE-5014.txt


 I would like to propose a new way of building query parsers for Lucene.  
 Currently, most Lucene parsers are hard to extend because they are either 
 written in Java (ie. the SOLR query parser, or edismax) or the parsing logic 
 is 'married' with the query building logic (i.e. the standard lucene parser, 
 generated by JavaCC) - which makes any extension really hard.
 Few years back, Lucene got the contrib/modern query parser (later renamed to 
 'flexible'), yet that parser didn't become a star (it 

[jira] [Comment Edited] (LUCENE-5014) ANTLR Lucene query parser

2013-05-27 Thread Roman Chyla (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13667908#comment-13667908
 ] 

Roman Chyla edited comment on LUCENE-5014 at 5/27/13 7:04 PM:
--

Hi David,
In practical terms ANTLR can do exactly the same things as PEG (i.e. lookahead, 
backtracking, memoization) - see 
http://stackoverflow.com/questions/8816759/ll-versus-peg-parsers-what-is-the-difference

But it is also capable of doing more than PEG (e.g. better error recovery - a 
PEG parser needs to parse the whole tree before it discovers an error, and then 
the error recovery is not the same thing).

PEGs can be easier *especially* because of the first-choice operator; in fact 
at times I wished that ANTLR just chose the first available option (well, it 
does, but it reports an error and I didn't want to have a grammar with errors). 
So, in the CFG/ANTLR world, ambiguity is solved using syntactic predicates 
(lookahead) -- so far this has been theoretical, here are a few more points:

Grammar vs code
===

I looked at the presentation, and the parser does handle operator precedence, 
but there it is spread across several screens of Java code; I find the 
following much more readable:

{code}
mainQ : 
  clauseOr+ EOF
  ;
  
clauseOr
  : clauseAnd (or clauseAnd )*
  ;

clauseAnd
  : clauseNot  (and clauseNot)*
  ; 
{code}
  
It is essentially the same thing, but it is independent of the Java code and I 
can see it in a few lines - and extend it by adding a few more lines. The patch 
I wrote makes the handling of the separate grammar and the generated code 
seamless. So 2 of the 3 advantages of PEG over ANTLR disappear.


Syntax vs semantics (business logic)


The example from the presentation needs to be much more involved if it is to be 
used in real life. Consider this query:

{noformat}
dog NEAR cat
{noformat}

This is going to work only in the simplest case, where each term is a single 
TermQuery. But if there were a synonym expansion (where it would go inside the 
PEG parser is one question), the parser would need to *rewrite* the query into 
something like:

{noformat}
(dog|canin) NEAR cat --> (dog NEAR cat) OR (canin NEAR cat)
{noformat}

So, there you get the 'spaghetti problem' - in the example presented, the logic 
that rewrites the query must reside in the same place as the query parsing. 
That is not an improvement IMO; it is the same thing as the old Lucene parsers 
written in JavaCC, which are very difficult to extend or debug.

I think I'll add a new grammar with the proximity operators so that you can see 
how easy it is to solve the same situation with ANTLR (but you will need to 
read the patch this time ;)). By the way, the patch is big because I included 
the HTML with SVG charts of the generated parse trees and one Excel file (that 
one helps in writing unit tests for the grammar).


Developer vs user experience


I think PEG definitely looks simpler to developers (in the presented example) 
and its main advantage is the first-choice operator. But since ANTLR can do the 
same, and it has a programming-language-independent grammar, it can do the same 
job. The difference may be in the maturity of the projects, the tools available 
(i.e. debuggers) - and of course the implementation (see the link above for 
details).

I can imagine that for PEG you can use your IDE of choice, while with ANTLR 
there is this 'pesky' level of abstraction - but there are tools that make life 
bearable, such as ANTLRWorks or the Eclipse ANTLR debugger (though I have not 
liked that one); there are grammar unit tests, and I added ways to debug/view 
the grammar. If you apply the patch, you can try:

{code}
ant -f aqp-build.xml gunit
# edit StandardLuceneGrammar and save as 'mytestgrammar'
ant -f aqp-build.xml try-view -Dquery="foo NEAR bar" -Dgrammar=mytestgrammar
{code}


There may of course be more things to consider, but I believe the 3 issues 
above present some interesting vantage points.

  was (Author: rchyla):
Hi David,
In practical terms ANTLR can do exactly the same thing as PEG (ie lookahead, 
backtracking,memoization) - see this 
http://stackoverflow.com/questions/8816759/ll-versus-peg-parsers-what-is-the-difference

But it is also capable of doing more things than PEG (ie. better error recovery 
- PEG parser needs to parse the whole tree before it discovers an error; then 
the error recovery is not the same thing)

PEG's can be easier *especially* because of the first-choice operator; in fact 
at times I wished that ANTLR just chose the first available option (well, it 
does, but it reports and error and I didn't want to have grammar with errors). 
So, in CFGANTLR world, ambiguity is solved using syntactic predicated 
(lookahead) -- so far, this has been a theoretical, here are few more points:

Clarity
===

I looked at the presentation and the parser contains 

[jira] [Commented] (SOLR-4862) Core admin action CREATE fails to persist some settings in solr.xml

2013-05-27 Thread Li Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13667923#comment-13667923
 ] 

Li Xu commented on SOLR-4862:
-

In the example given above, you don't really need to specify the config and 
schema parameters. By default, Solr looks in instanceDir/conf for them. 
However, if you name your XML files differently from the defaults, then this 
bug will cause you problems.

 Core admin action CREATE fails to persist some settings in solr.xml
 -

 Key: SOLR-4862
 URL: https://issues.apache.org/jira/browse/SOLR-4862
 Project: Solr
  Issue Type: Bug
  Components: multicore
Affects Versions: 4.3
Reporter: André Widhani
Assignee: Erick Erickson
Priority: Minor

 When I create a core with Core admin handler using these request parameters:
 action=CREATE
 name=core-tex69bbum21ctk1kq6lmkir-index3
 schema=/etc/opt/dcx/solr/conf/schema.xml
 instanceDir=/etc/opt/dcx/solr/
 config=/etc/opt/dcx/solr/conf/solrconfig.xml
 dataDir=/var/opt/dcx/solr/core-tex69bbum21ctk1kq6lmkir-index3
 in Solr 4.1, solr.xml would have the following entry:
 <core schema="/etc/opt/dcx/solr/conf/schema.xml" loadOnStartup="true" 
 instanceDir="/etc/opt/dcx/solr/" transient="false" 
 name="core-tex69bbum21ctk1kq6lmkir-index3" 
 config="/etc/opt/dcx/solr/conf/solrconfig.xml" 
 dataDir="/var/opt/dcx/solr/core-tex69bbum21ctk1kq6lmkir-index3/" 
 collection="core-tex69bbum21ctk1kq6lmkir-index3"/>
 while in Solr 4.3 schema, config and dataDir will be missing:
 <core loadOnStartup="true" instanceDir="/etc/opt/dcx/solr/" 
 transient="false" name="core-tex69bbum21ctk1kq6lmkir-index3" 
 collection="core-tex69bbum21ctk1kq6lmkir-index3"/>
 The new core would use the settings specified during CREATE, but after a Solr 
 restart they are lost (fall back to some defaults), as they are not persisted 
 in solr.xml. I should add that solr.xml has persistent="true" in the root 
 element.
 http://lucene.472066.n3.nabble.com/Core-admin-action-quot-CREATE-quot-fails-to-persist-some-settings-in-solr-xml-with-Solr-4-3-td4065786.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-5019) SimpleFragmentScorer can create very long fragments

2013-05-27 Thread Alexandre Patry (JIRA)
Alexandre Patry created LUCENE-5019:
---

 Summary: SimpleFragmentScorer can create very long fragments
 Key: LUCENE-5019
 URL: https://issues.apache.org/jira/browse/LUCENE-5019
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/highlighter
Affects Versions: 4.3
Reporter: Alexandre Patry
Priority: Minor


In SimpleFragmentScorer, when a query term is followed by a stop word, the 
fragment will run until the end of the document.

When a query term is encountered (line 80), SimpleFragmentScorer waits for the 
token following it before allowing the fragment to end (lines 68 to 72). When a 
stop word follows the query word (or any token with a position increment 
greater than 1), its position is skipped and the token SimpleFragmentScorer is 
waiting for never arrives.

The attached patch fixes that by waiting for the first token following the 
query word instead of the token at the position after the query term.
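
A small self-contained sketch (my own, assuming the 4.3 highlighter APIs and a 
made-up field name and text) that sets up the situation described - a query term 
immediately followed by a stop word that the analyzer removes:

{code}
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.highlight.Highlighter;
import org.apache.lucene.search.highlight.QueryScorer;
import org.apache.lucene.search.highlight.SimpleSpanFragmenter;
import org.apache.lucene.util.Version;

public class LongFragmentSketch {
  public static void main(String[] args) throws Exception {
    // StandardAnalyzer removes stop words such as "and"/"the", creating a
    // position gap right after the query term "cat".
    Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_43);
    String text = "cat and the dog ran far away from the noisy old house by the river";

    QueryScorer scorer = new QueryScorer(new TermQuery(new Term("body", "cat")));
    Highlighter highlighter = new Highlighter(scorer);
    // Ask for very short fragments; with the bug, the fragment containing
    // "cat" can still run to the end of the text.
    highlighter.setTextFragmenter(new SimpleSpanFragmenter(scorer, 10));

    System.out.println(highlighter.getBestFragment(analyzer, "body", text));
  }
}
{code}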

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5019) SimpleFragmentScorer can create very long fragments

2013-05-27 Thread Alexandre Patry (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexandre Patry updated LUCENE-5019:


Attachment: simple-span-fragmenter.patch

A patch to fix SimpleFragmentScorer.

 SimpleFragmentScorer can create very long fragments
 ---

 Key: LUCENE-5019
 URL: https://issues.apache.org/jira/browse/LUCENE-5019
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/highlighter
Affects Versions: 4.3
Reporter: Alexandre Patry
Priority: Minor
 Attachments: simple-span-fragmenter.patch


 In SimpleFragmentScorer, when a query term is followed by a stop word, the 
 fragment will run until the end of the document.
 When a query term is encountered (line 80), SimpleFragmentScorer waits for 
 the token following it before allowing the fragment to end (lines 68 to 72). 
 When a stop word follows the query word (or any token with a position 
 increment greater than 1), its position is skipped and the token 
 SimpleFragmentScorer is waiting for never arrives.
 The attached patch fixes that by waiting for the first token following the 
 query word instead of the token at the position after the query term.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (LUCENE-5019) SimpleFragmentScorer can create very long fragments

2013-05-27 Thread Alexandre Patry (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13667932#comment-13667932
 ] 

Alexandre Patry edited comment on LUCENE-5019 at 5/27/13 7:59 PM:
--

A patch to fix SimpleSpanFragmenter.

  was (Author: apatry):
A patch to fix SimpleFragmentScorer.
  
 SimpleFragmentScorer can create very long fragments
 ---

 Key: LUCENE-5019
 URL: https://issues.apache.org/jira/browse/LUCENE-5019
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/highlighter
Affects Versions: 4.3
Reporter: Alexandre Patry
Priority: Minor
 Attachments: simple-span-fragmenter.patch


 In SimpleFragmentScorer, when a query term is followed by a stop word, the 
 fragment will run until the end of the document.
 When a query term is encountered (line 80), SimpleFragmentScorer waits for 
 the token following it before allowing the fragment to end (lines 68 to 72). 
 When a stop word follows the query word (or any token with a position 
 increment greater than 1), its position is skipped and the token 
 SimpleFragmentScorer is waiting for never arrives.
 The attached patch fixes that by waiting for the first token following the 
 query word instead of the token at the position after the query term.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5019) SimpleSpanFragmenter can create very long fragments

2013-05-27 Thread Alexandre Patry (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexandre Patry updated LUCENE-5019:


Description: 
In SimpleSpanFragmenter, when a query term is followed by a stop word, the 
fragment will run until the end of the document.

When a query term is encountered (line 80), SimpleSpanFragmenter waits for the 
token following it before allowing the fragment to end (lines 68 to 72). When a 
stop word follows the query word (or any token with a position increment 
greater than 1), its position is skipped and the token SimpleSpanFragmenter is 
waiting for never arrives.

The attached patch fixes that by waiting for the first token following the 
query word instead of the token at the position after the query term.

  was:
In SimpleFragmentScorer, when a query term is followed by a stop word, the 
fragment will run until the end of the document.

When a query term is encountered (line 80), SimpleFragmentScorer waits for the 
token following it before allowing the fragment to end (lines 68 to 72). When a 
stop word follows the query word (or any token with a position increment 
greater than 1), its position is skipped and the token SimpleFragmentScorer is 
waiting for never arrives.

The attached patch fixes that by waiting for the first token following the 
query word instead of the token at the position after the query term.

Summary: SimpleSpanFragmenter can create very long fragments  (was: 
SimpleFragmentScorer can create very long fragments)

 SimpleSpanFragmenter can create very long fragments
 ---

 Key: LUCENE-5019
 URL: https://issues.apache.org/jira/browse/LUCENE-5019
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/highlighter
Affects Versions: 4.3
Reporter: Alexandre Patry
Priority: Minor
 Attachments: simple-span-fragmenter.patch


 In SimpleSpanFragmenter, when a query term is followed by a stop word, the 
 fragment will run until the end of the document.
 When a query term is encountered (line 80), SimpleSpanFragmenter waits for 
 the token following it before allowing the fragment to end (lines 68 to 72). 
 When a stop word follows the query word (or any token with a position 
 increment greater than 1), its position is skipped and the token 
 SimpleSpanFragmenter is waiting for never arrives.
 The attached patch fixes that by waiting for the first token following the 
 query word instead of the token at the position after the query term.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3781) Admin UI does not work when wiring Solr into a larger web application

2013-05-27 Thread Michael Chabot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13667943#comment-13667943
 ] 

Michael Chabot commented on SOLR-3781:
--

FWIW, I was able to resolve this by adding a variable in LoadAdminUiServlet 
that manually holds the value of whatever's configured as 'path-prefix' in 
web.xml. See attached:
# web.xml
# LoadAdminUiServlet.java 
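
A rough sketch of the kind of change described, with hypothetical parameter and 
class names (see the attached files for the actual changes):

{code}
// Sketch only - not the actual attached patch.
import java.io.IOException;
import java.io.InputStream;
import javax.servlet.ServletConfig;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class PrefixAwareAdminUiServlet extends HttpServlet {
  // Hypothetical name; holds whatever 'path-prefix' is configured in web.xml.
  private String pathPrefix = "";

  @Override
  public void init(ServletConfig config) throws ServletException {
    super.init(config);
    String prefix = config.getInitParameter("path-prefix");
    if (prefix != null) {
      pathPrefix = prefix;
    }
  }

  @Override
  protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
    // Instead of the hard-coded "/admin.html", resolve it under the prefix.
    InputStream in = getServletContext().getResourceAsStream(pathPrefix + "/admin.html");
    if (in == null) {
      resp.sendError(HttpServletResponse.SC_NOT_FOUND, "admin.html not found");
      return;
    }
    // ... stream the file to the response as the real servlet does ...
    in.close();
  }
}
{code}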

 Admin UI does not work when wiring Solr into a larger web application
 -

 Key: SOLR-3781
 URL: https://issues.apache.org/jira/browse/SOLR-3781
 Project: Solr
  Issue Type: Bug
  Components: web gui
Affects Versions: 4.0-BETA
 Environment: win7 jetty-distribution-7.6.5.v20120716
 startup param:
 -Djetty.port=8084 -DzkRun -Dbootstrap_conf=true
Reporter: shenjc
Priority: Minor
  Labels: patch
 Fix For: 5.0, 4.4

 Attachments: LoadAdminUiServlet.patch, 
 LoadAdminUiServlet_take2.patch, web.xml

   Original Estimate: 24h
  Remaining Estimate: 24h

 If you are wiring Solr into a larger web application which controls the web 
 context root, you will probably want to mount Solr under a path prefix 
 (app.war with /app/solr mounted into it, for example).
  For example:
 RootApp.war /
 myApp.war---/myApp
 prefixPath---xxx
 jsdir--js
 js file---main.js
 admin file---admin.html
 org.apache.solr.servlet.LoadAdminUiServlet
 line 49:  InputStream in = 
 getServletContext().getResourceAsStream("/admin.html");
 It can't find admin.html because it's in the prefixPath directory.
 org.apache.solr.cloud.ZkController
 lines 149-150:
 this.nodeName = this.hostName + ':' + this.localHostPort + '_' + 
 this.localHostContext;
 this.baseURL = this.localHost + ":" + this.localHostPort + "/" + 
 this.localHostContext;
 It can't match this condition:
 baseURL needs to be http://xx:xx/myApp/myPrefixPath, 
 e.g. http://xx:xx/myApp/xxx

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-3781) Admin UI does not work when wiring Solr into a larger web application

2013-05-27 Thread Michael Chabot (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Chabot updated SOLR-3781:
-

Attachment: web.xml.chabot
LoadAdminUiServlet.java.chabot

 Admin UI does not work when wiring Solr into a larger web application
 -

 Key: SOLR-3781
 URL: https://issues.apache.org/jira/browse/SOLR-3781
 Project: Solr
  Issue Type: Bug
  Components: web gui
Affects Versions: 4.0-BETA
 Environment: win7 jetty-distribution-7.6.5.v20120716
 startup param:
 -Djetty.port=8084 -DzkRun -Dbootstrap_conf=true
Reporter: shenjc
Priority: Minor
  Labels: patch
 Fix For: 5.0, 4.4

 Attachments: LoadAdminUiServlet.java.chabot, 
 LoadAdminUiServlet.patch, LoadAdminUiServlet_take2.patch, web.xml, 
 web.xml.chabot

   Original Estimate: 24h
  Remaining Estimate: 24h

 If you are wiring Solr into a larger web application which controls the web 
 context root, you will probably want to mount Solr under a path prefix 
 (app.war with /app/solr mounted into it, for example).
  For example:
 RootApp.war /
 myApp.war---/myApp
 prefixPath---xxx
 jsdir--js
 js file---main.js
 admin file---admin.html
 org.apache.solr.servlet.LoadAdminUiServlet
 line 49:  InputStream in = 
 getServletContext().getResourceAsStream("/admin.html");
 It can't find admin.html because it's in the prefixPath directory.
 org.apache.solr.cloud.ZkController
 lines 149-150:
 this.nodeName = this.hostName + ':' + this.localHostPort + '_' + 
 this.localHostContext;
 this.baseURL = this.localHost + ":" + this.localHostPort + "/" + 
 this.localHostContext;
 It can't match this condition:
 baseURL needs to be http://xx:xx/myApp/myPrefixPath, 
 e.g. http://xx:xx/myApp/xxx

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4655) The Overseer should assign node names by default.

2013-05-27 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13667974#comment-13667974
 ] 

Mark Miller commented on SOLR-4655:
---

The last patch for me only had the shard split tests failing - ill try and 
update to trunk tomorrow. 

 The Overseer should assign node names by default.
 -

 Key: SOLR-4655
 URL: https://issues.apache.org/jira/browse/SOLR-4655
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller
 Fix For: 4.4

 Attachments: SOLR-4655.patch, SOLR-4655.patch, SOLR-4655.patch, 
 SOLR-4655.patch, SOLR-4655.patch, SOLR-4655.patch, SOLR-4655.patch


 Currently we make a unique node name by using the host address as part of the 
 name. This means that if you want a node with a new address to take over, the 
 node name is misleading. It's best if you set custom names for each node 
 before starting your cluster. This is cumbersome though, and cannot currently 
 be done with the collections API. Instead, the overseer could assign a more 
 generic name such as nodeN by default. Then you can easily swap in another 
 node with no pre planning and no confusion in the name.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #863: POMs out of sync

2013-05-27 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/863/

1 tests failed.
REGRESSION:  org.apache.solr.cloud.SyncSliceTest.testDistribSearch

Error Message:
shard1 is not consistent.  Got 305 from 
http://127.0.0.1:13855/ho_u/collection1lastClient and got 5 from 
http://127.0.0.1:52784/ho_u/collection1

Stack Trace:
java.lang.AssertionError: shard1 is not consistent.  Got 305 from 
http://127.0.0.1:13855/ho_u/collection1lastClient and got 5 from 
http://127.0.0.1:52784/ho_u/collection1
at 
__randomizedtesting.SeedInfo.seed([FFB2A02FFD7E0450:7E542E378A21646C]:0)
at org.junit.Assert.fail(Assert.java:93)
at 
org.apache.solr.cloud.AbstractFullDistribZkTestBase.checkShardConsistency(AbstractFullDistribZkTestBase.java:963)
at org.apache.solr.cloud.SyncSliceTest.doTest(SyncSliceTest.java:238)




Build Log:
[...truncated 24240 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.7.0) - Build # 513 - Failure!

2013-05-27 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/513/
Java: 64bit/jdk1.7.0 -XX:+UseCompressedOops -XX:+UseSerialGC

1 tests failed.
REGRESSION:  org.apache.solr.client.solrj.TestBatchUpdate.testWithBinary

Error Message:
IOException occured when talking to server at: 
https://127.0.0.1:51424/solr/collection1

Stack Trace:
org.apache.solr.client.solrj.SolrServerException: IOException occured when 
talking to server at: https://127.0.0.1:51424/solr/collection1
at 
__randomizedtesting.SeedInfo.seed([E3812F49C4F5D7AD:B1F76156431A0DBE]:0)
at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:435)
at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
at 
org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
at org.apache.solr.client.solrj.SolrServer.commit(SolrServer.java:168)
at org.apache.solr.client.solrj.SolrServer.commit(SolrServer.java:146)
at 
org.apache.solr.client.solrj.TestBatchUpdate.doIt(TestBatchUpdate.java:130)
at 
org.apache.solr.client.solrj.TestBatchUpdate.testWithBinary(TestBatchUpdate.java:62)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 

[jira] [Created] (SOLR-4864) RegexReplaceProcessorFactory should support pattern capture group substitution in replacement string

2013-05-27 Thread Jack Krupansky (JIRA)
Jack Krupansky created SOLR-4864:


 Summary: RegexReplaceProcessorFactory should support pattern 
capture group substitution in replacement string
 Key: SOLR-4864
 URL: https://issues.apache.org/jira/browse/SOLR-4864
 Project: Solr
  Issue Type: Improvement
  Components: update
Affects Versions: 4.3
Reporter: Jack Krupansky


It is unfortunate that the replacement string for RegexReplaceProcessorFactory 
is a pure, quoted (escaped) literal and does not support pattern capture 
group substitution. This processor should be enhanced to support full, standard 
pattern capture group substitution.

The test case I used:

{code}
  <updateRequestProcessorChain name="regex-mark-special-words">
    <processor class="solr.RegexReplaceProcessorFactory">
      <str name="fieldRegex">.*</str>
      <str name="pattern">([^a-zA-Z]|^)(cat|dog|fox)([^a-zA-Z]|$)</str>
      <str name="replacement">$1&lt;&lt;$2&gt;&gt;$3</str>
    </processor>
    <processor class="solr.LogUpdateProcessorFactory" />
    <processor class="solr.RunUpdateProcessorFactory" />
  </updateRequestProcessorChain>
{code}

Indexing with this command against the standard Solr example with the above 
addition to solrconfig:

{code}
  curl "http://localhost:8983/solr/update?commit=true&update.chain=regex-mark-special-words" \
  -H 'Content-type:application/json' -d '
  [{"id": "doc-1",
    "title": "Hello World",
    "content": "The cat and the dog jumped over the fox.",
    "other_ss": ["cat", "cat bird", "lazy dog", "red fox den"]}]'
{code}

Alas, the resulting document consists of:

{code}
  id:doc-1,
  title:[Hello World],
  content:[The$1$2$3and the$1$2$3jumped over the$1$2$3],
  other_ss:[$1$2$3,
$1$2$3bird,
lazy$1$2$3,
red$1$2$3den],
{code}

The Javadoc for RegexReplaceProcessorFactory uses the exact same terminology of 
"replacement string" as does Java's Matcher.replaceAll, but clearly the 
semantics are distinct: replaceAll supports pattern capture group 
substitution in its replacement string, while RegexReplaceProcessorFactory 
interprets the replacement string as a literal. At a minimum, the 
RegexReplaceProcessorFactory Javadoc should explicitly state that the string is 
a literal that does not support pattern capture group substitution.

The relevant code in RegexReplaceProcessorFactory#init:

{code}
replacement = Matcher.quoteReplacement(replacementParam.toString());
{code}
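
The difference boils down to plain java.util.regex behavior; a minimal standalone 
illustration using the pattern from above:

{code}
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuoteReplacementDemo {
  public static void main(String[] args) {
    Pattern p = Pattern.compile("([^a-zA-Z]|^)(cat|dog|fox)([^a-zA-Z]|$)");
    String input = "The cat and the dog jumped over the fox.";
    String replacement = "$1<<$2>>$3";

    // Capture groups substituted: "cat" becomes "<<cat>>", etc.
    System.out.println(p.matcher(input).replaceAll(replacement));

    // quoteReplacement() escapes the '$' signs, so the replacement is taken
    // literally - this is what RegexReplaceProcessorFactory currently does.
    System.out.println(p.matcher(input).replaceAll(Matcher.quoteReplacement(replacement)));
  }
}
{code}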

Possible options for the enhancement:

1. Simply skip the quoteReplacement and fully support pattern capture group 
substitution with no additional changes. This does have a minor back-compat 
issue.

2. Add an alternative to "replacement", say "nonQuotedReplacement", that is not 
quoted the way "replacement" is.

3. Add an option, say "quotedReplacement", that defaults to true for 
back-compat, but can be set to false to support full replaceAll pattern 
capture group substitution.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-5020) Make DrillSidewaysResult ctor public

2013-05-27 Thread Shai Erera (JIRA)
Shai Erera created LUCENE-5020:
--

 Summary: Make DrillSidewaysResult ctor public
 Key: LUCENE-5020
 URL: https://issues.apache.org/jira/browse/LUCENE-5020
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Shai Erera
Assignee: Shai Erera
Priority: Minor


DrillSidewaysResult has a package-private ctor which prevents an app from 
instantiating it. I found that it's sometimes useful, e.g. for doing some 
post-processing on the returned TopDocs or List<FacetResult>. Since you cannot 
return two values from a method, it would be convenient if a method could 
return a new 'processed' DSR.

I would also like to make the hits member final.
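
As a sketch of what this would enable in application code (the field names, 
constructor signature and package are assumptions on my part; adjust to the 
actual class):

{code}
// Sketch only: assumes DrillSidewaysResult exposes public 'facetResults' and 'hits'
// fields and gains a public (List<FacetResult>, TopDocs) constructor.
import java.util.List;
import org.apache.lucene.facet.search.DrillSideways.DrillSidewaysResult;
import org.apache.lucene.facet.search.FacetResult;
import org.apache.lucene.search.TopDocs;

public class PostProcessSketch {
  static DrillSidewaysResult rerank(DrillSidewaysResult original, TopDocs rerankedHits) {
    // Keep the facet counts, swap in post-processed hits, and hand back a new DSR.
    List<FacetResult> facets = original.facetResults;
    return new DrillSidewaysResult(facets, rerankedHits);
  }
}
{code}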

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5020) Make DrillSidewaysResult ctor public

2013-05-27 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-5020:
---

Attachment: LUCENE-5020.patch

Trivial patch. I also clarified the jdocs of DSR and made hits final. I plan to 
commit this later today.

 Make DrillSidewaysResult ctor public
 

 Key: LUCENE-5020
 URL: https://issues.apache.org/jira/browse/LUCENE-5020
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Shai Erera
Assignee: Shai Erera
Priority: Minor
 Attachments: LUCENE-5020.patch


 DrillSidewaysResult has a package-private ctor which prevents an app from 
 instantiating it. I found that it's sometimes useful, e.g. for doing some 
 post-processing on the returned TopDocs or List<FacetResult>. Since you 
 cannot return two values from a method, it would be convenient if a method 
 could return a new 'processed' DSR.
 I would also like to make the hits member final.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4470) Support for basic http auth in internal solr requests

2013-05-27 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13668111#comment-13668111
 ] 

Per Steffensen commented on SOLR-4470:
--

bq. Regarding Mark Miller's concern with authorization creep, I to some 
extent agree. But since, as you say, this is test-code only, let's move the 
class RegExpAuthorizationFilter from runtime codebase and into the test 
framework. In that way, it is clear that it is only used for realistic test 
coverage. And if anyone wishes to setup a similar setup in their production 
they may borrow code from the test class, but it will be a manual step 
reinforcing that this is not a supported feature of the project as such.

RegExpAuthorizationFilter could be in test-code because it is only used for 
tests. As I described above (somewhere), it just does something I believe a 
lot of people might want to do - simply because Solr URLs are kinda 
upside-down. Therefore I put it in the non-test part of the code, so that 
people could use it if they found it useful, and described it a little on 
http://wiki.apache.org/solr/SolrSecurity#Security_in_Solr_on_per-operation_basis.

If you do not get the point that this is something you can use if you decide to 
use it, and that it is really not a Solr-thing, then I agree it could be 
considered creeping into dealing with security enforcement in Solr. You are 
welcome to move it to test-code, but then we should change the descriptions on 
http://wiki.apache.org/solr/SolrSecurity#Security_in_Solr_on_per-operation_basis.
 Either remove the descriptions of RegExpAuthorizationFilter, or include the 
code from it directly on the Wiki page for inspiration, or add a pointer to the 
test-code and note that you can steal it from there.

bq. I tried to move RegExpAuthorizationFilter to test scope, but there is a 
compile-time dependency in JettySolrRunner method lifeCycleStarted(). Can we 
refactor this piece of code into test-scope as well, e.g. by exposing some a 
Filter setter on JettySolrRunner?

Isn't JettySolrRunner only used for tests? Why is it not in test-scope itself? 
Maybe in the test-framework? We can make a funny refactoring and expose a 
filter-setter, but it seems like a strange thing to do to let JettySolrRunner, 
which is only used for tests, be able to use some test-stuff. What did I miss?

 Support for basic http auth in internal solr requests
 -

 Key: SOLR-4470
 URL: https://issues.apache.org/jira/browse/SOLR-4470
 Project: Solr
  Issue Type: New Feature
  Components: clients - java, multicore, replication (java), SolrCloud
Affects Versions: 4.0
Reporter: Per Steffensen
Assignee: Jan Høydahl
  Labels: authentication, https, solrclient, solrcloud, ssl
 Fix For: 4.4

 Attachments: SOLR-4470_branch_4x_r1452629.patch, 
 SOLR-4470_branch_4x_r1452629.patch, SOLR-4470_branch_4x_r145.patch, 
 SOLR-4470.patch


 We want to protect any HTTP-resource (url). We want to require credentials no 
 matter what kind of HTTP-request you make to a Solr-node.
 It can fairly easily be achieved as described on 
  http://wiki.apache.org/solr/SolrSecurity. The problem is that Solr nodes 
  also make internal requests to other Solr nodes, and for it to work, 
  credentials need to be provided there also.
 Ideally we would like to forward the credentials from a particular request to 
  all the internal sub-requests it triggers, e.g. for search and update 
  requests.
 But there are also internal requests
 * that are only indirectly/asynchronously triggered by outside requests (e.g. 
  shard creation/deletion/etc. based on calls to the Collection API)
 * that do not in any way relate to an outside super-request (e.g. 
  replica syncing stuff)
 We would like to aim at a solution where the original credentials are 
  forwarded when a request directly/synchronously triggers a sub-request, and 
  which falls back to configured internal credentials for the 
  asynchronous/non-rooted requests.
 In our solution we would aim at only supporting basic HTTP auth, but we would 
  like to make a framework around it, so that not too much refactoring is 
  needed if you later want to add support for other kinds of auth (e.g. digest).
 We will work on a solution, but we are creating this JIRA issue early in order 
  to get input/comments from the community as early as possible.
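
For the plain client-to-Solr half of this (not the internal node-to-node requests 
this issue is about), SolrJ can already be handed a pre-configured HttpClient 
carrying basic-auth credentials; a minimal sketch, assuming the HttpClientUtil 
basic-auth properties available in 4.x and made-up credentials:

{code}
import org.apache.http.client.HttpClient;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpClientUtil;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.params.ModifiableSolrParams;

public class BasicAuthClientSketch {
  public static void main(String[] args) throws Exception {
    ModifiableSolrParams params = new ModifiableSolrParams();
    params.set(HttpClientUtil.PROP_BASIC_AUTH_USER, "solr-user");  // made-up credentials
    params.set(HttpClientUtil.PROP_BASIC_AUTH_PASS, "solr-pass");

    // Build an HttpClient that sends basic auth, and hand it to the SolrJ client.
    HttpClient httpClient = HttpClientUtil.createClient(params);
    SolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1", httpClient);

    System.out.println(server.ping().getStatus());
  }
}
{code}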

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org