[jira] [Commented] (SOLR-13452) Update the lucene-solr build from Ivy+Ant+Maven (shadow build) to Gradle.

2019-08-16 Thread Steve Davids (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909125#comment-16909125
 ] 

Steve Davids commented on SOLR-13452:
-

I just noticed that this was being worked on via the mailing list and am glad 
that the build system is being modernized. I have been working with Gradle for 
quite some time now and have found moving to Gradle's Kotlin script very nice 
for code completion + static analysis (IntelliJ support is fantastic). I wanted 
to provide an example of what a build.gradle vs build.gradle.kts file would 
look like here:

[https://github.com/apache/lucene-solr/compare/jira/SOLR-13452_gradle_5...sdavids13:jira/SOLR-13452_gradle_5]

If you would like to move towards adopting Kotlin script I can help out (I have 
migrated all of my work builds over to kts so have some experience doing so). 
The nice thing is that you can migrate one file at a time, as both the older 
style `build.gradle` and newer style `build.gradle.kts` can co-exist in the 
same repository while migrations take place.
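For readers unfamiliar with the difference, here is a minimal sketch of the same dependency declaration in both DSLs (the coordinates and version below are illustrative only, not taken from the linked branch):

```kotlin
// build.gradle (Groovy DSL) -- stringly-typed, limited IDE assistance:
//
//     dependencies {
//         api 'org.apache.lucene:lucene-core:8.0.0'
//     }
//
// build.gradle.kts (Kotlin DSL) -- statically typed, so IntelliJ can offer
// completion and flag mistakes before the build even runs:
dependencies {
    api("org.apache.lucene:lucene-core:8.0.0")
}
```

Because the Kotlin DSL is real, typed Kotlin, renames and typos in configuration become compile-time errors instead of runtime build failures.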

> Update the lucene-solr build from Ivy+Ant+Maven (shadow build) to Gradle.
> -
>
> Key: SOLR-13452
> URL: https://issues.apache.org/jira/browse/SOLR-13452
> Project: Solr
>  Issue Type: Improvement
>  Components: Build
>Reporter: Mark Miller
>Assignee: Mark Miller
>Priority: Major
> Fix For: master (9.0)
>
> Attachments: gradle-build.pdf
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> I took some things from the great work that Dat did in 
> [https://github.com/apache/lucene-solr/tree/jira/gradle] and took the ball a 
> little further.
>  
> When working with gradle in sub modules directly, I recommend 
> [https://github.com/dougborg/gdub]
> This gradle branch uses the following plugin for version locking, version 
> configuration and version consistency across modules: 
> [https://github.com/palantir/gradle-consistent-versions]
>  
> https://github.com/apache/lucene-solr/tree/jira/SOLR-13452_gradle_5



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Inquiry for Solr Cloud Authentication

2017-01-24 Thread Steve Davids
Don't assume "basic auth" without SSL provides any inherent security: if
someone has access to the network it is trivial to sniff network traffic
and pick up the username/password (as noted in the caveats section).
There is a little more overhead in setting up SSL connections, but as long
as keep-alives are used it's not too big of a penalty. Another option could
be using two-way SSL for authentication purposes.
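To make the sniffing point concrete: a Basic auth header is just base64 encoding, not encryption, so anyone who can read the plaintext HTTP traffic recovers the credentials with a single decode call. A small illustration (using the well-known example credentials solr/SolrRocks from the Solr docs, not real ones):

```kotlin
import java.util.Base64

fun main() {
    // What the client sends in the Authorization header over plain HTTP:
    val header = "Basic " + Base64.getEncoder()
        .encodeToString("solr:SolrRocks".toByteArray())
    println(header) // Basic c29scjpTb2xyUm9ja3M=

    // What a network sniffer does with it -- one decode, no "cracking":
    val decoded = String(Base64.getDecoder()
        .decode(header.removePrefix("Basic ")))
    println(decoded) // solr:SolrRocks
}
```

TLS doesn't make Basic auth stronger by itself; it just keeps the header out of a sniffer's reach.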

-Steve

On Fri, Jan 20, 2017 at 1:00 AM, Byunghoon Lim  wrote:

> Hi Ishan! Thanks for the advice :) I will try it.
>
> Best,
> Hoon
>
> Regards,
> Hoon
>
> On Fri, Jan 20, 2017 at 11:55 AM, Ishan Chattopadhyaya <
> ichattopadhy...@gmail.com> wrote:
>
>> Basic auth should have the best performance (almost negligible difference
>> between unsecured and secured). Also, you could try delegation tokens
>> support.
>>
>> On Fri, Jan 20, 2017 at 7:41 AM, Byunghoon Lim 
>> wrote:
>>
>>> Hi! I am Hoon, and I am running a Solr cluster in production.
>>>
>>> I am considering adding authentication for Solr Cloud using something
>>> like the Basic Authentication plugin. My concern is: if I bring the
>>> authentication plugin to the cluster, how much will latency increase or
>>> be affected?
>>>
>>> Are there any results or data on the performance?
>>> Or, please let me know the best authentication plugin which doesn't
>>> affect latency.
>>>
>>> Thanks in advance.
>>>
>>> Best,
>>> Hoon
>>>
>>
>>
>


[jira] [Commented] (SOLR-4907) Discuss and create instructions for taking Solr from the example to robust multi-server production

2016-09-13 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15489470#comment-15489470
 ] 

Steve Davids commented on SOLR-4907:


Thanks, I just updated the repo to move solr.in.sh to /etc/default/. In an 
ideal world the current Lucene/Solr build system would be modernized a bit 
(LUCENE-5755) and would then allow you to build the RPM + DEB packages along 
with the ZIP and TAR files which would all be uploaded to the mirrors with a 
standard release. The nice thing with using a native package installer is that 
clients can easily uninstall the package if they don't want it and during 
upgrades old items are cleaned up and removed appropriately since all of the 
files are being tracked. I personally think it's just one less barrier to entry 
and much more natural than: `wget 
http://apache.claz.org/lucene/solr/6.2.0/solr-6.2.0.tgz && tar xzf 
solr-6.2.0.tgz solr-6.2.0/bin/install_solr_service.sh --strip-components=2`.

> Discuss and create instructions for taking Solr from the example to robust 
> multi-server production
> --
>
> Key: SOLR-4907
> URL: https://issues.apache.org/jira/browse/SOLR-4907
> Project: Solr
>  Issue Type: Improvement
>Reporter: Shawn Heisey
> Attachments: SOLR-4907-install.sh
>
>
> There are no good step-by-step instructions for taking the Solr example and 
> producing a robust production setup on multiple servers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: lucene-solr:master: added a couple of extra methods

2016-05-04 Thread Steve Davids
Looks like both Kotlin and C# also use first() and second() for their Pair 
classes:

https://kotlinlang.org/api/latest/jvm/stdlib/kotlin/-pair/#properties
https://msdn.microsoft.com/en-us/library/system.web.ui.pair(v=vs.110).aspx#Anchor_4

Though, out of curiosity why not just use the Pair class in Apache 
Commons-Lang? 
http://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/tuple/Pair.html

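For reference, a quick illustration of the Kotlin Pair API mentioned above:

```kotlin
fun main() {
    // Kotlin's stdlib Pair exposes its components as first/second
    val p = Pair("key", 42)
    println(p.first)   // key
    println(p.second)  // 42

    // Destructuring is also supported
    val (k, v) = p
    println("$k=$v")   // key=42
}
```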
-Steve

> On May 4, 2016, at 4:08 PM, David Smiley  wrote:
> 
> LOL  we need javadocs on more methods (and classes) indeed.
> 
> On Wed, May 4, 2016 at 2:29 PM Chris Hostetter wrote:
> 
> Or maybe methodWithFuckingJavadocsExplainingItsExistence() and
> otherMethodWIthJavadocsSoUsersDontHaveToGuessIfThereIsADiffBetweenGetKeyAnd_1()
> 
> how do those method names sound?
> 
> 
> : Date: Wed, 4 May 2016 14:26:41 -0400
> : From: Scott Blum >
> : Reply-To: dev@lucene.apache.org 
> : To: dev@lucene.apache.org 
> : Subject: Re: lucene-solr:master: added a couple of extra methods
> :
> : Or left() and right()
> :
> : On Wed, May 4, 2016 at 2:18 PM, Ishan Chattopadhyaya <
> : ichattopadhy...@gmail.com > wrote:
> :
> : > Another option to consider could be: first() and second()
> : >
> : > C++ uses it: http://www.cplusplus.com/reference/utility/pair/ 
> 
> : >
> : > On Wed, May 4, 2016 at 11:44 PM, Noble Paul  > wrote:
> : >
> : >> The names getKey() and getValue() are not always relevant for a pair
> : >> object. it's not necessarily a key and value. In that case, it makes 
> sense
> : >> to use the index .
> : >>
> : >>
> : >> This is a convention followed by Scala: Tuple2
> : >> (http://www.scala-lang.org/api/rc2/scala/Tuple2.html) to Tuple10
> : >> (http://www.scala-lang.org/api/rc2/scala/Tuple10.html).
> : >>
> : >> On Wed, May 4, 2016 at 4:32 AM, Chris Hostetter wrote:
> : >>
> : >>>
> : >>> WTF is this?
> : >>>
> : >>> why are these (poorly named) alternatives for getKey and getValue 
> useful?
> : >>>
> : >>>
> : >>> : Date: Tue,  3 May 2016 15:09:08 + (UTC)
> : >>> : From: no...@apache.org 
> : >>> : Reply-To: dev@lucene.apache.org 
> : >>> : To: comm...@lucene.apache.org 
> : >>> : Subject: lucene-solr:master: added a couple of extra methods
> : >>> :
> : >>> : Repository: lucene-solr
> : >>> : Updated Branches:
> : >>> :   refs/heads/master 0ebe6b0f7 -> 184da9982
> : >>> :
> : >>> :
> : >>> : added a couple of extra methods
> : >>> :
> : >>> :
> : >>> : Project: http://git-wip-us.apache.org/repos/asf/lucene-solr/repo 
> 
> : >>> : Commit:
> : >>> http://git-wip-us.apache.org/repos/asf/lucene-solr/commit/184da998 
> 
> : >>> : Tree: 
> http://git-wip-us.apache.org/repos/asf/lucene-solr/tree/184da998 
> 
> : >>> : Diff: 
> http://git-wip-us.apache.org/repos/asf/lucene-solr/diff/184da998 
> 
> : >>> :
> : >>> : Branch: refs/heads/master
> : >>> : Commit: 184da9982c55fac4735abf01607e4f8f70eb5749
> : >>> : Parents: 0ebe6b0
> : >>> : Author: Noble Paul  >
> : >>> : Authored: Tue May 3 20:34:36 2016 +0530
> : >>> : Committer: Noble Paul  >
> : >>> : Committed: Tue May 3 20:34:36 2016 +0530
> : >>> :
> : >>> : --
> : >>> :  solr/solrj/src/java/org/apache/solr/common/util/Pair.java | 8 
> 
> : >>> :  1 file changed, 8 insertions(+)
> : >>> : --
> : >>> :
> : >>> :
> : >>> :
> : >>> 
> http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/184da998/solr/solrj/src/java/org/apache/solr/common/util/Pair.java
>  
> 
> : >>> : --
> : >>> : diff --git 

Re: Splitting Solr artifacts so the main download is smaller

2016-04-04 Thread Steve Davids
>
> A tangent to think about later: RPM and DEB packaging.  That's a lot to
> discuss, so I won't go into it here.


Even though you didn't want to get into it here, I did create a Solr
RPM/DEB builder here: https://github.com/sdavids13/solr-os-packager

Sure would be pretty sweet to get an official RPM distribution; I think
that would make a lot of admins' lives easier (primarily for upgrades).

-Steve


On Mon, Apr 4, 2016 at 6:56 PM, Shawn Heisey  wrote:

> On 4/4/2016 12:57 PM, Jan Høydahl wrote:
> > A difference from ES is that they have a working plugin ecosystem, so
> > you can tell users to run "bin/plugin install kuromoji” or whatever.
> > Could we not continue working on SOLR-5103, and the size issue will
> > solve itself in a much more elegant way...
>
> Sure.  I love the idea of a plugin system that can reach out and install
> functionality from the Internet.  Would that need something new from
> Infra?  That's something we can hammer out on the Jira issue.
>
> I think step one for SOLR-5103 is to split the artifacts in a manner
> similar to what I outlined in SOLR-6806.  Then the other artifacts can
> be further diced up into small pieces that can be handled by a plugin
> system.  We don't necessarily need to do these separately, though --
> SOLR-5103 could absorb and replace SOLR-6806.
>
> A tangent to think about later: RPM and DEB packaging.  That's a lot to
> discuss, so I won't go into it here.
>
> Thanks,
> Shawn
>
>
>
>


[jira] [Commented] (SOLR-6741) IPv6 Field Type

2016-03-08 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15186415#comment-15186415
 ] 

Steve Davids commented on SOLR-6741:


Any updates on this ticket regarding rolling with IPv4 support with IPv6 being 
added later?

> IPv6 Field Type
> ---
>
> Key: SOLR-6741
> URL: https://issues.apache.org/jira/browse/SOLR-6741
> Project: Solr
>  Issue Type: Improvement
>Reporter: Lloyd Ramey
> Attachments: SOLR-6741.patch
>
>
> It would be nice if Solr had a field type which could be used to index IPv6 
> data and supported efficient range queries. 






Re: Moving Lucene / Solr from SVN to Git

2016-01-09 Thread Steve Davids
@Prasanna you can follow along with the SVN -> Git history migration in 
https://issues.apache.org/jira/browse/LUCENE-6933

@Erick for some basics you can check out these interactive git guides: 
https://www.codecademy.com/learn/learn-git or http://gitreal.codeschool.com/, 
though as Mark said, if you find a decent UI you rarely need to use the 
command line. I've been fond of IntelliJ's git support, but I have found 
Eclipse's (egit) to be absolutely terrible.

-Steve

> On Jan 9, 2016, at 8:02 PM, Prasanna Dangalla  
> wrote:
> 
> 
> 
> On Sunday, 10 January 2016, Prasanna Dangalla  > wrote:
> Hi All,
> I'm a new member to this project. I was reading the mails previously. Thiught 
> leeHere if we migrate
> 
> mails previously. Thought of giving an input. Here if we migrate
> Sorry for the typo... 
> 
> from svn its better to migrate the history as well. I meant the commit 
> history. How do we migrate the SVN commit log from svn to git ? 
> 
> On Sunday, 10 January 2016, Mark Miller > wrote:
> I think we will update much of the doc as we go, but I'm sure there are 
> plenty of people that can help on the list with any questions. We can 
> probably get some basics up relatively painlessly. I'd guess the number of 
> committers that have not worked with Git yet is very small.
> 
> As a start, my recommendation would be to Google Git for SVN users and look 
> at some of those resources though. It's probably better than what we will 
> subset.
> 
> Personally, I like to just use SmartGit and mostly ignore command line Git :)
> 
> How have you been able to ignore GitHub for so long :)
> 
> Mark
> On Sat, Jan 9, 2016 at 6:13 PM Erick Erickson > 
> wrote:
> I'm a little confused. A while ago I asked about whether I had to
> learn all about Git, and as I remember the reply was "this is just
> about the build process". Perhaps I mis-interpreted or that was
> referring only to the bits Dawid was working on at that instant or
> 
> Anyway, assuming the SVN repo becomes read-only, that implies that all
> our commits need to happen in Git, right? There are still some "git
> challenged" curmudgeons out there (like me) who really haven't much of
> a clue. I'll figure it out, mind you but it'd be nice if there were a
> clear signal that "Now you have to figure it out because you can't
> commit to the SVN repo any more".
> 
> And the "how to contribute" page is all about SVN:
> https://wiki.apache.org/solr/HowToContribute, and if my understanding is
> at all close, that page needs some significant editing.
> 
> Personally, before I screw up my first commit under Git, it would be
> super helpful if there were a step-by-step. No doubt that really
> amounts to three commands or something, but before "just trying stuff"
> it would be nice to have the steps for committing (pushing?) to trunk
> and then getting those changes into 5x (well, maybe 6.0 by then)
> outlined...
> 
> Or I'm off in the weeds here, always a possibility.
> 
> FWIW,
> Erick
> 
> 
> 
> On Sat, Jan 9, 2016 at 2:52 PM, Uwe Schindler > wrote:
> > Hi Mark,
> >
> >
> >
> > thanks for starting this! Looking forward to the whole process. When Infra
> > is about to “activate” the new GIT repo, I will take care of Policeman
> > Jenkins and fix the remaining validation tasks. I don’t want to do this. I
> > think your commit is fine.
> >
> >
> >
> > We now need some workflows how to merge between master/trunk and the release
> > branches. Projects do this in different ways (cherry-picking,…). I have no
> > preference or idea, sorry! I only know how to merge feature branches into
> > master :)
> >
> >
> >
> > You mentioned that we should make the old svn read only. Maybe do it similar
> > like we did during LuSolr merge: Add a final commit removing everything from
> > trunk/branch_5x and leaving a readme.txt file in trunk and branch_5x
> > pointing to Git. All other branches stay alive. After that we could make it
> > read only – but it is not really needed. What do others think?
> >
> >
> >
> > Uwe
> >
> >
> >
> > -
> >
> > Uwe Schindler
> >
> > H.-H.-Meier-Allee 63, D-28213 Bremen
> >
> > http://www.thetaphi.de 
> >
> > eMail: u...@thetaphi.de
> >
> >
> >
> > From: Mark Miller [mailto:markrmil...@gmail.com]
> > Sent: Saturday, January 09, 2016 10:55 PM
> > To: java-dev >
> > Subject: Moving Lucene / Solr from SVN to Git
> >
> >
> >
> > We have done almost all of the work necessary for a move and I have filed an
> > issue with INFRA.
> >
> >
> >
> > LUCENE-6937: Migrate Lucene project from SVN to Git.
> >
> > https://issues.apache.org/jira/browse/LUCENE-6937 

[jira] [Commented] (SOLR-7887) Upgrade Solr to use log4j2 -- log4j 1 now officially end of life

2015-11-19 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15015143#comment-15015143
 ] 

Steve Davids commented on SOLR-7887:


I believe the best argument for logback is that it is a native implementation 
for SLF4j since it is developed by the same group. Though, from both a 
configuration and performance perspective the two are very similar. It does 
seem the logging .properties files have been frowned upon with the preferred 
configuration method being the xml configuration (log4j2 xml is pretty similar 
to the logback xml configuration).

This is a pretty useful tool to convert the existing log4j.properties files 
over to a logback.xml configuration: http://logback.qos.ch/translator/

So, the 
[log4j.properties|https://github.com/apache/lucene-solr/blob/639710b2958ed958f977c64a5fe3bbd5b0b0aa23/solr/server/resources/log4j.properties]
 translates to:
{code}
(logback.xml translation stripped from the mail archive)
{code}

> Upgrade Solr to use log4j2 -- log4j 1 now officially end of life
> 
>
> Key: SOLR-7887
> URL: https://issues.apache.org/jira/browse/SOLR-7887
> Project: Solr
>  Issue Type: Task
>Affects Versions: 5.2.1
>Reporter: Shawn Heisey
>
> The logging services project has officially announced the EOL of log4j 1:
> https://blogs.apache.org/foundation/entry/apache_logging_services_project_announces
> In the official binary jetty deployment, we use log4j 1.2 as our final 
> logging destination, so the admin UI has a log watcher that actually uses 
> log4j and java.util.logging classes.  That will need to be extended to add 
> log4j2.  I think that might be the largest pain point to this upgrade.
> There is some crossover between log4j2 and slf4j.  Figuring out exactly which 
> jars need to be in the lib/ext directory will take some research.






[jira] [Commented] (SOLR-7887) Upgrade Solr to use log4j2 -- log4j 1 now officially end of life

2015-11-19 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15015198#comment-15015198
 ] 

Steve Davids commented on SOLR-7887:


Looks like there is also a converter for log4j 1.x properties -> log4j 1.x xml 
here: http://log4j-props2xml.appspot.com/

Then perform the xml migration as defined here: 
https://logging.apache.org/log4j/2.x/manual/migration.html

> Upgrade Solr to use log4j2 -- log4j 1 now officially end of life
> 
>
> Key: SOLR-7887
> URL: https://issues.apache.org/jira/browse/SOLR-7887
> Project: Solr
>  Issue Type: Task
>Affects Versions: 5.2.1
>Reporter: Shawn Heisey
>
> The logging services project has officially announced the EOL of log4j 1:
> https://blogs.apache.org/foundation/entry/apache_logging_services_project_announces
> In the official binary jetty deployment, we use log4j 1.2 as our final 
> logging destination, so the admin UI has a log watcher that actually uses 
> log4j and java.util.logging classes.  That will need to be extended to add 
> log4j2.  I think that might be the largest pain point to this upgrade.
> There is some crossover between log4j2 and slf4j.  Figuring out exactly which 
> jars need to be in the lib/ext directory will take some research.






[jira] [Created] (SOLR-8313) Migrate to new slf4j logging implementation (log4j 1.x is EOL)

2015-11-18 Thread Steve Davids (JIRA)
Steve Davids created SOLR-8313:
--

 Summary: Migrate to new slf4j logging implementation (log4j 1.x is 
EOL)
 Key: SOLR-8313
 URL: https://issues.apache.org/jira/browse/SOLR-8313
 Project: Solr
  Issue Type: Improvement
  Components: Server
Reporter: Steve Davids


Log4j 1.x was declared dead (EOL) in August 2015: 
https://blogs.apache.org/foundation/entry/apache_logging_services_project_announces

Solr should migrate to a new slf4j logging implementation, the popular choices 
these days seem to be either log4j2 or logback.






[jira] [Commented] (SOLR-5748) Introduce autoManageReplicas collection property

2015-09-30 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14939199#comment-14939199
 ] 

Steve Davids commented on SOLR-5748:


Due to the continued comments about an impending "ZK as truth" switch for the 
5.x branch, I went ahead and attempted to start using the Collection API and 
found it to be a bit more burdensome than the classic mode. This particular 
ticket would go a long way to making the adding/removing replica process easy. 
I documented some of the annoyances in this thread: 
http://markmail.org/message/qungxgiab6njslpu

As for the previous comment: 
bq.  It would be good to have some kind of control over where the additional 
replicas will be so that the installation could decide that based on the disk 
space, memory availability etc.

That is now taken care of via SOLR-6220.

> Introduce autoManageReplicas collection property
> 
>
> Key: SOLR-5748
> URL: https://issues.apache.org/jira/browse/SOLR-5748
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Shalin Shekhar Mangar
> Fix For: Trunk
>
>
> I propose to introduce a collection property called autoManageReplicas. This 
> will be used only with the ZK as truth mode.
> If set to true, then whenever the number of replicas for a shard fall below 
> the replicationFactor and the down nodes do not come back up for a 
> configurable amount of time, additional replicas will be started up 
> automatically. Similarly, if the actual number of replicas is equal to or 
> greater than replicationFactor then if old (previously down) nodes come back 
> up then they will not be allowed to join the shard and will be unloaded 
> instead.
> I think we should not unload running shards if number of replicas are more 
> for now. We can change that later if needed.






[jira] [Commented] (SOLR-7577) Add support for rules in CREATESHARD and ADDREPLICA

2015-09-30 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14936869#comment-14936869
 ] 

Steve Davids commented on SOLR-7577:


After looking at the test for this issue: why would anyone need to specify a 
shardName when a rule is available? Doesn't that defeat the entire purpose of 
being smart with the replica placement via rules?

> Add support for rules in CREATESHARD and ADDREPLICA
> ---
>
> Key: SOLR-7577
> URL: https://issues.apache.org/jira/browse/SOLR-7577
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Noble Paul
>Assignee: Noble Paul
> Fix For: 5.2, Trunk
>
> Attachments: SOLR-7577.patch
>
>







[jira] [Commented] (SOLR-7613) solrcore.properties file should be loaded if it resides in ZooKeeper

2015-09-08 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14735665#comment-14735665
 ] 

Steve Davids commented on SOLR-7613:


I went ahead and swapped our {{solrcore.properties}} over to 
{{configoverlay.json}} and it worked like a champ. Using the API we had the 
chicken-and-egg problem where the core wouldn't come up unless we had some 
properties specified, but we couldn't specify the properties without having 
the core up and running. Thanks for the suggestion [~noble.paul], I think this 
ticket is safe to be withdrawn.
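For anyone following along, the switch amounts to pushing the former solrcore.properties entries through the Config API as user-defined properties, which Solr persists in configoverlay.json. A hedged sketch of the request involved (the property name, host, and collection below are illustrative only, not the ones from our deployment):

```kotlin
// Build the Config API request body for a user-defined property; Solr
// persists the resulting setting in configoverlay.json.
fun setUserPropertyBody(name: String, value: String): String =
    """{"set-user-property": {"$name": "$value"}}"""

fun main() {
    val body = setUserPropertyBody("my.custom.timeout", "5000")
    println(body) // {"set-user-property": {"my.custom.timeout": "5000"}}

    // POST this body, with Content-Type: application/json, to the
    // collection's config endpoint, e.g.:
    //   http://localhost:8983/solr/mycollection/config
}
```

The property can then be referenced from solrconfig.xml with the usual ${my.custom.timeout} substitution syntax.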

> solrcore.properties file should be loaded if it resides in ZooKeeper
> 
>
> Key: SOLR-7613
> URL: https://issues.apache.org/jira/browse/SOLR-7613
> Project: Solr
>  Issue Type: Bug
>        Reporter: Steve Davids
> Fix For: Trunk, 5.4
>
>
> The solrcore.properties file is used to load user defined properties for use 
> primarily in the solrconfig.xml file, though this properties file will only 
> load if it is resident in the core/conf directory on the physical disk, it 
> will not load if it is in ZK's core/conf directory. There should be a 
> mechanism to allow a core properties file to be specified in ZK and can be 
> updated appropriately along with being able to reload the properties when the 
> file changes (or via a core reload).






[jira] [Created] (SOLR-7831) Allow a configurable stack size (-Xss)

2015-07-24 Thread Steve Davids (JIRA)
Steve Davids created SOLR-7831:
--

 Summary: Allow a configurable stack size (-Xss)
 Key: SOLR-7831
 URL: https://issues.apache.org/jira/browse/SOLR-7831
 Project: Solr
  Issue Type: Improvement
Reporter: Steve Davids
 Fix For: 5.3, Trunk


The Java stack size should be a configuration option in the solr.in.sh and 
solr.in.cmd instead of being set specifically within the startup script.






[jira] [Updated] (SOLR-7831) Allow a configurable stack size (-Xss)

2015-07-24 Thread Steve Davids (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Davids updated SOLR-7831:
---
Attachment: SOLR-7831.patch

Added patch that preserves the previously set -Xss256k value in the appropriate 
solr.in.sh and solr.in.cmd files.

 Allow a configurable stack size (-Xss)
 --

 Key: SOLR-7831
 URL: https://issues.apache.org/jira/browse/SOLR-7831
 Project: Solr
  Issue Type: Improvement
Reporter: Steve Davids
  Labels: easyfix, patch
 Fix For: 5.3, Trunk

 Attachments: SOLR-7831.patch


 The Java stack size should be a configuration option in the solr.in.sh and 
 solr.in.cmd instead of being set specifically within the startup script.






[jira] [Commented] (SOLR-4907) Discuss and create instructions for taking Solr from the example to robust multi-server production

2015-07-08 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14619695#comment-14619695
 ] 

Steve Davids commented on SOLR-4907:


I created a Solr RPM (yum) and DEB (apt-get) package builder here: 
https://github.com/sdavids13/solr-os-packager. It would be great if those 
packages can be built and pushed out with new Solr releases to make life a bit 
easier for clients to install and update to newer versions of Solr. The real 
meat is happening in the Gradle build file, which uses Netflix's 
gradle-ospackage plugin: 
https://github.com/sdavids13/solr-os-packager/blob/master/build.gradle.

 Discuss and create instructions for taking Solr from the example to robust 
 multi-server production
 --

 Key: SOLR-4907
 URL: https://issues.apache.org/jira/browse/SOLR-4907
 Project: Solr
  Issue Type: Improvement
Reporter: Shawn Heisey
 Attachments: SOLR-4907-install.sh


 There are no good step-by-step instructions for taking the Solr example and 
 producing a robust production setup on multiple servers.






Solr RPM DEB packages

2015-07-05 Thread Steve Davids
Solr has taken great strides with making it simpler for clients to install
and start using Solr as a service; the install_solr_service.sh file is
quite handy for a brand new install. Unfortunately, the use of that script
is really a one-time deal and won't let you use it for Solr upgrades, nor
does it provide any cleanup mechanisms. In steps the RPM and DEB package
managers to make it even simpler for users to install/upgrade/remove
packages from their system. I went ahead and created an RPM & DEB builder
that mimics what the install_solr_service.sh does to generate the packages
(the package builder can be run on any machine with Java installed); you can
check it out here: https://github.com/sdavids13/solr-os-packager. The
build.gradle file
(https://github.com/sdavids13/solr-os-packager/blob/master/build.gradle)
is the one orchestrating the package building (using Netflix's
gradle-ospackage-plugin); for instructions to build and test, check out the
README (https://github.com/sdavids13/solr-os-packager/blob/master/README.md).

Does anyone have any thoughts on the possibility of distributing said RPM
and DEB packages for new Solr code drops?
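As a rough idea of what the plugin usage looks like, here is a hypothetical fragment in the style of the repository's build file. The plugin id, version, and every path below are assumptions for illustration; see the linked build.gradle for the real configuration:

```kotlin
// build.gradle.kts -- hypothetical sketch only; plugin id/version and DSL
// details vary by plugin release, so treat every line as an assumption.
plugins {
    id("nebula.ospackage") version "8.4.1"
}

ospackage {
    packageName = "solr"
    version = "6.2.0"

    // Stage the files the installer script would otherwise copy by hand
    from("build/dist/solr") {
        into("/opt/solr")
    }
}

// `./gradlew buildRpm buildDeb` would then emit installable .rpm/.deb
// artifacts that yum/apt can cleanly upgrade or remove later.
```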

-Steve


Re: [jira] [Commented] (SOLR-7613) solrcore.properties file should be loaded if it resides in ZooKeeper

2015-06-05 Thread Steve Davids

 This is the right way to do properties in solrcloud


 https://cwiki.apache.org/confluence/display/solr/Config+API#ConfigAPI-CommandsforUser-DefinedProperties


In my particular case the core won't load without some of the properties
being specified. Is there a way to get those properties into ZK before you
even create the new collection? It looks like you are adding properties to
an already existing collection...

-Steve

On Fri, Jun 5, 2015 at 12:09 AM, Noble Paul noble.p...@gmail.com wrote:

 Replying here coz jira is down

 Let's get rid of solrcore.properties in cloud . We don't need it. It
 is not just reading that thing. We need to manage the lifecycle as
 well (editing, refreshing etc)


 This is the right way to do properties in solrcloud


 https://cwiki.apache.org/confluence/display/solr/Config+API#ConfigAPI-CommandsforUser-DefinedProperties

 On Fri, Jun 5, 2015 at 3:25 AM, Hoss Man (JIRA) j...@apache.org wrote:
 
  [
 https://issues.apache.org/jira/browse/SOLR-7613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573660#comment-14573660
 ]
 
  Hoss Man commented on SOLR-7613:
  
 
  Some relevant comments on this from the original mailing list
 discussion...
 
  Hoss..
 
  bq. IIUC CoreDescriptor.loadExtraProperties is the relevant method ...
 it would need to build up the path including the core name and get the
 system level resource loader (CoreContainer.getResourceLoader()) to access
 it since the core doesn't exist yet so there is no core level
 ResourceLoader to use.
 
  Alan...
 
  bq. I think this is an oversight, rather than intentional (at least, I
 certainly didn't intend to write it like this!). The problem here will be
 that CoreDescriptors are currently built entirely from core.properties
 files, and the CoreLocators that construct them don't have any access to
 zookeeper.
 
  bq. Maybe the way forward is to move properties out of CoreDescriptor
 and have an entirely separate CoreProperties object that is built and
 returned by the ConfigSetService, and that is read via the ResourceLoader.
 This would fit in quite nicely with the changes I put up on SOLR-7570, in
 that you could have properties specified on the collection config
 overriding properties from the configset, and then local core-specific
 properties overriding both.
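The layered resolution Alan describes — configset defaults, overridden by collection-level config, overridden by core-local properties — amounts to a last-writer-wins merge. A minimal illustration with plain java.util.Properties (names invented; this is not Solr code):

```java
import java.util.Properties;

public class PropertyLayering {
    // Merge property layers, later layers winning: configset < collection < core.
    static Properties layer(Properties configset, Properties collection, Properties core) {
        Properties merged = new Properties();
        merged.putAll(configset);   // base defaults from the configset
        merged.putAll(collection);  // collection-level config overrides the configset
        merged.putAll(core);        // core-specific properties override both
        return merged;
    }

    public static void main(String[] args) {
        Properties cs = new Properties(), col = new Properties(), core = new Properties();
        cs.setProperty("commitWithin", "10000");
        cs.setProperty("shardCount", "1");
        col.setProperty("shardCount", "4");
        core.setProperty("commitWithin", "500");
        Properties p = layer(cs, col, core);
        System.out.println(p.getProperty("shardCount"));   // 4
        System.out.println(p.getProperty("commitWithin")); // 500
    }
}
```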
 
  Hoss...
 
  bq. But they do have access to the CoreContainer which is passed to the
 CoreDescriptor constructor -- it has all the ZK access you'd need at the
 time when loadExtraProperties() is called.
 
  Alan...
 
  bq. Yeah, you could do it like that.  But looking at it further, I think
 solrcore.properties is actually being loaded in entirely the wrong place -
 it should be done by whatever is creating the CoreDescriptor, and then
 passed in as a Properties object to the CD constructor.  At the moment, you
 can't refer to a property defined in solrcore.properties within your
 core.properties file.
 
  Hoss...
 
  bq. but if you look at it from a historical context, that doesn't
 really matter for the purpose that solrcore.properties was intended for --
 it predates core discovery, and was only intended as a way to specify
 user-level properties that could then be substituted in the
 solrconfig.xml or dih.xml or schema.xml
 
  bq. ie: making it possible to use a solrcore.prop value to set a
 core.prop value might be a nice to have, but it's definitely not what it
 was intended for, so it shouldn't really be a blocker to getting the same
 (original) basic functionality working in SolrCloud.
 
  
 
  Honestly, even ignoring the historical context, it seems like a
 chicken-and-egg problem to me -- should it be possible to use a
 solrcore.properties variable to set the value of another variable in
 core.properties? or should it be possible to use a core.properties variable
 to set the value of another variable in solrcore.properties?
 
  the simplest thing for people to understand would probably be to just
 say that they are independent, loaded separately, and cause an error if you
 try to define the same value in both (I doubt that's currently enforced,
 but it probably should be)
 
  solrcore.properties file should be loaded if it resides in ZooKeeper
  
 
  Key: SOLR-7613
  URL: https://issues.apache.org/jira/browse/SOLR-7613
  Project: Solr
   Issue Type: Bug
 Reporter: Steve Davids
  Fix For: 5.3
 
 
  The solrcore.properties file is used to load user defined properties
 for use primarily in the solrconfig.xml file, though this properties file
 will only load if it is resident in the core/conf directory on the physical
 disk, it will not load if it is in ZK's core/conf directory. There should
 be a mechanism to allow a core properties file to be specified in ZK and
 can be updated appropriately along with being able

SearchComponent finishStage expectations

2015-06-03 Thread Steve Davids
I have a few custom SearchComponents that manipulate some data that is
being pulled out of the index in the finishStage. This approach has been
working well until I went to a single shard collection at which point the
components stopped functioning. After a bit of debugging I found the
HttpShardHandler decided to short circuit the request (unless I specify a
req param of shortCircuit=false) at which point the finishStage is
bypassed. Was the bypassing of the finishStage intentional for short
circuited requests? Should I be doing this data manipulation in the process
stage instead? If you all believe this is a bug I would be happy to write
up a ticket and take a crack at a fix. Additionally, it would be great to
write some javadocs for the overridable SearchComponent  methods so those
who create custom components in the future won't be surprised at the
behaviour.
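For anyone hitting the same thing, the behaviour I observed boils down to the following self-contained sketch (no Solr classes — the component and dispatcher names are made up): the distributed path runs process() on each shard and then finishStage() during the merge, while a short-circuited single-shard request only runs process().

```java
import java.util.ArrayList;
import java.util.List;

public class StageDemo {
    interface Component {
        void process(List<String> log);      // runs for every request
        void finishStage(List<String> log);  // distributed merge step only
    }

    static class LoggingComponent implements Component {
        public void process(List<String> log) { log.add("process"); }
        public void finishStage(List<String> log) { log.add("finishStage"); }
    }

    // Mimics the short circuit: a single-shard request skips the distributed
    // merge phase, so finishStage() is never invoked.
    static List<String> handle(Component c, boolean distributed) {
        List<String> log = new ArrayList<>();
        c.process(log);
        if (distributed) {
            c.finishStage(log);
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(handle(new LoggingComponent(), true));  // [process, finishStage]
        System.out.println(handle(new LoggingComponent(), false)); // [process]
    }
}
```

Which is why moving the data manipulation into process() would cover both topologies.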

Thanks,

-Steve


Re: Moving to git?

2015-05-31 Thread Steve Davids
There are also some rather large '.dat' files in the history as well, I
found this by running a job to delete all blobs > 5MB from the history
via:

$ java -jar ~/Downloads/bfg-1.12.3.jar --strip-blobs-bigger-than 5M
--protect-blobs-from trunk,branch_5x,branch_4x lucene-solr-mirror

 Deleted files
 -
 Filename Git id


 ---
 DoubleArrayTrie.dat| 8babf9fa (16.8 MB), f3bfe15b (16.8 MB),
 ...
 TokenInfoDictionary$buffer.dat | 25938b37 (7.0 MB), 7f02420f (7.1 MB), ...

 TokenInfoDictionary$trie.dat   | 69e76d64 (16.8 MB)

 dat.dat| 7445d1c8 (16.0 MB), 79bd7c8b (16.8 MB),
 37a215e5 (16.8 MB)
 europarl.lines.txt.gz  | e0366f10 (5.5 MB)

 tid.dat| 5a1e6199 (24.9 MB), 996d3fc5 (28.1 MB),
 ...
 tid_map.dat| 690fbea5 (6.3 MB), c1c01405 (6.3 MB),
 7a8c1420 (6.4 MB)
 wiki_results.txt   | db9e9294 (19.8 MB), 52ff9357 (19.8 MB),
 ...
 wiki_sentence.txt  | 3a38f62e (19.0 MB)

Dropping just those files reduced the repo by 50M, overall size is 131MB.

Note: there is one large file still in the trunk > 5MB:

 * commit df1e3b32 (protected by 'trunk') - contains 1 dirty file :
 -
 lucene/test-framework/src/resources/org/apache/lucene/util/europarl.lines.txt.gz
 (5.5 MB)


Also, I failed to provide the numbers on what `git reflog expire
--expire=now --all && git gc --prune=now --aggressive` does on a fresh mirror
checkout: it results in a repo size of 320M. So, dropping the old jars
saves 120MB.

-Steve

On Sun, May 31, 2015 at 4:39 PM, david.w.smi...@gmail.com 
david.w.smi...@gmail.com wrote:

 I like where this is going!

 I also think history of source code is very important, but not history of
 ‘.jar’ files that shouldn’t have been in source control in the first
 place.  I’m fiercely negative about large binaries or ‘jar’ files that can
 be downloaded by the build system (e.g. ivy) in source control.  And it was
 already mentioned a full history (.jar’s & all) could be kept somewhere
 more for archival purposes — which is a good compromise, I think, since
 “build-ability” of history should be retained (assuming it’s even still
 possible, given Rob’s comments) but doesn’t have to be convenient (e.g. by
 it being in a separate repo).   +1 to that!

 If we were to come up with a new git repo that doesn’t have the ‘.jar’s,
 it’d be good to also streamline the history prior to the big Lucene + Solr
 merge due to the paths in source control as to where the trunk, branches,
 and tags lived.  It appears the current repo may have been a blind git
 import from subversion.  A hand-done process that is mindful of these
 things would result in a nice history.  I’ve done this sorta thing once (a
 project at my last job) and volunteer to do it here if we can get consensus
 on a move to git.

 ~ David

 On Sun, May 31, 2015 at 4:21 PM Dawid Weiss dawid.we...@cs.put.poznan.pl
 wrote:

  I'd like to have full consolidated history, as much as possible,
  connect-the-dots across whatever CVS/SVN/etc repos to the extent
  maximally permitted by law, as Doug hints at. Just nuke the jars.

 I've done this (CVS->SVN->GIT) before. It wasn't that difficult.
 Eventually (for git) you script it and it gets version after version
 from CVS or SVN and appends it to git. I admit I didn't care much
 about svn merging infos though. Any files can be removed/ pruned by
 rewriting git trees before they're published.

 Dawid

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




Re: Moving to git?

2015-05-31 Thread Steve Davids
bq. Something needs to be done about all those jars in the source history, I
will not let this go.

I went ahead and used the BFG Repo Cleaner
https://rtyley.github.io/bfg-repo-cleaner/ tool to drop all of the old
jars in the git history, here are the findings:

$ git clone --mirror https://github.com/apache/lucene-solr.git
lucene-solr-mirror

 489M lucene-solr-mirror

$ java -jar ~/Downloads/bfg-1.12.3.jar --delete-files *.jar
--protect-blobs-from trunk,branch_5x,branch_4x lucene-solr-mirror
$ cd lucene-solr-mirror
$ git reflog expire --expire=now --all && git gc --prune=now --aggressive

 182M lucene-solr-mirror

$ cat lucene-solr-mirror.bfg-report/2015-05-31/10-16-36/deleted-files.txt
af4eed0506b53f17a4d22e4f1630ee03cb7991e5 177868 Tidy.jar
53f82a1c4c492dc810c27317857bbb02afd6fa58 62983 activation-1.1.jar
3beb3b802ffd7502ac4b4d47e0b2a75d08e30cc3 1034049 ant-1.6.5.jar
704717779f6d0d7eb026dc7af78a35e51adeec8b 1323005 ant-1.7.1.jar
7f5be4a4e05939429353a90e882846aeac72b976 1933743 ant-1.8.2.jar
063cce4f940033fa6e33d3e590cf6f5051129295 93518 ant-junit-1.7.1.jar
704717779f6d0d7eb026dc7af78a35e51adeec8b 1323005 apache-ant-1.7.1.jar
063cce4f940033fa6e33d3e590cf6f5051129295 93518 apache-ant-junit-1.7.1.jar
e3c62523fb93b5e2f73365e6cee0d0bc68e48556 95511 apache-mime4j-core-0.7.jar
1f7bf1ea13697ca0243d399ca6e5d864dd8bec0b 300168 apache-mime4j-dom-0.7.jar
bab8b31fb99256e13fc6010701db560243c47fa7 26027
apache-solr-commons-csv-1.0-SNAPSHOT-r966014.jar
5c4007c7e74af85d823243153d308f80e084eff0 22478
apache-solr-noggit-r1099557.jar
f59a39b011591edafc7955e97ae0d195fdf8b42e 22376
apache-solr-noggit-r1209632.jar
2a07c61d9ecb9683a135b7847682e7c36f19bbfe 22770
apache-solr-noggit-r1211150.jar
30be80e0b838a9c1445936b6966ccfc7ff165ae5 36776
apache-solr-noggit-r730138.jar
97d779912d38d2524a0e20efa849a4b6f01a4b46 21229
apache-solr-noggit-r730138.jar
a798b805d0ce92606697cc1b2aac42bf416076e3 37259
apache-solr-noggit-r944541.jar
9b434f5760dd0d78350bdf8237273c0d5db0174e 21240
apache-solr-noggit-r944541.jar
8217cae0a1bc977b241e0c8517cc2e3e7cede276 43033 asm-3.1.jar
4133d823d96bf3fc26d3a9754375dcc30d8da416 342664 asm-debug-all-4.1.jar
f66e9a8b9868226121961c13e6a32a55d0b2f78a 229116 bcmail-jdk15-1.45.jar
409070b0370a95c14ed4357261afb96b91d10e86 1663318 bcprov-jdk15-1.45.jar
b64b033af70609338c07e2a88a5f7efcd1a84ddb 92027 boilerpipe-1.1.0.jar
96c3bdbdaacd5289b0e654842e435689fbcf22e2 679423 carrot2-core-3.4.0.jar
043c0cb889aea066f7d4126af029d00a0bcd9e81 655412 carrot2-core-3.4.0.jar
f872cbc8eec94f7d5b29a73f99cd13089848a3cd 933657 carrot2-core-3.4.2.jar
ce2d3bf9c28a4ff696d66a82334d15fd0161e890 995243 carrot2-core-3.4.2.jar
be94db93d41bd4ba53b650d421cfa5fb0519b9af 958799 carrot2-core-3.5.0.1.jar
adc127c48137d03e252f526de84a07c8d6bda521 979186 carrot2-core-3.5.0.jar
ab44cf9314b1efff393e05f9c938446887d3570e 981085 carrot2-core-3.5.0.jar
5ca86c5e72b2953feb0b58fbd87f76d0301cbbf6 517641 carrot2-mini-3.1.0.jar
b1b89c9c921f16af22a88db3ff28975a8e40d886 188671 commons-beanutils-1.7.0.jar
e633afbe6842aa92b1a8f0ff3f5b8c0e3283961b 36174 commons-cli-1.1.jar
957b6752af9a60c1bb2a4f65db0e90e5ce00f521 46725 commons-codec-1.3.jar
458d432da88b0efeab640c229903fb5aad274044 58160 commons-codec-1.4.jar
e9013fed78f333c928ff7f828948b91fcb5a92b4 73098 commons-codec-1.5.jar
ee1bc49acae11cc79eceec51f7be785590e99fd8 232771 commons-codec-1.6.jar
41e230feeaa53618b6ac5f8d11792c2eecf4d4fd 559366 commons-collections-3.1.jar
c35fa1fee145cba638884e41b80a401cbe4924ef 575389
commons-collections-3.2.1.jar
78d832c11c42023d4bc12077a1d9b7b5025217bc 143847 commons-compress-1.0.jar
51baf91a2df10184a8cca5cb43f11418576743a1 161361 commons-compress-1.1.jar
61753909c3f32306bf60d09e5345d47058ba2122 168596 commons-compress-1.2.jar
6c826c528b60bb1b25e9053b7f4c920292f6c343 224548 commons-compress-1.3.jar
f80348dfa0b59f0840c25d1b8c25d1490d1eaf51 22017
commons-csv-1.0-SNAPSHOT-r609327.jar
8439e6f1a8b1d82943f84688b8086869255eda86 27361
commons-csv-1.0-SNAPSHOT-r966014.jar
1783dbea232ced6db122268f8faa5ce773c7ea42 139966 commons-digester-1.7.jar
9c8bd13a2002a9ff5b35b873b9f111d5281ad201 148783 commons-digester-2.0.jar
aa209b3887c90933cdc58c8c8572e90435e8e48d 57779 commons-fileupload-1.2.1.jar
7c59774aed4f5dd08778489aaad565690ff7c132 305001 commons-httpclient-3.1.jar
133dc6cb35f5ca2c5920fd0933a557c2def88680 109043 commons-io-1.4.jar
b5c7d692fe5616af4332c1a1db6efd23e3ff881b 163151 commons-io-2.1.jar
ce0ca22c8d29a9be736d775fe50bfdc6ce770186 257923 commons-lang-2.4.jar
532939ecab6b77ccb77af3635c55ff9752b70ab7 261809 commons-lang-2.4.jar
98467d3a653ebad776ffa3542efeb9732fe0b482 284220 commons-lang-2.6.jar
b73a80fab641131e6fbe3ae833549efb3c540d17 38015 commons-logging-1.0.4.jar
1deef144cb17ed2c11c6cdcdcb2d9530fa8d0b47 60686 commons-logging-1.1.1.jar
ae0b63586701efdc7bf03ffb0a840d50950d211c 3566844 core-3.1.1.jar
b9c8c8a170881dfe9c33adc87c26348904510954 364003 cpptasks-1.0b5.jar
99baf20bacd712cae91dd6e4e1f46224cafa1a37 500676 db-4.7.25.jar
c8c4dbb92d6c23a7fbb2813eb721eb4cce91750c 313898 

[jira] [Created] (SOLR-7613) solrcore.properties file should be loaded if it resides in ZooKeeper

2015-05-30 Thread Steve Davids (JIRA)
Steve Davids created SOLR-7613:
--

 Summary: solrcore.properties file should be loaded if it resides 
in ZooKeeper
 Key: SOLR-7613
 URL: https://issues.apache.org/jira/browse/SOLR-7613
 Project: Solr
  Issue Type: Bug
Reporter: Steve Davids
 Fix For: 5.3


The solrcore.properties file is used to load user defined properties for use 
primarily in the solrconfig.xml file, though this properties file will only 
load if it is resident in the core/conf directory on the physical disk, it will 
not load if it is in ZK's core/conf directory. There should be a mechanism to 
allow a core properties file to be specified in ZK and updated 
appropriately, along with being able to reload the properties when the file 
changes (or via a core reload).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)




[jira] [Updated] (SOLR-7480) Allow AtomicUpdateDocumentMerger subclasses to override the doAdd method

2015-04-27 Thread Steve Davids (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Davids updated SOLR-7480:
---
Attachment: SOLR-7480.patch

 Allow AtomicUpdateDocumentMerger subclasses to override the doAdd method
 

 Key: SOLR-7480
 URL: https://issues.apache.org/jira/browse/SOLR-7480
 Project: Solr
  Issue Type: Improvement
Reporter: Steve Davids
 Fix For: Trunk, 5.2

 Attachments: SOLR-7480.patch


 I had a slight oversight with the patch I provided on SOLR-6909 where I 
 didn't make the doAdd method on the AtomicUpdateDocumentMerger protected to 
 allow subclasses to override that specific implementation (oops). This is a 
 trivial change to allow clients to subclass and override this specific 
 implementation.






[jira] [Commented] (SOLR-7474) Remove protocol name from base_url in cluster state

2015-04-27 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516197#comment-14516197
 ] 

Steve Davids commented on SOLR-7474:


The base url should be generated by the node when it joins the cluster (or at 
least that's how it used to work), so  the sequence of events that you describe 
will work upon restart without having to touch the cluster state. The purpose 
of the SSLMigrationTest is to do just that - update ZK, restart all nodes, then 
verify the base_url was updated appropriately with the proper URL scheme.

 Remove protocol name from base_url in cluster state
 ---

 Key: SOLR-7474
 URL: https://issues.apache.org/jira/browse/SOLR-7474
 Project: Solr
  Issue Type: Wish
  Components: SolrCloud
Reporter: Shalin Shekhar Mangar
 Fix For: Trunk, 5.2


 In order to setup SSL, a user must add a cluster property which enables HTTPS 
 instead of HTTP. This property is used to create the base_url which is stored 
 for every node in the cluster.
 The above works fine if we assume that a user decides to enable SSL before 
 creating the cluster. However, if a user with an existing cluster wants to 
 start using SSL, he will need to manually edit his cluster state to switch 
 the protocol stored inside base_url for every node from http to https. If we 
 remove the protocol from the base_url, a user can shutdown the cluster, setup 
 the certificates, add the cluster property and start the cluster thereby 
 re-using the same cluster state which existed without manual modifications.
 Alternately, an extension to zkcli can be provided to change the cluster 
 state. Thoughts?






[jira] [Created] (SOLR-7480) Allow AtomicUpdateDocumentMerger subclasses to override the doAdd method

2015-04-27 Thread Steve Davids (JIRA)
Steve Davids created SOLR-7480:
--

 Summary: Allow AtomicUpdateDocumentMerger subclasses to override 
the doAdd method
 Key: SOLR-7480
 URL: https://issues.apache.org/jira/browse/SOLR-7480
 Project: Solr
  Issue Type: Improvement
Reporter: Steve Davids
 Fix For: Trunk, 5.2


I had a slight oversight with the patch I provided on SOLR-6909 where I didn't 
make the doAdd method on the AtomicUpdateDocumentMerger protected to allow 
subclasses to override that specific implementation (oops). This is a trivial 
change to allow clients to subclass and override this specific implementation.






[jira] [Commented] (SOLR-4839) Jetty 9

2015-04-23 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14510329#comment-14510329
 ] 

Steve Davids commented on SOLR-4839:


Looks good, though we might want to think about *not* reusing the 
javax.net.ssl.* properties for the jetty key/trust store configuration. I can think of a 
few cases where you might want to make the two different, i.e. one value for the 
client request and one value for the jetty connector, unless of course the 
recommendation is to only use self-signed certs for both client and server. 
Though, maybe the solr.in.sh could have something like:
{code}
SOLR_SSL_KEY_STORE=etc/solr-ssl.keystore.jks
SOLR_SSL_KEY_STORE_PASSWORD=secret
SOLR_SSL_TRUST_STORE=etc/solr-ssl.keystore.jks
SOLR_SSL_TRUST_STORE_PASSWORD=secret
## OVERRIDE PREVIOUSLY DEFINED SSL VALUES FOR HTTP CLIENT IF NECESSARY ##
#SOLR_SSL_CLIENT_KEY_STORE=
#SOLR_SSL_CLIENT_KEY_STORE_PASSWORD=
#SOLR_SSL_CLIENT_TRUST_STORE=
#SOLR_SSL_CLIENT_TRUST_STORE_PASSWORD=
{code}

Then the solr startup script can set the javax.net.ssl.* system properties for 
the client side + create something like jetty.ssl.truststore/keystore/etc on 
the jetty server side. This would allow a little bit more flexibility for 
people who might want to use a different certificate or trust store between the 
http client and server, though this really is getting more into a fringe use 
case.
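The override behaviour described above could be resolved along these lines (a sketch only — the SOLR_SSL_CLIENT_* names come from the proposed solr.in.sh snippet, and the fallback logic is hypothetical, not the startup script's actual implementation):

```java
import java.util.Map;

public class SslFallback {
    // Prefer the client-specific setting; fall back to the shared server value.
    static String resolve(Map<String, String> env, String clientKey, String serverKey) {
        String v = env.get(clientKey);
        return (v != null && !v.isEmpty()) ? v : env.get(serverKey);
    }

    public static void main(String[] args) {
        Map<String, String> env = Map.of(
            "SOLR_SSL_KEY_STORE", "etc/solr-ssl.keystore.jks",
            "SOLR_SSL_TRUST_STORE", "etc/solr-ssl.keystore.jks",
            "SOLR_SSL_CLIENT_KEY_STORE", "etc/client.keystore.jks");
        // Client key store override is set, so it wins:
        System.out.println(resolve(env, "SOLR_SSL_CLIENT_KEY_STORE", "SOLR_SSL_KEY_STORE"));
        // No client trust store override, so the server value is reused:
        System.out.println(resolve(env, "SOLR_SSL_CLIENT_TRUST_STORE", "SOLR_SSL_TRUST_STORE"));
    }
}
```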

 Jetty 9
 ---

 Key: SOLR-4839
 URL: https://issues.apache.org/jira/browse/SOLR-4839
 Project: Solr
  Issue Type: Improvement
Reporter: Bill Bell
Assignee: Shalin Shekhar Mangar
 Fix For: Trunk, 5.2

 Attachments: SOLR-4839-conform-jetty9_2_10.patch, 
 SOLR-4839-conform-jetty9_2_10.patch, SOLR-4839-fix-eclipse.patch, 
 SOLR-4839-jetty9.2.10, SOLR-4839-mod-JettySolrRunner.patch, 
 SOLR-4839-ssl-support_patch.patch, SOLR-4839-ssl-support_patch.patch, 
 SOLR-4839.patch, SOLR-4839.patch, SOLR-4839.patch, SOLR-4839.patch, 
 SOLR-4839.patch, SOLR-4839.patch, SOLR-4839.patch, SOLR-4839.patch, 
 SOLR-4839.patch


 Implement Jetty 9






[jira] [Commented] (SOLR-4839) Jetty 9

2015-04-23 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14510387#comment-14510387
 ] 

Steve Davids commented on SOLR-4839:


Also, if you want to use the javax.net.ssl.* stuff I believe you need to swap 
{code}<Property name="javax.net.ssl.keyStore" 
default="./etc/solr-ssl.keystore.jks"/>{code} with {code}<SystemProperty 
name="javax.net.ssl.keyStore" default="./etc/solr-ssl.keystore.jks"/>{code} 
(note the SystemProperty vs Property).

 Jetty 9
 ---

 Key: SOLR-4839
 URL: https://issues.apache.org/jira/browse/SOLR-4839
 Project: Solr
  Issue Type: Improvement
Reporter: Bill Bell
Assignee: Shalin Shekhar Mangar
 Fix For: Trunk, 5.2

 Attachments: SOLR-4839-conform-jetty9_2_10.patch, 
 SOLR-4839-conform-jetty9_2_10.patch, SOLR-4839-fix-eclipse.patch, 
 SOLR-4839-jetty9.2.10, SOLR-4839-mod-JettySolrRunner.patch, 
 SOLR-4839-ssl-support_patch.patch, SOLR-4839-ssl-support_patch.patch, 
 SOLR-4839.patch, SOLR-4839.patch, SOLR-4839.patch, SOLR-4839.patch, 
 SOLR-4839.patch, SOLR-4839.patch, SOLR-4839.patch, SOLR-4839.patch, 
 SOLR-4839.patch


 Implement Jetty 9






[jira] [Commented] (SOLR-4839) Jetty 9

2015-04-21 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14506256#comment-14506256
 ] 

Steve Davids commented on SOLR-4839:


Any plans on porting this into the 5x branch? Also, do we know how this will 
behave with the fancy new startup scripts? It appears clients would need to 
configure a few things in both the start.in.sh file + the start.ini file as it 
is sitting right now. Additionally, here are a few potential issues I came 
across:
# jetty-ssl.xml
#* Has a bunch of properties that aren't prefixed with 'jetty', i.e. 'ssl.port' & 
'ssl.timeout' vs 'jetty.ssl.port' & 'jetty.ssl.timeout'.
#* Keystore & Truststore paths are always relative to `<Property 
name="jetty.base" default="." />` which could get annoying for some people if 
they want to specify an absolute path

 Jetty 9
 ---

 Key: SOLR-4839
 URL: https://issues.apache.org/jira/browse/SOLR-4839
 Project: Solr
  Issue Type: Improvement
Reporter: Bill Bell
Assignee: Shalin Shekhar Mangar
 Fix For: Trunk, 5.2

 Attachments: SOLR-4839-fix-eclipse.patch, 
 SOLR-4839-mod-JettySolrRunner.patch, SOLR-4839.patch, SOLR-4839.patch, 
 SOLR-4839.patch, SOLR-4839.patch, SOLR-4839.patch, SOLR-4839.patch, 
 SOLR-4839.patch, SOLR-4839.patch, SOLR-4839.patch


 Implement Jetty 9






[jira] [Commented] (SOLR-4839) Jetty 9

2015-02-05 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14307578#comment-14307578
 ] 

Steve Davids commented on SOLR-4839:


I was creating a new MiniSolrCluster test because I needed the ability 
to define multiple cores, and I was never able to get the test to work via 
Eclipse; I traced it down to this issue.

Sent from my iPhone



 Jetty 9
 ---

 Key: SOLR-4839
 URL: https://issues.apache.org/jira/browse/SOLR-4839
 Project: Solr
  Issue Type: Improvement
Reporter: Bill Bell
Assignee: Shalin Shekhar Mangar
 Fix For: Trunk, 5.1

 Attachments: SOLR-4839-fix-eclipse.patch, 
 SOLR-4839-mod-JettySolrRunner.patch, SOLR-4839.patch, SOLR-4839.patch, 
 SOLR-4839.patch, SOLR-4839.patch, SOLR-4839.patch, SOLR-4839.patch, 
 SOLR-4839.patch, SOLR-4839.patch, SOLR-4839.patch


 Implement Jetty 9






[jira] [Commented] (SOLR-4407) SSL Certificate based authentication for SolrCloud

2015-02-05 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308429#comment-14308429
 ] 

Steve Davids commented on SOLR-4407:


Sorry for not being more specific. Yes, the instructions do allow for 
specifying your own self-signed certificate and importing that specific 
certificate into a new trust store that will be loaded by the container - this 
will lock it down to the specific certificate. The modification that I have 
made is to create a custom servlet container to openly accept client 
certificates within an organization, perform an LDAP lookup (via cert DN) to 
pull groups, then grant access if they are a part of a specific group. With this 
capability we are able to grant access via LDAP groups, which is the preferred 
route of client authentication for our specific use-case. 

So, to answer your question:

bq. What aspect of SSL do you think isn't already configurable?

SSL is configurable via trust stores but mechanisms for a customizable 
certificate based authentication system isn't in place, such as the case above 
(get cert DN + user lookup via LDAP to authorize).
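A stripped-down sketch of that flow (all names are invented, and a plain Map stands in for the real LDAP directory lookup):

```java
import java.util.Map;
import java.util.Set;

public class CertGroupAuth {
    private final Map<String, Set<String>> dnToGroups; // stand-in for LDAP
    private final Set<String> allowedGroups;

    CertGroupAuth(Map<String, Set<String>> dnToGroups, Set<String> allowedGroups) {
        this.dnToGroups = dnToGroups;
        this.allowedGroups = allowedGroups;
    }

    // Take the client certificate's subject DN, resolve its groups, and
    // authorize if any resolved group is in the allowed set.
    boolean isAuthorized(String subjectDn) {
        Set<String> groups = dnToGroups.getOrDefault(subjectDn, Set.of());
        return groups.stream().anyMatch(allowedGroups::contains);
    }

    public static void main(String[] args) {
        CertGroupAuth auth = new CertGroupAuth(
            Map.of("CN=alice,O=Example", Set.of("solr-users")),
            Set.of("solr-users"));
        System.out.println(auth.isAuthorized("CN=alice,O=Example"));   // true
        System.out.println(auth.isAuthorized("CN=mallory,O=Example")); // false
    }
}
```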

 SSL Certificate based authentication for SolrCloud
 --

 Key: SOLR-4407
 URL: https://issues.apache.org/jira/browse/SOLR-4407
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud
Affects Versions: 4.1
Reporter: Sindre Fiskaa
Assignee: Steve Rowe
  Labels: Authentication, Certificate, SSL
 Fix For: 4.7, Trunk


 I need to be able to secure sensitive information in solrnodes running in a 
 SolrCloud with either SSL client/server certificates or http basic auth..






[jira] [Commented] (SOLR-4407) SSL Certificate based authentication for SolrCloud

2015-02-04 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14306490#comment-14306490
 ] 

Steve Davids commented on SOLR-4407:


Sorry for not replying back earlier. Yes, you can perform certificate based 
authentication through either built in servlet container mechanisms or custom 
servlet filters applied via Jetty's webdefault.xml file. So, for the time being 
it works, but if we move Solr away from users being able to customize their 
servlet containers (standalone app mode) then Solr will need to make this 
capability configurable somehow.

 SSL Certificate based authentication for SolrCloud
 --

 Key: SOLR-4407
 URL: https://issues.apache.org/jira/browse/SOLR-4407
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud
Affects Versions: 4.1
Reporter: Sindre Fiskaa
Assignee: Steve Rowe
  Labels: Authentication, Certificate, SSL
 Fix For: 4.7, Trunk


 I need to be able to secure sensitive information in solrnodes running in a 
 SolrCloud with either SSL client/server certificates or http basic auth..






[jira] [Commented] (SOLR-6449) Add first class support for Real Time Get in Solrj

2015-01-23 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14290404#comment-14290404
 ] 

Steve Davids commented on SOLR-6449:


Cool, thanks Shalin. Just a side note, we could optimize retrievals a little 
for the Cloud client since it would have knowledge of which shard to route the 
traffic to (similar to how doc updates are handled) - perhaps a new follow-up 
ticket is in order just to let people know we thought about it :).
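The idea is the same one used for updates: derive the owning shard from the document id locally instead of proxying through an arbitrary node. A toy illustration (the hash below is purely illustrative, not Solr's compositeId router):

```java
import java.util.List;

public class IdRouting {
    // Map an id to a stable shard index; any node can compute this locally.
    static int shardFor(String id, int numShards) {
        return Math.floorMod(id.hashCode(), numShards);
    }

    public static void main(String[] args) {
        List<String> shards = List.of("shard1", "shard2", "shard3");
        // Route the real-time get for "doc-42" straight to its owning shard:
        System.out.println(shards.get(shardFor("doc-42", shards.size())));
    }
}
```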

 Add first class support for Real Time Get in Solrj
 --

 Key: SOLR-6449
 URL: https://issues.apache.org/jira/browse/SOLR-6449
 Project: Solr
  Issue Type: Improvement
  Components: clients - java
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
  Labels: difficulty-medium, impact-medium
 Fix For: Trunk, 5.1

 Attachments: SOLR-6449.patch, SOLR-6449.patch


 Any request handler can be queried by Solrj using a custom param map and the 
 qt parameter but I think /get should get first-class support in the java 
 client.






[jira] [Commented] (SOLR-4839) Jetty 9

2015-01-23 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14290413#comment-14290413
 ] 

Steve Davids commented on SOLR-4839:


Anyone happen to see the same issue I was getting with the 
TestMiniSolrCloudCluster? I attached a patch but it doesn't look like it has been 
pulled in yet; just wondering if others were getting similar test failures.

 Jetty 9
 ---

 Key: SOLR-4839
 URL: https://issues.apache.org/jira/browse/SOLR-4839
 Project: Solr
  Issue Type: Improvement
Reporter: Bill Bell
Assignee: Shalin Shekhar Mangar
 Fix For: Trunk, 5.1

 Attachments: SOLR-4839-fix-eclipse.patch, 
 SOLR-4839-mod-JettySolrRunner.patch, SOLR-4839.patch, SOLR-4839.patch, 
 SOLR-4839.patch, SOLR-4839.patch, SOLR-4839.patch, SOLR-4839.patch, 
 SOLR-4839.patch, SOLR-4839.patch, SOLR-4839.patch


 Implement Jetty 9






[jira] [Commented] (SOLR-6840) Remove legacy solr.xml mode

2015-01-14 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14277067#comment-14277067
 ] 

Steve Davids commented on SOLR-6840:


Something doesn't seem right here: there is a static HTTP client builder method 
that will generate an HttpClient instance in the correct state based on the 
system properties of the currently running test. You shouldn't need to 
specify your own instance of HttpClient to build a Solr client. I can take a 
look at this later tonight if you want - was there a particular test failure 
that I should hone in on?

 Remove legacy solr.xml mode
 ---

 Key: SOLR-6840
 URL: https://issues.apache.org/jira/browse/SOLR-6840
 Project: Solr
  Issue Type: Task
Reporter: Steve Rowe
Assignee: Erick Erickson
Priority: Blocker
 Fix For: 5.0

 Attachments: SOLR-6840.patch, SOLR-6840.patch, SOLR-6840.patch, 
 SOLR-6840.patch


 On the [Solr Cores and solr.xml 
 page|https://cwiki.apache.org/confluence/display/solr/Solr+Cores+and+solr.xml],
  the Solr Reference Guide says:
 {quote}
 Starting in Solr 4.3, Solr will maintain two distinct formats for 
 {{solr.xml}}, the _legacy_ and _discovery_ modes. The former is the format we 
 have become accustomed to in which all of the cores one wishes to define in a 
 Solr instance are defined in {{solr.xml}} in 
 {{<cores><core/>...<core/></cores>}} tags. This format will continue to be 
 supported through the entire 4.x code line.
 As of Solr 5.0 this form of solr.xml will no longer be supported.  Instead 
 Solr will support _core discovery_. [...]
 The new core discovery mode structure for solr.xml will become mandatory as 
 of Solr 5.0, see: Format of solr.xml.
 {quote}
 AFAICT, nothing has been done to remove legacy {{solr.xml}} mode from 5.0 or 
 trunk.






[jira] [Updated] (SOLR-4839) Jetty 9

2015-01-10 Thread Steve Davids (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Davids updated SOLR-4839:
---
Attachment: SOLR-4839-mod-JettySolrRunner.patch

I found that this broke the 'TestMiniSolrCloudCluster' test (or anything that 
doesn't specify a 'jetty.testMode' system property). If a test doesn't specify 
the 'jetty.testMode' property, a NullPointerException is thrown by Jetty 
because a ServerConnector is created with a null Server. I attached a patch to 
fix the specific issue, though I'm not quite sure why we need to branch the 
code; couldn't we consolidate the two?

 Jetty 9
 ---

 Key: SOLR-4839
 URL: https://issues.apache.org/jira/browse/SOLR-4839
 Project: Solr
  Issue Type: Improvement
Reporter: Bill Bell
Assignee: Shalin Shekhar Mangar
 Fix For: 5.0, Trunk

 Attachments: SOLR-4839-fix-eclipse.patch, 
 SOLR-4839-mod-JettySolrRunner.patch, SOLR-4839.patch, SOLR-4839.patch, 
 SOLR-4839.patch, SOLR-4839.patch, SOLR-4839.patch, SOLR-4839.patch, 
 SOLR-4839.patch, SOLR-4839.patch, SOLR-4839.patch


 Implement Jetty 9






[jira] [Updated] (SOLR-6449) Add first class support for Real Time Get in Solrj

2015-01-10 Thread Steve Davids (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Davids updated SOLR-6449:
---
Attachment: SOLR-6449.patch

Provided a simpler patch that doesn't create new GetByIdResponse and 
GetByIdRequest classes. Also added the ability to specify your own custom 
SolrParams (useful for specifying fields and cores in SolrCloud):

SolrDocument getById(String id)
SolrDocument getById(String id, SolrParams params)
SolrDocumentList getById(Collection<String> ids)
SolrDocumentList getById(Collection<String> ids, SolrParams params)

 Add first class support for Real Time Get in Solrj
 --

 Key: SOLR-6449
 URL: https://issues.apache.org/jira/browse/SOLR-6449
 Project: Solr
  Issue Type: Improvement
  Components: clients - java
Reporter: Shalin Shekhar Mangar
  Labels: difficulty-medium, impact-medium
 Fix For: 5.0

 Attachments: SOLR-6449.patch, SOLR-6449.patch


 Any request handler can be queried by Solrj using a custom param map and the 
 qt parameter but I think /get should get first-class support in the java 
 client.






[jira] [Updated] (SOLR-6496) LBHttpSolrServer should stop server retries after the timeAllowed threshold is met

2015-01-09 Thread Steve Davids (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Davids updated SOLR-6496:
---
Attachment: SOLR-6496.patch

Updated the patch to provide the same early-exit functionality in the duplicate 
request implementation. I created SOLR-6949 to capture the refactoring that 
should be done to consolidate the two implementations.
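The behavior being asked for here (stop retrying other servers once the 
timeAllowed threshold has elapsed) can be modeled with a simple elapsed-time 
guard. The following is a stand-alone sketch with illustrative names, not the 
actual LBHttpSolrServer code:

```java
// Simplified model of the retry guard: once the time spent on a request
// exceeds the timeAllowed threshold, no further servers should be tried and
// the exception should bubble up instead.
public class RetryGuard {
    private final long timeAllowedNanos;
    private final long startNanos;

    public RetryGuard(long timeAllowedMillis, long startNanos) {
        this.timeAllowedNanos = timeAllowedMillis * 1_000_000L;
        this.startNanos = startNanos;
    }

    /** True only while time remains; a negative timeAllowed means no limit. */
    public boolean mayRetry(long nowNanos) {
        if (timeAllowedNanos < 0) return true; // timeAllowed not set
        return (nowNanos - startNanos) < timeAllowedNanos;
    }
}
```

The retry loop would consult `mayRetry(System.nanoTime())` before attempting 
the next server in the list.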

 LBHttpSolrServer should stop server retries after the timeAllowed threshold 
 is met
 --

 Key: SOLR-6496
 URL: https://issues.apache.org/jira/browse/SOLR-6496
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.9
Reporter: Steve Davids
Assignee: Anshum Gupta
Priority: Critical
 Fix For: 5.0

 Attachments: SOLR-6496.patch, SOLR-6496.patch, SOLR-6496.patch, 
 SOLR-6496.patch, SOLR-6496.patch, SOLR-6496.patch


 The LBHttpSolrServer will continue to perform retries for each server it was 
 given without honoring the timeAllowed request parameter. Once the threshold 
 has been met, you should no longer perform retries and allow the exception to 
 bubble up and allow the request to either error out or return partial results 
 per the shards.tolerant request parameter.
 For a little more context on how this can be extremely problematic please 
 see the comment here: 
 https://issues.apache.org/jira/browse/SOLR-5986?focusedCommentId=14100991&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14100991
  (#2)






[jira] [Created] (SOLR-6949) Refactor LBHttpSolrClient to consolidate the two different request implementations

2015-01-09 Thread Steve Davids (JIRA)
Steve Davids created SOLR-6949:
--

 Summary: Refactor LBHttpSolrClient to consolidate the two 
different request implementations
 Key: SOLR-6949
 URL: https://issues.apache.org/jira/browse/SOLR-6949
 Project: Solr
  Issue Type: Improvement
Reporter: Steve Davids
 Fix For: 5.0, Trunk


LBHttpSolrClient has two duplicate request implementations:

1. public Rsp request(Req req) throws SolrServerException, IOException
2. public NamedList<Object> request(final SolrRequest request) throws 
SolrServerException, IOException

Refactor the client to provide a single implementation that both methods can 
use: they should be consistent, and maintaining two non-trivial implementations 
is needlessly burdensome.
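The single-implementation refactoring can be sketched as a thin adapter 
delegating to one core method. Req/Rsp are reduced to toy types here and the 
names are illustrative, not the real SolrJ classes:

```java
// Sketch of the consolidation: the raw-request overload wraps its argument
// and delegates to the Req/Rsp implementation, so the retry/failover logic
// exists in exactly one place.
public class LbClientSketch {
    public static class Req {
        public final String query;
        public Req(String q) { query = q; }
    }

    public static class Rsp {
        public final String body;
        Rsp(String b) { body = b; }
    }

    // the single real implementation (retry loop, server rotation, etc.
    // would live here in the actual client)
    public static Rsp request(Req req) {
        return new Rsp("ok:" + req.query);
    }

    // thin adapter: wrap, delegate, unwrap
    public static String request(String rawRequest) {
        return request(new Req(rawRequest)).body;
    }
}
```

Both entry points stay source-compatible while any future fix lands in one 
method instead of two.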






[jira] [Commented] (SOLR-6496) LBHttpSolrServer should stop server retries after the timeAllowed threshold is met

2015-01-09 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14270996#comment-14270996
 ] 

Steve Davids commented on SOLR-6496:


The LB Client has duplicate implementations defined in both:

1. public Rsp request(Req req) throws SolrServerException, IOException
2. public NamedList<Object> request(final SolrRequest request) throws 
SolrServerException, IOException

The original patch was only dealing with one of the two, so we need to either 
a) copy the same code into the other or b) refactor the methods to have a 
single implementation that both methods call. Option B is my personal 
preference, though we might want to do that in a separate ticket and go with 
option A to get this in as soon as possible. I can work on either tonight after 
I get back from work if anyone has a preference for which route to take.

 LBHttpSolrServer should stop server retries after the timeAllowed threshold 
 is met
 --

 Key: SOLR-6496
 URL: https://issues.apache.org/jira/browse/SOLR-6496
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.9
Reporter: Steve Davids
Assignee: Anshum Gupta
Priority: Critical
 Fix For: 5.0

 Attachments: SOLR-6496.patch, SOLR-6496.patch, SOLR-6496.patch, 
 SOLR-6496.patch, SOLR-6496.patch


 The LBHttpSolrServer will continue to perform retries for each server it was 
 given without honoring the timeAllowed request parameter. Once the threshold 
 has been met, you should no longer perform retries and allow the exception to 
 bubble up and allow the request to either error out or return partial results 
 per the shards.tolerant request parameter.
 For a little more context on how this is can be extremely problematic please 
 see the comment here: 
 https://issues.apache.org/jira/browse/SOLR-5986?focusedCommentId=14100991page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14100991
  (#2)






[jira] [Updated] (SOLR-6909) Allow pluggable atomic update merging logic

2015-01-08 Thread Steve Davids (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Davids updated SOLR-6909:
---
Attachment: SOLR-6909.patch

Updated patch to add a 'doSet' and 'doAdd' method which allows clients to 
override specific implementations of any atomic update command.
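The doSet/doAdd idea can be sketched as a dispatch to overridable protected 
methods. This is an illustrative model, not the actual patch; only the method 
names come from the comment above, and the document is reduced to a plain map:

```java
import java.util.Map;

// Sketch of a pluggable merger: each atomic operation dispatches to an
// overridable protected method, so a subclass can replace just one operation
// (e.g. custom "set" semantics) without reimplementing the whole merge.
public class AtomicMergerSketch {
    public Map<String, Object> merge(String op, String field, Object value,
                                     Map<String, Object> doc) {
        switch (op) {
            case "set": return doSet(field, value, doc);
            case "add": return doAdd(field, value, doc);
            default: throw new IllegalArgumentException("unknown op: " + op);
        }
    }

    protected Map<String, Object> doSet(String field, Object value,
                                        Map<String, Object> doc) {
        doc.put(field, value); // replace whatever was there
        return doc;
    }

    protected Map<String, Object> doAdd(String field, Object value,
                                        Map<String, Object> doc) {
        // naive multi-value append for illustration
        doc.merge(field, value, (a, b) -> a + "," + b);
        return doc;
    }
}
```

A client would extend the class, override one method, and hand the instance to 
the processor that performs the merge.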

 Allow pluggable atomic update merging logic
 ---

 Key: SOLR-6909
 URL: https://issues.apache.org/jira/browse/SOLR-6909
 Project: Solr
  Issue Type: Improvement
Reporter: Steve Davids
 Fix For: 5.0, Trunk

 Attachments: SOLR-6909.patch, SOLR-6909.patch


 Clients should be able to introduce their own specific merging logic by 
 implementing a new class that will be used by the DistributedUpdateProcessor. 
 This is particularly useful if you require a custom hook to interrogate the 
 incoming document with the document that is already resident in the index as 
 there isn't the ability to perform that operation nor can you currently 
 extend the DistributedUpdateProcessor to provide the modifications.






[jira] [Commented] (SOLR-6909) Allow pluggable atomic update merging logic

2015-01-08 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14270431#comment-14270431
 ] 

Steve Davids commented on SOLR-6909:


The javascript approach is interesting but would seem overly complex when you 
always want the merging logic to work a specific way. Additionally, I have a 
use case where I download a document in an update processor, extract fields 
from the downloaded content, and index that document. The interesting thing 
here is that if I can't download the document I set the doc's status to error, 
though this is only valid if a good document *doesn't* already exist in the 
index, so if an error doc is trying to be merged on top of an existing document 
an exception is thrown and won't clobber the good document. As you can see, the 
approach taken in this ticket allows you added flexibility with a customizable 
AtomicUpdateDocumentMerger.

Another added benefit is that it cleans up the DistributedUpdateProcessor a 
little. One modification I might want to make to the attached patch is to add 
a `doSet` and `doAdd` which would allow overrides of each specific merge type.

 Allow pluggable atomic update merging logic
 ---

 Key: SOLR-6909
 URL: https://issues.apache.org/jira/browse/SOLR-6909
 Project: Solr
  Issue Type: Improvement
Reporter: Steve Davids
 Fix For: 5.0, Trunk

 Attachments: SOLR-6909.patch


 Clients should be able to introduce their own specific merging logic by 
 implementing a new class that will be used by the DistributedUpdateProcessor. 
 This is particularly useful if you require a custom hook to interrogate the 
 incoming document with the document that is already resident in the index as 
 there isn't the ability to perform that operation nor can you currently 
 extend the DistributedUpdateProcessor to provide the modifications.






[jira] [Comment Edited] (SOLR-6909) Allow pluggable atomic update merging logic

2015-01-08 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14270431#comment-14270431
 ] 

Steve Davids edited comment on SOLR-6909 at 1/9/15 2:28 AM:


The javascript approach is interesting but would seem overly complex when you 
always want the merging logic to work a specific way. Additionally, I have a 
use case where I download a document in an update processor, extract fields 
from the downloaded content, and index that document. The interesting thing 
here is that if I can't download the document I set the doc's status to error, 
though this is only valid if a good document *doesn't* already exist in the 
index, so if an error doc is trying to be merged on top of an existing document 
an exception is thrown and won't clobber the good document. As you can see, the 
approach taken in this ticket allows you added flexibility with a customizable 
AtomicUpdateDocumentMerger.

Another added benefit is that it cleans up the DistributedUpdateProcessor a 
little. One modification I might want to make to the attached patch is to add 
a `doSet` and `doAdd` which would allow overrides of each specific merge type.


was (Author: sdavids):
The javascript approach is interesting but would seem overly complex when you 
always want the merging logic to work a specific way. Additionally, I have a 
use case where I download a document in an update processor, extract fields 
from the downloaded content, and index that document. The interesting thing 
here is that if I can't download the document I set the doc's status to error, 
though this is only valid if a good document already exists in the index, so if 
an error doc is trying to be merged an exception is thrown and won't clobber 
the good document. As you can see, the approach taken in this ticket allows you 
added flexibility with a customizable AtomicUpdateDocumentMerger.

Another added benefit is that it cleans up the DistributedUpdateProcessor a 
little. One modification I might want to make to the attached patch is to add 
a `doSet` and `doAdd` which would allow overrides of each specific merge type.

 Allow pluggable atomic update merging logic
 ---

 Key: SOLR-6909
 URL: https://issues.apache.org/jira/browse/SOLR-6909
 Project: Solr
  Issue Type: Improvement
Reporter: Steve Davids
 Fix For: 5.0, Trunk

 Attachments: SOLR-6909.patch


 Clients should be able to introduce their own specific merging logic by 
 implementing a new class that will be used by the DistributedUpdateProcessor. 
 This is particularly useful if you require a custom hook to interrogate the 
 incoming document with the document that is already resident in the index as 
 there isn't the ability to perform that operation nor can you currently 
 extend the DistributedUpdateProcessor to provide the modifications.






[jira] [Created] (SOLR-6909) Allow pluggable atomic update merging logic

2015-01-04 Thread Steve Davids (JIRA)
Steve Davids created SOLR-6909:
--

 Summary: Allow pluggable atomic update merging logic
 Key: SOLR-6909
 URL: https://issues.apache.org/jira/browse/SOLR-6909
 Project: Solr
  Issue Type: Improvement
Reporter: Steve Davids
 Fix For: 5.0, Trunk


Clients should be able to introduce their own specific merging logic by 
implementing a new class that will be used by the DistributedUpdateProcessor. 
This is particularly useful if you require a custom hook to interrogate the 
incoming document with the document that is already resident in the index as 
there isn't the ability to perform that operation nor can you currently extend 
the DistributedUpdateProcessor to provide the modifications.






[jira] [Updated] (SOLR-6909) Allow pluggable atomic update merging logic

2015-01-04 Thread Steve Davids (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Davids updated SOLR-6909:
---
Attachment: SOLR-6909.patch

Attached a patch which pulls the current merging implementation out of the 
DistributedUpdateProcessor into a new AtomicUpdateDocumentMerger class. The 
DistributedUpdateProcessorFactory instantiates a new AtomicUpdateDocumentMerger 
and passes it to the DistributedUpdateProcessor. This approach allows clients 
to extend the DistributedUpdateProcessorFactory and instantiate their own 
custom AtomicUpdateDocumentMerger, which is then passed along to the 
DistributedUpdateProcessor. One thing that I'm not thrilled about is the static 
'isAtomicUpdate' method (currently in the code); I tried to remove the static, 
but a couple of other classes require that static method to be there, and 
having a merger member variable didn't quite make sense in those cases, so I 
left it static.

 Allow pluggable atomic update merging logic
 ---

 Key: SOLR-6909
 URL: https://issues.apache.org/jira/browse/SOLR-6909
 Project: Solr
  Issue Type: Improvement
Reporter: Steve Davids
 Fix For: 5.0, Trunk

 Attachments: SOLR-6909.patch


 Clients should be able to introduce their own specific merging logic by 
 implementing a new class that will be used by the DistributedUpdateProcessor. 
 This is particularly useful if you require a custom hook to interrogate the 
 incoming document with the document that is already resident in the index as 
 there isn't the ability to perform that operation nor can you currently 
 extend the DistributedUpdateProcessor to provide the modifications.






[jira] [Commented] (SOLR-6735) CloneFieldUpdateProcessorFactory should be null safe

2015-01-03 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263734#comment-14263734
 ] 

Steve Davids commented on SOLR-6735:


Anyone willing to commit this? With the attached patch any null value is 
ignored; an alternative approach is to preserve the null by adding it to the 
destination field. Regardless of the approach, it should be null safe.
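The two strategies (ignore the null vs. preserve it) differ only in the copy 
loop. A minimal sketch of the ignore-nulls variant follows, using plain lists 
rather than Solr's field-value types; the class and method names are 
illustrative:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the null-safe clone discussed above: copy source field values to
// a destination field, silently skipping nulls (the behavior of the attached
// patch) instead of throwing a NullPointerException.
public class CloneFieldSketch {
    public static List<Object> cloneIgnoringNulls(List<Object> sourceValues) {
        List<Object> dest = new ArrayList<>();
        if (sourceValues == null) return dest; // whole field missing: no-op
        for (Object v : sourceValues) {
            if (v != null) dest.add(v); // skip null values rather than NPE
        }
        return dest;
    }
}
```

The preserve-null variant would simply add the null to `dest` instead of 
skipping it.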

 CloneFieldUpdateProcessorFactory should be null safe
 

 Key: SOLR-6735
 URL: https://issues.apache.org/jira/browse/SOLR-6735
 Project: Solr
  Issue Type: Bug
Reporter: Steve Davids
 Fix For: 5.0, Trunk

 Attachments: SOLR-6735.patch


 If a source field value is null the CloneFieldUpdateProcessor throws a null 
 pointer exception.






[jira] [Commented] (SOLR-4839) Jetty 9

2014-12-19 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254329#comment-14254329
 ] 

Steve Davids commented on SOLR-4839:


bq. Jetty 9 has this concept of modules which can be configured via property 
files and/or by xml. We could do away with XML configuration if we want.

I actually like the module concept but I'm not sure how much of that you are 
going to want to bundle in Solr itself (copying module files).

Let me know if you want a hand with any of the TODOs.

 Jetty 9
 ---

 Key: SOLR-4839
 URL: https://issues.apache.org/jira/browse/SOLR-4839
 Project: Solr
  Issue Type: Improvement
Reporter: Bill Bell
Assignee: Shalin Shekhar Mangar
 Fix For: 5.0, Trunk

 Attachments: SOLR-4839.patch, SOLR-4839.patch


 Implement Jetty 9






[jira] [Commented] (SOLR-4839) Jetty 9

2014-12-18 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14252919#comment-14252919
 ] 

Steve Davids commented on SOLR-4839:


The jetty.xml file is going to need to change:

 bq. Prior to Jetty 9, the type of the connector specified both the protocol 
and the implementation used (for example, selector-based non blocking I/O vs 
blocking I/O, or SSL connector vs non-SSL connector). Jetty 9 has only a 
selector-based non blocking I/O connector, and a collection of 
ConnectionFactories now configure the protocol on the connector.

http://www.eclipse.org/jetty/documentation/current/configuring-connectors.html#jetty-connectors

 Jetty 9
 ---

 Key: SOLR-4839
 URL: https://issues.apache.org/jira/browse/SOLR-4839
 Project: Solr
  Issue Type: Improvement
Reporter: Bill Bell
Assignee: Shalin Shekhar Mangar
 Fix For: 5.0, Trunk

 Attachments: SOLR-4839.patch


 Implement Jetty 9






Re: Querying locally before sending a distributed request

2014-12-10 Thread Steve Davids
bq. In a one-shard case, no query really needs to be forwarded, since any
replica can fully get the results so in this case no query would be
forwarded.

You can pass the request param distrib=false to not distribute the request
in that particular case at which point it will only gather results from
that particular host.

As for the SolrCloud example with n shards > 1, your overall search request
time is limited by the slowest shard's response time. So you would
potentially be saving one hop, but you are still making n-1 other hops to
gather all of the other shards' results, making it a moot point since you
will be waiting on the other shards to respond before you can return the
aggregated result list. You will then be on the hook to set up load
balancing across replicas of the one particular host you have chosen to
query, as Erick said, which could have some gotchas for people not expecting
that behavior.

-Steve

On Wed, Dec 10, 2014 at 9:26 AM, Erick Erickson erickerick...@gmail.com
wrote:

 Just skimming, but if I'm reading this right, your suggestion is
 that queries be served locally rather than being forwarded to
 another replica when possible.

 So let's take the one-shard case with N replicas to make sure
 I understand. In a one-shard case, no query really needs to
 be forwarded, since any replica can fully get the results so
 in this case no query would be forwarded.

 If this is a fair summary, then consider the situation where the
 outside world connects to a single server rather than to a
 fronting load balancer. Then only one shard would be doing
 any work

 Or am I off in the weeds?

 That aside, if I've gotten it wrong and you want to put
 up a patch (or even just outline a better approach),
 feel free to open a JIRA and attach a patch...

 Best,
 Erick

 On Tue, Dec 9, 2014 at 11:55 PM, S G sg.online.em...@gmail.com wrote:
  Hello Solr Devs,
 
  I am a developer using Solr and wanted to have some opinion on a
 performance
  change request.
 
  Currently, I see that code flow for a query in SolrCloud is as follows:
 
  For distributed query:
  SolrCore -> SearchHandler.handleRequestBody() -> HttpShardHandler.submit()
 
  For non-distributed query:
  SolrCore -> SearchHandler.handleRequestBody() -> QueryComponent.process()
 
 
  For a distributed query, the request is always sent to all the shards
 even
  if the originating SolrCore (handling the original distributed query) is
 a
  replica of one of the shards.
  If the original Solr-Core can check itself before sending http requests
 for
  any shard, we can probably save some network hopping and gain some
  performance.
 
  If this idea seems feasible, I can submit a JIRA ticket and work on it.
  I am planning to change SearchHandler.handleRequestBody() or
  HttpShardHandler.submit()
 
  Thanks
  SG
 





[jira] [Commented] (SOLR-6625) HttpClient callback in HttpSolrServer

2014-12-09 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14239541#comment-14239541
 ] 

Steve Davids commented on SOLR-6625:


bq. See SOLR-4470. That makes use of SolrRequest.getPreemptiveAuthentication, 
so you'd need the actual SolrRequest

I took a look at the patch but am not quite sure any of that is actually 
necessary. Looking at what is detailed in SOLR-4470, they need to be able to 
lock down SolrCloud via BasicAuth; you can easily do this via HttpClient's 
[BasicScheme| 
http://hc.apache.org/httpcomponents-client-ga/httpclient/apidocs/org/apache/http/impl/auth/BasicScheme.html]
 authentication scheme. Likewise you can see all of the various other 
[authentication schemes| 
http://hc.apache.org/httpcomponents-client-ga/httpclient/apidocs/org/apache/http/auth/AuthScheme.html]
 HttpClient supports (SPNego included). This would seem to do the trick, except 
perhaps for propagating the authentication from the original request, though 
this shouldn't really be a requirement: the challenge will be static in the web 
container, so you can live with having static credentials in the Solr 
distribution; if they change, deploy new configuration changes.

For further information on authentication via HttpClient check out their help 
page here: 
http://hc.apache.org/httpcomponents-client-ga/tutorial/html/authentication.html
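For reference, a preemptive Basic auth header of the kind BasicScheme produces 
is just a base64-encoded user:password pair, which is why static credentials in 
the container configuration suffice for this scheme. A stand-alone sketch using 
only the JDK (not HttpClient's actual API; the class name is illustrative):

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Builds the value of an "Authorization" header for HTTP Basic auth:
// "Basic " followed by base64(user:password).
public class BasicAuthSketch {
    public static String header(String user, String password) {
        String token = Base64.getEncoder()
                .encodeToString((user + ":" + password).getBytes(StandardCharsets.UTF_8));
        return "Basic " + token;
    }
}
```

For example, `header("Aladdin", "open sesame")` yields the well-known RFC 7617 
value `Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==`.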

bq. See the discussion above about SolrDispatchFilter. In that case, the client 
needs to get the context of the SolrQueryRequest... For my case, in the 
SolrDispatchFilter, I need to get some information from the SolrQueryRequest or 
HttpServletRequest (basically, the authenticated user that's available in the 
HttpServletRequest.getAttribute or SolrQueryRequest.getContext() or 
SolrQueryRequest.getParams())

Are you using these credentials to execute distributed requests? Or would it 
make sense to have a certain user hit the frontend shard, and then have that 
shard perform the distributed request using the system's authentication 
credentials?

 HttpClient callback in HttpSolrServer
 -

 Key: SOLR-6625
 URL: https://issues.apache.org/jira/browse/SOLR-6625
 Project: Solr
  Issue Type: Improvement
  Components: SolrJ
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Minor
 Attachments: SOLR-6625.patch, SOLR-6625.patch, SOLR-6625.patch, 
 SOLR-6625.patch, SOLR-6625.patch, SOLR-6625.patch


 Some of our setups use Solr in a SPNego/kerberos setup (we've done this by 
 adding our own filters to the web.xml).  We have an issue in that SPNego 
 requires a negotiation step, but some HttpSolrServer requests are not 
 repeatable, notably the PUT/POST requests.  So, what happens is, 
 HttpSolrServer sends the requests, the server responds with a negotiation 
 request, and the request fails because the request is not repeatable.  We've 
 modified our code to send a repeatable request beforehand in these cases.
 It would be nicer if HttpSolrServer provided a pre/post callback when it was 
 making an httpclient request.  This would allow administrators to make 
 changes to the request for authentication purposes, and would allow users to 
 make per-request changes to the httpclient calls (i.e. modify httpclient 
 requestconfig to modify the timeout on a per-request basis).






[jira] [Commented] (SOLR-6625) HttpClient callback in HttpSolrServer

2014-12-08 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14238865#comment-14238865
 ] 

Steve Davids commented on SOLR-6625:


Is there any reason why we wouldn't just utilize HttpClient's 
[HttpRequestInterceptor| 
http://hc.apache.org/httpcomponents-core-4.3.x/httpcore/apidocs/org/apache/http/HttpRequestInterceptor.html]
 and [HttpResponseInterceptor| 
http://hc.apache.org/httpcomponents-core-4.3.x/httpcore/apidocs/org/apache/http/HttpResponseInterceptor.html]?
 It seems as though providing an HttpClientFactory that clients can 
implement/override would give everyone enough hooks to customize HttpClient to 
their heart's delight.
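The factory-plus-interceptor idea can be modeled generically. The sketch below 
uses plain stdlib types rather than HttpClient's actual HttpRequestInterceptor 
interface, and all names are illustrative:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Generic model of the proposal: a factory that clients override to register
// request interceptors, giving a hook into every outgoing call the client
// makes (e.g. to attach auth headers or tweak per-request config).
public class InterceptingFactory {
    private final List<UnaryOperator<String>> requestInterceptors = new ArrayList<>();

    public InterceptingFactory addRequestInterceptor(UnaryOperator<String> i) {
        requestInterceptors.add(i);
        return this; // fluent, so interceptors chain naturally
    }

    /** Apply every registered interceptor, in order, before the request goes out. */
    public String prepare(String request) {
        String r = request;
        for (UnaryOperator<String> i : requestInterceptors) {
            r = i.apply(r);
        }
        return r;
    }
}
```

A response-side hook would mirror the same pattern on the way back in.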

 HttpClient callback in HttpSolrServer
 -

 Key: SOLR-6625
 URL: https://issues.apache.org/jira/browse/SOLR-6625
 Project: Solr
  Issue Type: Improvement
  Components: SolrJ
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Minor
 Attachments: SOLR-6625.patch, SOLR-6625.patch, SOLR-6625.patch, 
 SOLR-6625.patch, SOLR-6625.patch, SOLR-6625.patch


 Some of our setups use Solr in a SPNego/kerberos setup (we've done this by 
 adding our own filters to the web.xml).  We have an issue in that SPNego 
 requires a negotiation step, but some HttpSolrServer requests are not 
 repeatable, notably the PUT/POST requests.  So, what happens is, 
 HttpSolrServer sends the requests, the server responds with a negotiation 
 request, and the request fails because the request is not repeatable.  We've 
 modified our code to send a repeatable request beforehand in these cases.
 It would be nicer if HttpSolrServer provided a pre/post callback when it was 
 making an httpclient request.  This would allow administrators to make 
 changes to the request for authentication purposes, and would allow users to 
 make per-request changes to the httpclient calls (i.e. modify httpclient 
 requestconfig to modify the timeout on a per-request basis).






[jira] [Updated] (SOLR-6741) IPv6 Field Type

2014-12-01 Thread Steve Davids (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Davids updated SOLR-6741:
---
Attachment: SOLR-6741.patch

I attached a patch for IPv4 support which allows prefix queries, range queries, 
and CIDR notation, implemented as an extension of TrieLongField. Hopefully this 
can serve as a good starting point. [~lnr0626] was also a contributor to this 
code.
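The core of such a field type is packing an IPv4 address into a long and 
translating CIDR notation into inclusive range bounds, which a TrieLongField 
can then answer as an ordinary numeric range query. A stand-alone sketch of 
that arithmetic (not the attached patch; names are illustrative):

```java
// Packs an IPv4 address into a long and derives the inclusive [low, high]
// bounds implied by CIDR notation, so "a.b.c.d/prefix" becomes a plain
// numeric range query against the long-encoded field.
public class Ipv4Sketch {
    public static long toLong(String ip) {
        String[] octets = ip.split("\\.");
        long value = 0;
        for (String o : octets) {
            value = (value << 8) | Integer.parseInt(o); // big-endian packing
        }
        return value;
    }

    /** Returns {low, high} for "a.b.c.d/prefix". */
    public static long[] cidrRange(String cidr) {
        String[] parts = cidr.split("/");
        long base = toLong(parts[0]);
        int prefix = Integer.parseInt(parts[1]);
        // network mask: top `prefix` bits set within 32 bits
        long mask = prefix == 0 ? 0 : (0xFFFFFFFFL << (32 - prefix)) & 0xFFFFFFFFL;
        long low = base & mask;                  // network address
        long high = low | (~mask & 0xFFFFFFFFL); // broadcast address
        return new long[] { low, high };
    }
}
```

For example, `cidrRange("192.168.1.0/24")` spans the same values as the range 
from `toLong("192.168.1.0")` to `toLong("192.168.1.255")`.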

 IPv6 Field Type
 ---

 Key: SOLR-6741
 URL: https://issues.apache.org/jira/browse/SOLR-6741
 Project: Solr
  Issue Type: Improvement
Reporter: Lloyd Ramey
 Attachments: SOLR-6741.patch


 It would be nice if Solr had a field type which could be used to index IPv6 
 data and supported efficient range queries. 






Re: Where is the SVN repository only for Lucene project ?

2014-11-26 Thread Steve Davids
http://lucene.apache.org/core/developer.html

Sent from my iPhone

 On Nov 26, 2014, at 4:19 AM, Yosuke Yamatani 
 s151...@center.wakayama-u.ac.jp wrote:
 
 Dear sir/madam
 
 Hello, I’m Yosuke Yamatani.
 I’m a graduate student at Wakayama University, Japan.
 I study software evolution in OSS projects through the analysis of SVN
 repositories.
 I found the entire ASF repository, but I would like to mirror the SVN
 repository only for your project.
 Could you let me know how to get your repository ?
 
 Sincerely yours.
 Yosuke
 
 
 
 


[jira] [Created] (SOLR-6735) CloneFieldUpdateProcessorFactory should be null safe

2014-11-12 Thread Steve Davids (JIRA)
Steve Davids created SOLR-6735:
--

 Summary: CloneFieldUpdateProcessorFactory should be null safe
 Key: SOLR-6735
 URL: https://issues.apache.org/jira/browse/SOLR-6735
 Project: Solr
  Issue Type: Bug
Reporter: Steve Davids
 Fix For: 5.0, Trunk


If a source field value is null the CloneFieldUpdateProcessor throws a null 
pointer exception.






[jira] [Updated] (SOLR-6735) CloneFieldUpdateProcessorFactory should be null safe

2014-11-12 Thread Steve Davids (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Davids updated SOLR-6735:
---
Attachment: SOLR-6735.patch

Attached a trivial patch.

 CloneFieldUpdateProcessorFactory should be null safe
 

 Key: SOLR-6735
 URL: https://issues.apache.org/jira/browse/SOLR-6735
 Project: Solr
  Issue Type: Bug
Reporter: Steve Davids
 Fix For: 5.0, Trunk

 Attachments: SOLR-6735.patch


 If a source field value is null the CloneFieldUpdateProcessor throws a null 
 pointer exception.






[jira] [Commented] (SOLR-4587) Implement Saved Searches a la ElasticSearch Percolator

2014-11-09 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14204028#comment-14204028
 ] 

Steve Davids commented on SOLR-4587:


I believe we are confusing what Luwak *is*: Luwak is just an optimized 
matching algorithm, which really belongs in the Lucene package rather than the 
Solr package. Since this ticket is centered on Solr's implementation of the 
percolator, it deals more with the registration of queries and with providing 
an API to stream the ids of saved-search queries that matched a particular 
document back to the client. From a black-box perspective that external 
interface (Solr HTTP API) should be rather simple, though the internal workings 
could be marked as experimental and swapped out for better implementations in 
the future.

 Implement Saved Searches a la ElasticSearch Percolator
 --

 Key: SOLR-4587
 URL: https://issues.apache.org/jira/browse/SOLR-4587
 Project: Solr
  Issue Type: New Feature
  Components: SearchComponents - other, SolrCloud
Reporter: Otis Gospodnetic
 Fix For: Trunk


 Use Lucene MemoryIndex for this.






[jira] [Commented] (SOLR-4587) Implement Saved Searches a la ElasticSearch Percolator

2014-11-07 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14202936#comment-14202936
 ] 

Steve Davids commented on SOLR-4587:


I agree that the Luwak approach provides clever performance optimizations by 
removing unnecessary queries upfront. However, Luwak doesn't really provide 
percolator-like functionality on its own; it just provides an optimized 
matching algorithm. There is a decent amount of work here to allow clients to 
register queries in a Solr cluster and to provide an API that passes a document 
in and matches it against registered queries in a distributed manner, none of 
which is handled by Luwak. I personally believe this ticket can be implemented 
without Luwak's optimizations and still provide value. We could document a 
usage caveat that you might not want to register more than ~20k queries per 
shard, and that users who want to register more queries can shard out their 
profiling/matcher collection to take advantage of additional hardware. We can 
provide an initial implementation and then optimize the matching once the 
Luwak dependencies are completed; from an outside-in perspective the API would 
remain the same, matching would just be faster at a future point. 

Does it make sense to others to start with an initial approach then provide 
optimizations in future releases just as long as the API remains the same?

 Implement Saved Searches a la ElasticSearch Percolator
 --

 Key: SOLR-4587
 URL: https://issues.apache.org/jira/browse/SOLR-4587
 Project: Solr
  Issue Type: New Feature
  Components: SearchComponents - other, SolrCloud
Reporter: Otis Gospodnetic
 Fix For: Trunk


 Use Lucene MemoryIndex for this.






[jira] [Commented] (SOLR-4587) Implement Saved Searches a la ElasticSearch Percolator

2014-11-05 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14199868#comment-14199868
 ] 

Steve Davids commented on SOLR-4587:


I don't think Luwak is really an implementation of this particular feature. It 
does perform percolating functionality, but as a stand-alone library that isn't 
integrated into Solr. May I suggest that we take a stab at this without waiting 
around for Luwak, since its implementation depends on LUCENE-2878, which 
seems to keep stalling over and over again. The initial approach can take the 
naive loop across all queries for each document request, and at a later point 
the Luwak approach can be incorporated to provide some nice optimizations. Here 
are some initial thoughts on acceptance criteria / what can be done to 
incorporate this functionality into Solr:

# Able to register a query within a separate Solr core
#* Should take advantage of Solr's sharding ability in Solr Cloud
#* This can piggy-back off of the standard SolrInputDocument semantics with 
adding/deleting to perform query registration/deregistration.
#* Schema would define various fields for the stored query: q, fq, defType, etc.
# Able to specify which query parser should be used when matching docs 
(persisted w/ query)
# Able to specify the other core that the document should be profiled against 
(this can be at request time if you would like to profile against multiple 
shards)
#* Allows the profiling to know the fields, analysis chain, etc
# Should allow queries to be cached in RAM so they don't need to be re-parsed 
continually
# Custom response handler (perhaps a subclass of the search handler) should 
make a distributed request to all shards to gather all matching query profile 
ids and return to the client.

This is one of those features that would provide a lot of value to users, and it 
would be fantastic if we can get it incorporated sooner rather than later.
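As a very rough sketch of the naive loop described in the list above (illustrative only: registered queries are modeled as required-term sets; a real implementation would parse the stored q/fq with the configured query parser and match via a MemoryIndex):

```java
import java.util.*;

// Hedged sketch of naive percolation: every registered query is evaluated
// against each incoming document, and matching query ids are returned.
public class NaivePercolatorSketch {
    private final Map<String, Set<String>> registered = new LinkedHashMap<>();

    // Stand-in for registering a query document in the separate core.
    void register(String queryId, Set<String> requiredTerms) {
        registered.put(queryId, requiredTerms);
    }

    // Return ids of all saved searches whose terms all appear in the doc.
    List<String> percolate(Set<String> docTerms) {
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : registered.entrySet()) {
            if (docTerms.containsAll(e.getValue())) {
                matches.add(e.getKey());
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        NaivePercolatorSketch p = new NaivePercolatorSketch();
        p.register("q1", Set.of("solr", "percolator"));
        p.register("q2", Set.of("elasticsearch"));
        System.out.println(p.percolate(Set.of("solr", "percolator", "lucene")));
    }
}
```

The distributed piece would then just fan this loop out per shard and merge the id lists.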

 Implement Saved Searches a la ElasticSearch Percolator
 --

 Key: SOLR-4587
 URL: https://issues.apache.org/jira/browse/SOLR-4587
 Project: Solr
  Issue Type: New Feature
  Components: SearchComponents - other, SolrCloud
Reporter: Otis Gospodnetic
 Fix For: Trunk


 Use Lucene MemoryIndex for this.






[jira] [Commented] (SOLR-6449) Add first class support for Real Time Get in Solrj

2014-10-18 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14176051#comment-14176051
 ] 

Steve Davids commented on SOLR-6449:


This issue is really providing yet another convenience method to perform CRUD 
operations.

Create & Update Operations:
{code}
UpdateResponse resp = solrServer.add(SolrInputDocument doc);
UpdateResponse resp = solrServer.add(Collection<SolrInputDocument> docs);
//+ a couple variants
{code}

These methods don't necessarily need to align with the various REST HTTP method 
semantics. The add action performs a create or update, clobbering the document 
already in place, with the ability to perform an atomic update operation that 
merges with the document already in the index.

Delete Operations:
{code}
UpdateResponse resp = solrServer.deleteById(String id);
UpdateResponse resp = solrServer.deleteById(Collection<String> ids);
UpdateResponse resp = solrServer.deleteByQuery(String query);
//+ a couple variants
{code}

Read Operations:
{code}
QueryResponse resp = solrServer.query(SolrParams);
//+ a couple variants
{code}

As you can see, the delete operation allows you to delete by a specific id or 
by a query, whereas retrieval only gives you query access. To be consistent, 
this ticket should provide the ability to retrieve by id as a convenience to 
developers using the SolrJ API (not to mention the additional benefits they 
will get from the RealTimeGetHandler).

 Add first class support for Real Time Get in Solrj
 --

 Key: SOLR-6449
 URL: https://issues.apache.org/jira/browse/SOLR-6449
 Project: Solr
  Issue Type: Improvement
  Components: clients - java
Reporter: Shalin Shekhar Mangar
  Labels: difficulty-medium, impact-medium
 Fix For: 5.0


 Any request handler can be queried by Solrj using a custom param map and the 
 qt parameter but I think /get should get first-class support in the java 
 client.






[jira] [Commented] (SOLR-6449) Add first class support for Real Time Get in Solrj

2014-10-17 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14175587#comment-14175587
 ] 

Steve Davids commented on SOLR-6449:


With the current way of accessing Real Time Get in SolrJ, you need to do 
something along the lines of:

{code}
SolrDocument doc = (SolrDocument) solrServer.query(params("qt", "/get", "id", id)).getResponse().get("doc");
{code}

It would be convenient to provide a native method of (or similar variant):
{code}
SolrDocument doc = solrServer.get(id);
{code}
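As a hedged sketch of what such a convenience wrapper could look like (the tiny interface below is a stand-in for the real SolrServer API; every name here is illustrative, not the actual SolrJ signature):

```java
import java.util.*;

// Hedged sketch: a first-class get(id) that simply delegates to the
// existing query path with qt=/get and unwraps the "doc" entry.
public class RealTimeGetSketch {
    interface MiniSolrServer {
        Map<String, Object> query(Map<String, String> params);

        // Proposed convenience helper wrapping the qt=/get request.
        default Object get(String id) {
            Map<String, String> params = new HashMap<>();
            params.put("qt", "/get");
            params.put("id", id);
            return query(params).get("doc");
        }
    }

    public static void main(String[] args) {
        // Stub "server" that echoes the requested id as the stored document.
        MiniSolrServer server = params -> Map.of("doc", "doc-" + params.get("id"));
        System.out.println(server.get("42"));
    }
}
```

The caller's one-liner becomes `server.get(id)` while the wire request stays exactly the same.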

 Add first class support for Real Time Get in Solrj
 --

 Key: SOLR-6449
 URL: https://issues.apache.org/jira/browse/SOLR-6449
 Project: Solr
  Issue Type: Improvement
  Components: clients - java
Reporter: Shalin Shekhar Mangar
  Labels: difficulty-medium, impact-medium
 Fix For: 5.0


 Any request handler can be queried by Solrj using a custom param map and the 
 qt parameter but I think /get should get first-class support in the java 
 client.






Re: Highlighters, accurate highlighting, and the PostingsHighlighter

2014-10-09 Thread Steve Davids
This is music to my ears; I came across the same highlighter issues and had a 
little discussion here: 

http://lucene.markmail.org/thread/ppunujq3hjjzq3z7#query:+page:1+mid:b6jweck6b6m2k4n4+state:results

Unfortunately I didn’t make much progress on it.

-Steve

On Oct 10, 2014, at 12:38 AM, david.w.smi...@gmail.com wrote:

 I’m working on making highlighting both accurate and fast.  By “accurate”, I 
 mean the highlights need to accurately reflect a match given the query and 
 various possible query types (to include SpanQueries and MultiTermQueries and 
 obviously phrase queries and the usual suspects).  The fastest highlighter 
 we’ve got in Lucene is the PostingsHighlighter but it throws out any 
 positional nature in the query and can highlight more inaccurately than the 
 other two highlighters. The most accurate is the default highlighter, 
 although I can see some simplifications it makes that could lead to 
 inaccuracies.
 
 The default highlighter’s “WeightedSpanTermExtractor” is interesting — it 
 uses a MemoryIndex built from re-analyzing the text, and it executes the 
 query against this mini index; kind of.  A recent experiment I did was to 
 have the MemoryIndex essentially wrap the “Terms” from term vectors.  It 
 works and saves memory, although, at least for large docs (which I’m 
 optimizing for) the real performance hit is in un-inverting the TokenStream 
 in TokenSources to include sorting the thousands of tokens -- assuming you 
 index term vectors of course.  But with my attention now on the 
 PostingsHighlighter (because it’s the fastest and offsets are way cheaper 
 than term vectors), I believe WeightedSpanTermExtractor could simply use 
 Lucene’s actual IndexReader — no?  It seems so obvious to me now I wonder why 
 it wasn’t done this way in the first place — all WSTE has to do is advance() 
 to the document being highlighted for applicable terms.  Am I overlooking 
 something?
 
 WeightedSpanTermExtractor is somewhat accurate but my reading of its source 
 shows it takes short-cuts I’d like to eliminate.  For example, if the query is 
 “(A && B) || (C && D)” and the document doesn’t have ‘D’, then it should 
 ideally NOT highlight ‘C’ in this document, just ‘A’ and ‘B’.  I think I can 
 solve that using Scorers.getChildScorers to see which scorers (and thus 
 queries) actually matched.  Another example is that it views SpanQueries at 
 the top level only and records the entire span for all terms it is comprised 
 of.  So if you had a couple Phrase SpanQueries (actually ordered 0-slop 
 SpanNearQueries) joined by a SpanNearQuery to be within ~50 positions of each 
 other, I believe it would highlight any other occurrence of the words 
 involved in-between the sub-SpanQueries. This looks hard to solve but I think 
 for starters, SpanScorer needs a getter for the Spans instance, and 
 furthermore Spans needs getChildSpans() just as Scorers expose child scorers. 
  I could see myself relaxing this requirement because of its complexity and 
  simply highlighting the entire span, even if it could be a big highlight.
 
 Perhaps the “Nuke Spans” effort might make this all much easier but I haven’t 
 looked yet because that’s still not done yet.  It’s encouraging to see Alan 
 making recent progress there.
 
 Any thoughts about any of this, guys?
 
 p.s. When I’m done, I expect to have no problem getting open-source 
 permission from the sponsor commissioning this effort.
 
 ~ David Smiley
 Freelance Apache Lucene/Solr Search Consultant/Developer
 http://www.linkedin.com/in/davidwsmiley



[jira] [Commented] (SOLR-5986) Don't allow runaway queries from harming Solr cluster health or search performance

2014-10-04 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14159212#comment-14159212
 ] 

Steve Davids commented on SOLR-5986:


Why wouldn't it return partial results? When sending a distributed request, if 
all but one shard return results and one shard lags behind at query expansion, 
one would think that you would get the appropriate partial-results message. 
Unless this is partially related to SOLR-6496, which would retry a different 
replica in the shard group and thus *could* cause a timeout at the Solr 
distributed aggregation layer.

 Don't allow runaway queries from harming Solr cluster health or search 
 performance
 --

 Key: SOLR-5986
 URL: https://issues.apache.org/jira/browse/SOLR-5986
 Project: Solr
  Issue Type: Improvement
  Components: search
Reporter: Steve Davids
Assignee: Anshum Gupta
Priority: Critical
 Fix For: 5.0

 Attachments: SOLR-5986.patch, SOLR-5986.patch, SOLR-5986.patch, 
 SOLR-5986.patch, SOLR-5986.patch, SOLR-5986.patch, SOLR-5986.patch, 
 SOLR-5986.patch, SOLR-5986.patch, SOLR-5986.patch, SOLR-5986.patch, 
 SOLR-5986.patch


 The intent of this ticket is to have all distributed search requests stop 
 wasting CPU cycles on requests that have already timed out or are so 
 complicated that they won't be able to execute. We have come across a case 
 where a nasty wildcard query within a proximity clause was causing the 
 cluster to enumerate terms for hours even though the query timeout was set to 
 minutes. This caused a noticeable slowdown within the system which made us 
 restart the replicas that happened to service that one request. The worst-case 
 scenario is that users with a relatively low zk timeout value will have nodes 
 start dropping from the cluster due to long GC pauses.
 [~amccurry] Built a mechanism into Apache Blur to help with the issue in 
 BLUR-142 (see commit comment for code, though look at the latest code on the 
 trunk for newer bug fixes).
 Solr should be able to either prevent these problematic queries from running 
 by some heuristic (possibly estimated size of heap usage) or be able to 
 execute a thread interrupt on all query threads once the time threshold is 
 met. This issue mirrors what others have discussed on the mailing list: 
 http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200903.mbox/%3c856ac15f0903272054q2dbdbd19kea3c5ba9e105b...@mail.gmail.com%3E






[jira] [Commented] (SOLR-6576) ModifiableSolrParams#add(SolrParams) is performing a set operation instead of an add

2014-10-01 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14155314#comment-14155314
 ] 

Steve Davids commented on SOLR-6576:


Oops, I glanced at the code too quickly last night and saw the deprecation-
suppression annotation:

{code}
@SuppressWarnings({"deprecation"})
public static SolrParams wrapAppended(SolrParams params, SolrParams defaults) {
{code}

which I mistook as @deprecated for some reason, my bad.

 ModifiableSolrParams#add(SolrParams) is performing a set operation instead of 
 an add
 

 Key: SOLR-6576
 URL: https://issues.apache.org/jira/browse/SOLR-6576
 Project: Solr
  Issue Type: Bug
Reporter: Steve Davids
 Fix For: 5.0, Trunk

 Attachments: SOLR-6576.patch


 Came across this bug by attempting to append multiple ModifiableSolrParam 
 objects together but found the last one was clobbering the previously set 
 values. The add operation should append the values to the previously defined 
 values, not perform a set operation.






Solr atomic updates: default values

2014-10-01 Thread Steve Davids
I was curious if it would make sense to provide a "default" command for
atomic updates. Here is the use-case I am looking at... I want to keep
track of the "indexed-on" date, which is a snapshot in time of when that
particular document was indexed for the first time. Currently I have that
value stored, and the value is set to NOW as the default value in the
schema. Now, I want to actually set this value in the update chain prior to
the distributed index request so all replicas obtain the exact same value.
Unfortunately there isn't a way to specify "use this new NOW date *only*
if the value hasn't been indexed prior", so I was thinking this could be
handled by a "default" atomic-update key that would only apply the value
if the document being merged doesn't already have a value specified. In
addition to the validity of that thought, I was wondering if people would
find it beneficial to allow sub-classing of the
DistributedUpdateProcessorFactory (currently not possible due to inner
classes) or, at a minimum, to allow clients to specify their own merge-logic
implementation if they see fit.

I would be happy to provide some patches if people think these are
reasonable use-cases.

Thanks,

-Steve
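As a rough sketch of the proposed "default" semantics (illustrative only; the field names are examples, and the real merge would happen inside the distributed update processor), this is essentially putIfAbsent applied during the atomic-update merge:

```java
import java.util.*;

// Hedged sketch: a "default" atomic-update command applies the supplied
// value only when the already-indexed document has no value for the field.
public class DefaultAtomicUpdateSketch {
    // Merge a {"default": value} update into the existing document's fields.
    static void applyDefault(Map<String, Object> existingDoc, String field, Object value) {
        existingDoc.putIfAbsent(field, value);
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("id", "1");
        applyDefault(doc, "indexed_on", "2014-10-01T00:00:00Z"); // first index: applied
        applyDefault(doc, "indexed_on", "2014-10-02T00:00:00Z"); // re-index: ignored
        System.out.println(doc.get("indexed_on"));
    }
}
```

Because the merge runs before the distributed request, all replicas would see the same first-indexed value.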


Re: Solr atomic updates: default values

2014-10-01 Thread Steve Davids
Quick clarification: when I say you can't sub-class the
DistributedUpdateProcessor, that is in the context of attempting to override
the default merging functionality in the getUpdatedDocument method (which
is package-private and not overridable from a solr-plugins classpath
loader); code along the way also uses the private constructor of the
inner RequestReplicationTracker class, which is required by a call to the
SolrCmdDistributor.

-Steve

On Wed, Oct 1, 2014 at 5:51 PM, Steve Davids sdav...@gmail.com wrote:

 I was curious if it would make sense to provide a "default" command for
 atomic updates. Here is the use-case I am looking at... I want to keep
 track of the "indexed-on" date, which is a snapshot in time of when that
 particular document was indexed for the first time. Currently I have that
 value stored, and the value is set to NOW as the default value in the
 schema. Now, I want to actually set this value in the update chain prior to
 the distributed index request so all replicas obtain the exact same value.
 Unfortunately there isn't a way to specify "use this new NOW date *only*
 if the value hasn't been indexed prior", so I was thinking this could be
 handled by a "default" atomic-update key that would only apply the value
 if the document being merged doesn't already have a value specified. In
 addition to the validity of that thought, I was wondering if people would
 find it beneficial to allow sub-classing of the
 DistributedUpdateProcessorFactory (currently not possible due to inner
 classes) or, at a minimum, to allow clients to specify their own merge-logic
 implementation if they see fit.

 I would be happy to provide some patches if people think these are
 reasonable use-cases.

 Thanks,

 -Steve



[jira] [Created] (SOLR-6576) ModifiableSolrParams#add(SolrParams) is performing a set operation instead of an add

2014-09-30 Thread Steve Davids (JIRA)
Steve Davids created SOLR-6576:
--

 Summary: ModifiableSolrParams#add(SolrParams) is performing a set 
operation instead of an add
 Key: SOLR-6576
 URL: https://issues.apache.org/jira/browse/SOLR-6576
 Project: Solr
  Issue Type: Bug
Reporter: Steve Davids
 Fix For: 5.0, Trunk


Came across this bug by attempting to append multiple ModifiableSolrParam 
objects together but found the last one was clobbering the previously set 
values. The add operation should append the values to the previously defined 
values, not perform a set operation.






[jira] [Updated] (SOLR-6576) ModifiableSolrParams#add(SolrParams) is performing a set operation instead of an add

2014-09-30 Thread Steve Davids (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Davids updated SOLR-6576:
---
Attachment: SOLR-6576.patch

Fix + tests added in attached patch.

 ModifiableSolrParams#add(SolrParams) is performing a set operation instead of 
 an add
 

 Key: SOLR-6576
 URL: https://issues.apache.org/jira/browse/SOLR-6576
 Project: Solr
  Issue Type: Bug
Reporter: Steve Davids
 Fix For: 5.0, Trunk

 Attachments: SOLR-6576.patch


 Came across this bug by attempting to append multiple ModifiableSolrParam 
 objects together but found the last one was clobbering the previously set 
 values. The add operation should append the values to the previously defined 
 values, not perform a set operation.






[jira] [Commented] (SOLR-6576) ModifiableSolrParams#add(SolrParams) is performing a set operation instead of an add

2014-09-30 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14154316#comment-14154316
 ] 

Steve Davids commented on SOLR-6576:


Yeah, this is a bit misleading, as ModifiableSolrParams.add(String name, 
String... val) says:

bq. Add the given values to any existing name

The behavior of that particular method works as expected, and I would likewise 
assume that the add for SolrParams would work the same way. Otherwise it would 
be like having a map with two methods, put(K key, V value) and 
putAll(Map<? extends K, ? extends V> m), that did two completely different things.

So in my head I would think the method for the current functionality would 
mimic the set capability:

bq. Replace any existing parameter with the given name.

and should be named appropriately. Also, SolrParams.wrapAppended(SolrParams) is 
deprecated, so that isn't very reassuring to use :)
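To make the expected contrast concrete, here is a hedged sketch using plain multimaps in place of ModifiableSolrParams (method and variable names are illustrative only): add appends to any existing values, while set replaces them:

```java
import java.util.*;

// Hedged sketch of the two param-merging semantics under discussion.
public class ParamsSemanticsSketch {
    // "add" semantics: append the given values to any existing name.
    static void add(Map<String, List<String>> params, String name, String... vals) {
        params.computeIfAbsent(name, k -> new ArrayList<>()).addAll(Arrays.asList(vals));
    }

    // "set" semantics: replace any existing parameter with the given name.
    static void set(Map<String, List<String>> params, String name, String... vals) {
        params.put(name, new ArrayList<>(Arrays.asList(vals)));
    }

    public static void main(String[] args) {
        Map<String, List<String>> params = new HashMap<>();
        add(params, "fq", "type:book");
        add(params, "fq", "inStock:true"); // appends: two fq values remain
        System.out.println(params.get("fq"));
        set(params, "fq", "type:dvd");     // replaces: one fq value remains
        System.out.println(params.get("fq"));
    }
}
```

An add(SolrParams) that behaves like the second method is the clobbering this ticket reports.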

 ModifiableSolrParams#add(SolrParams) is performing a set operation instead of 
 an add
 

 Key: SOLR-6576
 URL: https://issues.apache.org/jira/browse/SOLR-6576
 Project: Solr
  Issue Type: Bug
Reporter: Steve Davids
 Fix For: 5.0, Trunk

 Attachments: SOLR-6576.patch


 Came across this bug by attempting to append multiple ModifiableSolrParam 
 objects together but found the last one was clobbering the previously set 
 values. The add operation should append the values to the previously defined 
 values, not perform a set operation.






[jira] [Updated] (SOLR-6496) LBHttpSolrServer should stop server retries after the timeAllowed threshold is met

2014-09-20 Thread Steve Davids (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Davids updated SOLR-6496:
---
Attachment: SOLR-6496.patch

Updated the patch to use nano time. I'm still thinking of potential tricks to 
*not* use mocks: are there any tests out there that screw with the jetty 
server to make the socket connection arbitrarily long, then somehow throw an 
exception and verify that the next request isn't made?

On the mocking front I would do something like (note: redundant static 
Mockito.* accessors only shown for demonstrative purposes):
{code}
  @Test
  public void testNoRetryOnTimeout() throws Exception {
    LBHttpSolrServer testServer = Mockito.spy(new 
LBHttpSolrServer("http://test1", "http://test2"));
    Mockito.doAnswer(new Answer<Exception>() {
      @Override
      public Exception answer(InvocationOnMock invocation) throws Throwable {
        Thread.sleep(1);
        return new SolrServerException("Mock error.");
      }}).when(testServer).doRequest(Mockito.any(HttpSolrServer.class), 
Mockito.any(Req.class), Mockito.any(Rsp.class), Mockito.anyBoolean(), 
Mockito.anyBoolean(), Mockito.anyString());
    testServer.query(SolrTestCaseJ4.params(CommonParams.Q, "test", 
CommonParams.TIME_ALLOWED, "1"));
    Mockito.verify(testServer, 
Mockito.times(1)).doRequest(Mockito.any(HttpSolrServer.class), 
Mockito.any(Req.class), Mockito.any(Rsp.class), Mockito.anyBoolean(), 
Mockito.anyBoolean(), Mockito.anyString());
  }
{code}

This test actually surfaced some strange behavior: there are multiple 
implemented methods trying to do the same thing. See LBHttpSolrServer's:
# public Rsp request(Req req) throws SolrServerException, IOException
# public NamedList<Object> request(final SolrRequest request) throws 
SolrServerException, IOException

So, depending on whether you are using the CloudSolrServer or the 
HttpShardHandlerFactory, you are going to get different request behavior. We 
should probably try to refactor this code to provide consistent behavior, 
perhaps in a different ticket. 
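As a hedged sketch of the nano-time deadline guard the patch is after (names and structure are illustrative, not the patch itself): capture a deadline from timeAllowed before the first attempt, and stop retrying further servers once it has passed.

```java
import java.util.concurrent.TimeUnit;

// Hedged sketch: a retry loop that honors a timeAllowed budget via a
// System.nanoTime() deadline instead of blindly trying every server.
public class RetryDeadlineSketch {
    static int attemptWithDeadline(int serverCount, long timeAllowedMs, long costPerTryMs)
            throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeAllowedMs);
        int attempts = 0;
        for (int i = 0; i < serverCount; i++) {
            if (System.nanoTime() >= deadline) {
                break; // timeAllowed exhausted: no further retries
            }
            attempts++;
            Thread.sleep(costPerTryMs); // stand-in for a failed request to one server
        }
        return attempts;
    }

    public static void main(String[] args) throws InterruptedException {
        // With a 50 ms budget and ~40 ms per failed try, we never reach all 5 servers.
        System.out.println(attemptWithDeadline(5, 50, 40) < 5);
    }
}
```

Using nanoTime rather than wall-clock time keeps the deadline immune to system clock adjustments.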

 LBHttpSolrServer should stop server retries after the timeAllowed threshold 
 is met
 --

 Key: SOLR-6496
 URL: https://issues.apache.org/jira/browse/SOLR-6496
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.9
Reporter: Steve Davids
Assignee: Anshum Gupta
Priority: Critical
 Fix For: 5.0

 Attachments: SOLR-6496.patch, SOLR-6496.patch, SOLR-6496.patch


 The LBHttpSolrServer will continue to perform retries for each server it was 
 given without honoring the timeAllowed request parameter. Once the threshold 
 has been met, you should no longer perform retries and allow the exception to 
 bubble up and allow the request to either error out or return partial results 
 per the shards.tolerant request parameter.
 For a little more context on how this is can be extremely problematic please 
 see the comment here: 
 https://issues.apache.org/jira/browse/SOLR-5986?focusedCommentId=14100991&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14100991
  (#2)






[jira] [Commented] (SOLR-5986) Don't allow runaway queries from harming Solr cluster health or search performance

2014-09-19 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14141216#comment-14141216
 ] 

Steve Davids commented on SOLR-5986:


Looks good to me; the only nit-picky thing I would say is that QueryTimeoutBase 
is a strange name for an interface. You may consider renaming it to 
QueryTimeout and renaming the current QueryTimeout class to something along the 
lines of LuceneQueryTimeout / DefaultQueryTimeout / SimpleQueryTimeout. 

 Don't allow runaway queries from harming Solr cluster health or search 
 performance
 --

 Key: SOLR-5986
 URL: https://issues.apache.org/jira/browse/SOLR-5986
 Project: Solr
  Issue Type: Improvement
  Components: search
Reporter: Steve Davids
Assignee: Anshum Gupta
Priority: Critical
 Fix For: 4.10

 Attachments: SOLR-5986.patch, SOLR-5986.patch, SOLR-5986.patch, 
 SOLR-5986.patch, SOLR-5986.patch, SOLR-5986.patch, SOLR-5986.patch, 
 SOLR-5986.patch, SOLR-5986.patch, SOLR-5986.patch


 The intent of this ticket is to have all distributed search requests stop 
 wasting CPU cycles on requests that have already timed out or are so 
 complicated that they won't be able to execute. We have come across a case 
 where a nasty wildcard query within a proximity clause was causing the 
 cluster to enumerate terms for hours even though the query timeout was set to 
 minutes. This caused a noticeable slowdown within the system which made us 
 restart the replicas that happened to service that one request. The worst-case 
 scenario is that users with a relatively low zk timeout value will have nodes 
 start dropping from the cluster due to long GC pauses.
 [~amccurry] Built a mechanism into Apache Blur to help with the issue in 
 BLUR-142 (see commit comment for code, though look at the latest code on the 
 trunk for newer bug fixes).
 Solr should be able to either prevent these problematic queries from running 
 by some heuristic (possibly estimated size of heap usage) or be able to 
 execute a thread interrupt on all query threads once the time threshold is 
 met. This issue mirrors what others have discussed on the mailing list: 
 http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200903.mbox/%3c856ac15f0903272054q2dbdbd19kea3c5ba9e105b...@mail.gmail.com%3E






Re: Review Request 25658: Timeout queries when they take too long to rewrite/enumerate over terms.

2014-09-16 Thread Steve Davids

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25658/#review53648
---



trunk/lucene/core/src/java/org/apache/lucene/index/QueryTimeout.java
https://reviews.apache.org/r/25658/#comment93367

You can provide the shouldExit implementation in the abstract class if you 
make getTimeoutAt() abstract.
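A hedged sketch of that suggestion (stand-in class names, not the actual patch): the abstract base supplies shouldExit() once, and concrete timeouts only implement getTimeoutAt():

```java
// Hedged sketch of the review comment: shared shouldExit() in the base
// class, with only the deadline accessor left abstract.
public class QueryTimeoutSketch {
    abstract static class AbstractQueryTimeout {
        // Absolute deadline in nanos; the only method subclasses must provide.
        abstract long getTimeoutAt();

        // Shared implementation, no longer duplicated per subclass.
        final boolean shouldExit() {
            return getTimeoutAt() - System.nanoTime() < 0;
        }
    }

    static class FixedDeadlineTimeout extends AbstractQueryTimeout {
        private final long timeoutAt;
        FixedDeadlineTimeout(long timeoutAt) { this.timeoutAt = timeoutAt; }
        @Override long getTimeoutAt() { return timeoutAt; }
    }

    public static void main(String[] args) {
        AbstractQueryTimeout expired = new FixedDeadlineTimeout(System.nanoTime() - 1);
        AbstractQueryTimeout future = new FixedDeadlineTimeout(System.nanoTime() + 1_000_000_000L);
        System.out.println(expired.shouldExit() + " " + future.shouldExit());
    }
}
```

The subtraction-based comparison also stays correct if nanoTime wraps around.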



trunk/solr/core/src/java/org/apache/solr/handler/component/SearchHandler.java
https://reviews.apache.org/r/25658/#comment93366

Is the reset necessary? Would it make sense to just start the 
SolrQueryTimeout clock right when a request is being serviced and let it run 
until the ThreadLocal is eventually destroyed?



trunk/solr/core/src/java/org/apache/solr/search/SolrQueryTimeout.java
https://reviews.apache.org/r/25658/#comment93365

You may want to consider providing a little more detail in the comments 
that timeOutAt is the time in the future in nanos. Would it make sense to just 
pass the timeout offset here and have it calculate the future time within the 
set method? I.e., pass in 1000ms instead of current time + offset. Another 
alternative is to provide a date/calendar object in the future (may be a bit 
overkill, but then you don't need to think twice about whether to pass the 
time in millis or nanos). (This also applies to QueryTimeout.)
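A sketch of the offset-based alternative described above (hypothetical names; nothing here is the actual SolrQueryTimeout code): the setter takes a relative timeout in millis and computes the absolute nano deadline internally, so callers never juggle millis vs. nanos.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: callers pass a relative timeout (e.g. 1000 ms);
// the absolute nano deadline is computed inside set(), and reset()
// clears the per-thread state.
public class TimeoutOffsetSketch {
    private static final ThreadLocal<Long> TIMEOUT_AT_NANOS = new ThreadLocal<>();

    static void set(long timeoutMillis) {
        TIMEOUT_AT_NANOS.set(System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis));
    }

    static boolean shouldExit() {
        Long at = TIMEOUT_AT_NANOS.get();
        return at != null && System.nanoTime() >= at;
    }

    static void reset() {
        TIMEOUT_AT_NANOS.remove();
    }

    public static void main(String[] args) {
        set(-1);                          // deadline already in the past
        System.out.println(shouldExit()); // exceeded
        reset();
        System.out.println(shouldExit()); // no deadline configured
    }
}
```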


- Steve Davids


On Sept. 17, 2014, 2:32 a.m., Anshum Gupta wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/25658/
 ---
 
 (Updated Sept. 17, 2014, 2:32 a.m.)
 
 
 Review request for lucene.
 
 
 Bugs: SOLR-5986
 https://issues.apache.org/jira/browse/SOLR-5986
 
 
 Repository: lucene
 
 
 Description
 ---
 
 Timeout queries when they take too long to rewrite/enumerate over terms.
 
 
 Diffs
 -
 
   
 trunk/lucene/core/src/java/org/apache/lucene/index/ExitableDirectoryReader.java
  PRE-CREATION 
   trunk/lucene/core/src/java/org/apache/lucene/index/QueryTimeout.java 
 PRE-CREATION 
   trunk/lucene/core/src/java/org/apache/lucene/index/QueryTimeoutBase.java 
 PRE-CREATION 
   
 trunk/lucene/core/src/test/org/apache/lucene/index/TestExitableDirectoryReader.java
  PRE-CREATION 
   trunk/solr/core/src/java/org/apache/solr/handler/MoreLikeThisHandler.java 
 1625349 
   
 trunk/solr/core/src/java/org/apache/solr/handler/component/SearchHandler.java 
 1625349 
   trunk/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java 
 1625349 
   trunk/solr/core/src/java/org/apache/solr/search/SolrQueryTimeout.java 
 PRE-CREATION 
   trunk/solr/core/src/test/org/apache/solr/TestDistributedSearch.java 1625349 
   trunk/solr/core/src/test/org/apache/solr/TestGroupingSearch.java 1625349 
   
 trunk/solr/core/src/test/org/apache/solr/core/ExitableDirectoryReaderTest.java
  PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/25658/diff/
 
 
 Testing
 ---
 
 Added Lucene/Solr tests. Tested a bit manually.
 
 
 Thanks,
 
 Anshum Gupta
 




[jira] [Updated] (LUCENE-3120) span query matches too many docs when two query terms are the same unless inOrder=true

2014-09-16 Thread Steve Davids (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Davids updated LUCENE-3120:
-
Attachment: LUCENE-3120.patch

A user came across this odd behavior; I attached a simple test case, written 
before I came across this ticket, which demonstrates the discrepancy.

 span query matches too many docs when two query terms are the same unless 
 inOrder=true
 --

 Key: LUCENE-3120
 URL: https://issues.apache.org/jira/browse/LUCENE-3120
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/search
Reporter: Doron Cohen
Priority: Minor
 Fix For: 4.9, 5.0

 Attachments: LUCENE-3120.patch, LUCENE-3120.patch, LUCENE-3120.patch


 spinoff of user list discussion - [SpanNearQuery - inOrder 
 parameter|http://markmail.org/message/i4cstlwgjmlcfwlc].
 With 3 documents:
 *  a b x c d
 *  a b b d
 *  a b x b y d
 Here are a few queries (the number in parenthesis indicates expected #hits):
 These ones work *as expected*:
 * (1)  in-order, slop=0, b, x, b
 * (1)  in-order, slop=0, b, b
 * (2)  in-order, slop=1, b, b
 These ones match *too many* hits:
 * (1)  any-order, slop=0, b, x, b
 * (1)  any-order, slop=1, b, x, b
 * (1)  any-order, slop=2, b, x, b
 * (1)  any-order, slop=3, b, x, b
 These ones match *too many* hits as well:
 * (1)  any-order, slop=0, b, b
 * (2)  any-order, slop=1, b, b
 Each of the above passes when using a phrase query (applying the slop, no 
 in-order indication in phrase query).
 This seems related to a known overlapping spans issue - [non-overlapping Span 
 queries|http://markmail.org/message/7jxn5eysjagjwlon] - as indicated by Hoss, 
 so we might decide to close this bug after all, but I would like to at least 
 have the junit that exposes the behavior in JIRA.






Re: Review Request 25658: Timeout queries when they take too long to rewrite/enumerate over terms.

2014-09-16 Thread Steve Davids


 On Sept. 17, 2014, 3:44 a.m., Steve Davids wrote:
  trunk/lucene/core/src/java/org/apache/lucene/index/QueryTimeout.java, line 
  56
  https://reviews.apache.org/r/25658/diff/4/?file=691686#file691686line56
 
  You can provide the shouldExit implementation in the abstract class if 
  you make the getTimeoutAt() abstract.
 
 Anshum Gupta wrote:
 shouldExit() is more intuitive for users to implement when they want 
 their own custom logic for exiting the queries.

Wouldn't it make sense to just go the interface route then? Not sure what the 
abstract class is doing for you at this point.


 On Sept. 17, 2014, 3:44 a.m., Steve Davids wrote:
  trunk/solr/core/src/java/org/apache/solr/handler/component/SearchHandler.java,
   line 258
  https://reviews.apache.org/r/25658/diff/4/?file=691690#file691690line258
 
  Is the reset necessary? Would it make sense to just start the 
  SolrQueryTimeout clock right when a request is being serviced and let it 
  run until the ThreadLocal is eventually destroyed?
 
 Anshum Gupta wrote:
 the reset is necessary for the explicit removal of the ThreadLocal value. 
 ThreadLocal doesn't clear itself implicitly up after the lifecycle of the 
 thread and so the reset is more than necessary.

Brain fart - yes threads can be pooled within the container and may service 
multiple requests over the thread life cycle.
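For readers unfamiliar with the pitfall: because containers pool threads, per-request ThreadLocal state must be cleared explicitly, typically with a set/try/finally-remove pattern. A generic sketch (not Solr's actual code):

```java
// Illustrative pattern for pooled request threads: set the per-request
// ThreadLocal state on entry and always remove it in finally, since the
// container reuses the same thread for later requests.
public class RequestScope {
    private static final ThreadLocal<String> REQUEST_ID = new ThreadLocal<>();

    static String handle(String requestId) {
        REQUEST_ID.set(requestId);
        try {
            return "handled " + REQUEST_ID.get();
        } finally {
            REQUEST_ID.remove(); // avoid leaking state to the next request
        }
    }

    public static void main(String[] args) {
        System.out.println(handle("req-1"));
        System.out.println(REQUEST_ID.get()); // null: state was cleaned up
    }
}
```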


- Steve


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25658/#review53648
---


On Sept. 17, 2014, 2:32 a.m., Anshum Gupta wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/25658/
 ---
 
 (Updated Sept. 17, 2014, 2:32 a.m.)
 
 
 Review request for lucene.
 
 
 Bugs: SOLR-5986
 https://issues.apache.org/jira/browse/SOLR-5986
 
 
 Repository: lucene
 
 
 Description
 ---
 
 Timeout queries when they take too long to rewrite/enumerate over terms.
 
 
 Diffs
 -
 
   
 trunk/lucene/core/src/java/org/apache/lucene/index/ExitableDirectoryReader.java
  PRE-CREATION 
   trunk/lucene/core/src/java/org/apache/lucene/index/QueryTimeout.java 
 PRE-CREATION 
   trunk/lucene/core/src/java/org/apache/lucene/index/QueryTimeoutBase.java 
 PRE-CREATION 
   
 trunk/lucene/core/src/test/org/apache/lucene/index/TestExitableDirectoryReader.java
  PRE-CREATION 
   trunk/solr/core/src/java/org/apache/solr/handler/MoreLikeThisHandler.java 
 1625349 
   
 trunk/solr/core/src/java/org/apache/solr/handler/component/SearchHandler.java 
 1625349 
   trunk/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java 
 1625349 
   trunk/solr/core/src/java/org/apache/solr/search/SolrQueryTimeout.java 
 PRE-CREATION 
   trunk/solr/core/src/test/org/apache/solr/TestDistributedSearch.java 1625349 
   trunk/solr/core/src/test/org/apache/solr/TestGroupingSearch.java 1625349 
   
 trunk/solr/core/src/test/org/apache/solr/core/ExitableDirectoryReaderTest.java
  PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/25658/diff/
 
 
 Testing
 ---
 
 Added Lucene/Solr tests. Tested a bit manually.
 
 
 Thanks,
 
 Anshum Gupta
 




[jira] [Commented] (LUCENE-5932) SpanNearUnordered duplicate term counts itself as a match

2014-09-10 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14128971#comment-14128971
 ] 

Steve Davids commented on LUCENE-5932:
--

Oops, you are correct - it does appear to be a duplicate.

 SpanNearUnordered duplicate term counts itself as a match
 -

 Key: LUCENE-5932
 URL: https://issues.apache.org/jira/browse/LUCENE-5932
 Project: Lucene - Core
  Issue Type: Bug
Affects Versions: 4.9
Reporter: Steve Davids
 Fix For: 4.11

 Attachments: LUCENE-5932.patch


 An unordered span near with the exact same term will count the first position 
 as a match for the second term.
 A document with values: w1 w2 w3 w4 w5
 Query hit: spanNear([w1, w1], 1, false) -- SpanNearUnordered
 Query miss: spanNear([w1, w1], 1, true) -- SpanNearOrdered (expected)






[jira] [Resolved] (LUCENE-5932) SpanNearUnordered duplicate term counts itself as a match

2014-09-10 Thread Steve Davids (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Davids resolved LUCENE-5932.
--
   Resolution: Duplicate
Fix Version/s: (was: 4.11)

 SpanNearUnordered duplicate term counts itself as a match
 -

 Key: LUCENE-5932
 URL: https://issues.apache.org/jira/browse/LUCENE-5932
 Project: Lucene - Core
  Issue Type: Bug
Affects Versions: 4.9
Reporter: Steve Davids
 Attachments: LUCENE-5932.patch


 An unordered span near with the exact same term will count the first position 
 as a match for the second term.
 A document with values: w1 w2 w3 w4 w5
 Query hit: spanNear([w1, w1], 1, false) -- SpanNearUnordered
 Query miss: spanNear([w1, w1], 1, true) -- SpanNearOrdered (expected)






[jira] [Created] (SOLR-6496) LBHttpSolrServer should stop server retries after the timeAllowed threshold is met

2014-09-09 Thread Steve Davids (JIRA)
Steve Davids created SOLR-6496:
--

 Summary: LBHttpSolrServer should stop server retries after the 
timeAllowed threshold is met
 Key: SOLR-6496
 URL: https://issues.apache.org/jira/browse/SOLR-6496
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.9
Reporter: Steve Davids
Priority: Critical
 Fix For: 4.11


The LBHttpSolrServer will continue to perform retries for each server it was 
given without honoring the timeAllowed request parameter. Once the threshold 
has been met, you should no longer perform retries and allow the exception to 
bubble up and allow the request to either error out or return partial results 
per the shards.tolerant request parameter.

For a little more context on how this can be extremely problematic, please 
see the comment here: 
https://issues.apache.org/jira/browse/SOLR-5986?focusedCommentId=14100991&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14100991
 (#2)
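A sketch of the behavior this ticket asks for (method and class names are illustrative, not SolrJ's actual API): the retry loop checks the cumulative elapsed time against timeAllowed before trying the next server.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch, not SolrJ's actual API: stop retrying alternate
// servers once the cumulative elapsed time exceeds timeAllowed.
public class RetrySketch {
    static String requestWithRetries(List<String> servers, long timeAllowedMillis) {
        long deadline = System.currentTimeMillis() + timeAllowedMillis;
        for (String server : servers) {
            if (System.currentTimeMillis() >= deadline) {
                // Let the exception bubble up (or return partial results
                // per shards.tolerant) instead of trying the next server.
                throw new RuntimeException("timeAllowed exceeded");
            }
            // ... issue the request to `server`, return on success ...
        }
        throw new RuntimeException("no live servers");
    }

    public static void main(String[] args) {
        try {
            requestWithRetries(Arrays.asList("shard1", "shard2"), 0);
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```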






[jira] [Updated] (SOLR-6496) LBHttpSolrServer should stop server retries after the timeAllowed threshold is met

2014-09-09 Thread Steve Davids (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Davids updated SOLR-6496:
---
Attachment: SOLR-6496.patch

Initial patch that honors the timeAllowed request parameter. There aren't any 
tests included -- are there any objections to using a mocking library? It 
would make it much easier to perform unit testing on these negative cases. 
Mockito is my personal preference and is currently being used in Morphlines, 
but it would need to be included in the SolrJ test dependencies.

 LBHttpSolrServer should stop server retries after the timeAllowed threshold 
 is met
 --

 Key: SOLR-6496
 URL: https://issues.apache.org/jira/browse/SOLR-6496
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.9
Reporter: Steve Davids
Priority: Critical
 Fix For: 4.11

 Attachments: SOLR-6496.patch


 The LBHttpSolrServer will continue to perform retries for each server it was 
 given without honoring the timeAllowed request parameter. Once the threshold 
 has been met, you should no longer perform retries and allow the exception to 
 bubble up and allow the request to either error out or return partial results 
 per the shards.tolerant request parameter.
 For a little more context on how this can be extremely problematic, please 
 see the comment here: 
 https://issues.apache.org/jira/browse/SOLR-5986?focusedCommentId=14100991&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14100991
  (#2)






[jira] [Commented] (SOLR-5986) Don't allow runaway queries from harming Solr cluster health or search performance

2014-09-09 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14127767#comment-14127767
 ] 

Steve Davids commented on SOLR-5986:


bq. I think this should be ok, specially considering the intention is to make 
sure that the request is killed and doesn't run forever.
+1, this is a good starting point and can be further refined in the future if 
need be.

I went ahead and opened SOLR-6496 to account for the LBHttpSolrServer's 
continual retries. Also, I am a little concerned that cursorMark doesn't 
honor the timeAllowed request parameter for some strange reason (the cursorMark 
ticket didn't provide any rationale for it); we may want to revisit that 
decision in yet another ticket so people can be confident their cursor mark 
queries won't crash their clusters as well.

Thanks for taking this on Anshum!

 Don't allow runaway queries from harming Solr cluster health or search 
 performance
 --

 Key: SOLR-5986
 URL: https://issues.apache.org/jira/browse/SOLR-5986
 Project: Solr
  Issue Type: Improvement
  Components: search
Reporter: Steve Davids
Assignee: Anshum Gupta
Priority: Critical
 Fix For: 4.10

 Attachments: SOLR-5986.patch, SOLR-5986.patch, SOLR-5986.patch


 The intent of this ticket is to have all distributed search requests stop 
 wasting CPU cycles on requests that have already timed out or are so 
 complicated that they won't be able to execute. We have come across a case 
 where a nasty wildcard query within a proximity clause was causing the 
 cluster to enumerate terms for hours even though the query timeout was set to 
 minutes. This caused a noticeable slowdown within the system which made us 
 restart the replicas that happened to service that one request; in the worst 
 case, users with a relatively low zk timeout value will have nodes start 
 dropping from the cluster due to long GC pauses.
 [~amccurry] built a mechanism into Apache Blur to help with the issue in 
 BLUR-142 (see commit comment for code, though look at the latest code on the 
 trunk for newer bug fixes).
 Solr should be able to either prevent these problematic queries from running 
 by some heuristic (possibly estimated size of heap usage) or be able to 
 execute a thread interrupt on all query threads once the time threshold is 
 met. This issue mirrors what others have discussed on the mailing list: 
 http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200903.mbox/%3c856ac15f0903272054q2dbdbd19kea3c5ba9e105b...@mail.gmail.com%3E






[jira] [Updated] (SOLR-6496) LBHttpSolrServer should stop server retries after the timeAllowed threshold is met

2014-09-09 Thread Steve Davids (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Davids updated SOLR-6496:
---
Attachment: SOLR-6496.patch

Fixed the patch with a null-safe SolrParams check.

 LBHttpSolrServer should stop server retries after the timeAllowed threshold 
 is met
 --

 Key: SOLR-6496
 URL: https://issues.apache.org/jira/browse/SOLR-6496
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.9
Reporter: Steve Davids
Priority: Critical
 Fix For: 4.11

 Attachments: SOLR-6496.patch, SOLR-6496.patch


 The LBHttpSolrServer will continue to perform retries for each server it was 
 given without honoring the timeAllowed request parameter. Once the threshold 
 has been met, you should no longer perform retries and allow the exception to 
 bubble up and allow the request to either error out or return partial results 
 per the shards.tolerant request parameter.
 For a little more context on how this can be extremely problematic, please 
 see the comment here: 
 https://issues.apache.org/jira/browse/SOLR-5986?focusedCommentId=14100991&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14100991
  (#2)






[jira] [Created] (LUCENE-5932) SpanNearUnordered duplicate term counts itself as a match

2014-09-09 Thread Steve Davids (JIRA)
Steve Davids created LUCENE-5932:


 Summary: SpanNearUnordered duplicate term counts itself as a match
 Key: LUCENE-5932
 URL: https://issues.apache.org/jira/browse/LUCENE-5932
 Project: Lucene - Core
  Issue Type: Bug
Affects Versions: 4.9
Reporter: Steve Davids
 Fix For: 4.11


An unordered span near with the exact same term will count the first position 
as a match for the second term.

A document with values: w1 w2 w3 w4 w5

Query hit: spanNear([w1, w1], 1, false) -- SpanNearUnordered
Query miss: spanNear([w1, w1], 1, true) -- SpanNearOrdered (expected)







[jira] [Updated] (LUCENE-5932) SpanNearUnordered duplicate term counts itself as a match

2014-09-09 Thread Steve Davids (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Davids updated LUCENE-5932:
-
Attachment: LUCENE-5932.patch

Added patch with test case demonstrating the issue.

 SpanNearUnordered duplicate term counts itself as a match
 -

 Key: LUCENE-5932
 URL: https://issues.apache.org/jira/browse/LUCENE-5932
 Project: Lucene - Core
  Issue Type: Bug
Affects Versions: 4.9
Reporter: Steve Davids
 Fix For: 4.11

 Attachments: LUCENE-5932.patch


 An unordered span near with the exact same term will count the first position 
 as a match for the second term.
 A document with values: w1 w2 w3 w4 w5
 Query hit: spanNear([w1, w1], 1, false) -- SpanNearUnordered
 Query miss: spanNear([w1, w1], 1, true) -- SpanNearOrdered (expected)






[jira] [Updated] (SOLR-6483) Refactor some methods in MiniSolrCloudCluster tests

2014-09-06 Thread Steve Davids (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Davids updated SOLR-6483:
---
Attachment: SOLR-6483.patch

Initial patch which adds additional convenience methods to the 
MiniSolrCloudCluster including:

# Upload a config directory to ZooKeeper
# Create a collection
#* Added ability to provide collection properties
# Use a pre-configured CloudSolrServer instance

The TestMiniSolrCloudCluster has been refactored to use these new methods.

A few additional changes that should still be done:

# Provide waitForRecoveriesToFinish convenience method in MiniSolrCloudCluster
#* The code in the test is almost a direct copy/paste from 
AbstractDistribZkTestBase.waitForRecoveriesToFinish; it would be nice to 
refactor this code into a common class (as this is not trivial code to 
maintain).
# All system properties were dropped *except* for solr.solrxml.location & 
zkHost because those are necessary for Jetty to know where to pick up its 
configuration on initial startup. It would be nice to see if there is an 
alternate way of getting that information to Jetty without setting the system 
property.

 Refactor some methods in MiniSolrCloudCluster tests
 ---

 Key: SOLR-6483
 URL: https://issues.apache.org/jira/browse/SOLR-6483
 Project: Solr
  Issue Type: Improvement
Affects Versions: 5.0, 4.11
Reporter: Erick Erickson
Assignee: Erick Erickson
Priority: Minor
 Attachments: SOLR-6483.patch


 Looking at whether we can provide some ease-of-use methods in 
 MiniSolrCloudCluster






Re: Some refactorings of MiniSolrCloudCluster and its test?

2014-09-04 Thread Steve Davids
I have recently changed some of my base tests to utilize the
MiniSolrCloudCluster and I ended up making a few tweaks from the
TestMiniSolrCloudCluster:

   - I used ZkController.uploadConfigDir(zkClient, configDir, coreName)
   instead of pulling in uploadConfigFileToZk & uploadConfigToZk, so perhaps
   just providing an uploadConfigDir method would be the best choice here.
   - I did use createCollection but added an optional properties file that
   added each property from the given file to the collections admin request
   with the 'property.' prefix to each property key. I had some specific
   properties that need to be loaded for the core to load properly.
   - The use of these properties allows us to ditch the System
   Properties being set; instead just use these property request parameters,
   i.e. set a request parameter of: property.solr.tests.maxBufferedDocs=10
   - Personally, I didn't use the waitForRecoveriesToFinish, though I can
   see that being a useful option to provide. One issue is that this method
   performs fail assertions; it would be nice to switch that over to an
   exception so we don't force JUnit assertions into the
   MiniSolrCloudCluster, which allows a little implementation flexibility.
   - I provided a CloudSolrServer getSolrServer() method addition
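The property-prefix idea above can be sketched generically; a plain Map stands in for the collection-admin request parameters (the real request would go through SolrJ):

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Sketch: each entry from a given properties file becomes a
// collection-admin request parameter with the "property." prefix.
public class PropertyPrefix {
    static Map<String, String> toCollectionParams(Properties props) {
        Map<String, String> params = new TreeMap<>();
        for (String name : props.stringPropertyNames()) {
            params.put("property." + name, props.getProperty(name));
        }
        return params;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("solr.tests.maxBufferedDocs", "10");
        System.out.println(toCollectionParams(props));
    }
}
```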

If you want, I can help out with some of these changes.

-Steve


On Thu, Sep 4, 2014 at 1:28 AM, Erick Erickson erickerick...@gmail.com
wrote:

 I have potential use for MiniSolrCloudCluster, and was just poking
 around at the code and a couple of improvements came to mind,
 wondering if I'm going off the deep end here.

 1 It seems like the methods in TestMiniSolrCloudCluster
uploadConfigToZk
uploadConfigFileToZk
createCollection
waitForRecoveriesToFinish

 should be moved to MiniSolrCloudCluster. It strikes me that these
 methods would be generally useful for any test subclassing
 MiniSolrCloudCluster. Which I intend to do I think.

 2 There exist two rules:
   @Rule
   public TestRule solrTestRules = RuleChain
   .outerRule(new SystemPropertiesRestoreRule());

   @ClassRule
   public static TestRule solrClassRules = RuleChain.outerRule(
   new SystemPropertiesRestoreRule()).around(
   new RevertDefaultThreadHandlerRule());


 yet the shutdown() method explicitly clears a bunch of system
 properties. I'm a little fuzzy on the ClassRule above, but is clearing
 the system props really necessary in the shutdown method? And if I
 move the methods to the MiniSolrCloudCluster that set the system
 props, I assume that I should move both rules to the base class too.

 And a special bonus for anyone who can give me a clue why both are
 needed, it's late and I'm going to sleep on it before tracking it
 down.

 Just to be clear, if people think these changes are a good idea, I'll
 take care of it as part of what I'm working on now.

 Thanks!
 Erick





[jira] [Commented] (SOLR-4406) RawResponseWriter - doesn't use the configured base writer.

2014-09-03 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119803#comment-14119803
 ] 

Steve Davids commented on SOLR-4406:


bq. So i took a stab at refactoring it to have test methods that more directly 
modeled the list of situations you identified

Personally I prefer having small, self-describing test method names instead of 
having 3 methods that do everything, making you really dig in if any 
one of the tests actually fails. That's why I went the route of building 3 test 
methods per case I described above:

# If a content stream is provided, send that back in the writer & output stream
#* testGetContentType
#* testWriteContentStreamViaWriter
#* testWriteContentStreamViaOutputStream
# If no content stream is provided and no base writer is specified verify the 
response is serialized with the default writer (XML)
#* testDefaultBaseWriterGetContentType
#* testDefaultBaseWriterViaWriter
#* testDefaultBaseWriterViaOutputStream
# If no content stream is provided and a base writer is specified serialize 
with the specified writer
#* testJsonBaseWriterGetContentType
#* testJsonBaseWriterViaWriter
#* testJsonBaseWriterViaOutputStream

Personally I think this is one of those beauty-is-in-the-eye-of-the-beholder 
situations; I kind of prefer the original test, but cleanliness and clarity can 
sometimes be subjective (though initRawResponseWriter was a poor naming choice; 
perhaps setBaseWriter would have been better). 

You are testing a couple more cases that I wasn't looking for before, which is 
always a good thing. All the other changes look good; I'm not hung up on any of 
the test changes.

 RawResponseWriter - doesn't use the configured base writer.
 -

 Key: SOLR-4406
 URL: https://issues.apache.org/jira/browse/SOLR-4406
 Project: Solr
  Issue Type: Bug
  Components: Response Writers
Affects Versions: 4.0
Reporter: Steve Davids
Assignee: Hoss Man
 Attachments: SOLR-4406.patch, SOLR-4406.patch, SOLR-4406.patch, 
 SOLR-4406.patch, SOLR-4406.patch, SOLR-4406.patch


 The RawResponseWriter accepts a configuration value for a base 
 ResponseWriter if no ContentStream can be detected. The line of code is 
 commented out that would allow this secondary response writer to work. It 
 would be great to uncomment the line and provide an OutputStreamWriter as the 
 writer argument.






Re: Solr Plugins Class Loader Issue / Custom DistributedUpdateProcessor

2014-09-03 Thread Steve Davids
: That's a rule of Java - not anything special about Solr/Lucene.

Yes, I specifically made a package in my custom plugins code that was in the 
Solr package space, i.e. org.apache.solr.update.processor

I did add the @Override to that specific package-protected method and it did 
properly compile with no warnings or errors. This also worked completely fine 
within my unit tests, as everything was loaded with the same class loader (JUnit 
test run). I believe the specific issue I came across was that the Solr WAR is 
loaded by the web container's class loader while the separate "plugins" jar is 
loaded by Solr's custom class loader. Since the two were loaded separately, the 
custom processor didn't properly override the package-protected method, though 
I do know the constructor was at least being called on the custom processor via 
various debugging/logging.

I will open a ticket about custom merge logic soon, though I do believe 
something is a bit screwy with the external resource loader as this should work 
(as long as the custom class is within the same package name as Solr).

-Steve

On Sep 3, 2014, at 6:40 PM, Chris Hostetter hossman_luc...@fucit.org wrote:

 
 : After a lot of debugging I found out that you cannot extend Solr/Lucene
 : classes and override package protected methods, it will silently us the
 : super class' method.
   ...
 : Has anyone happened to come across this and know if there is a fix for
 : extending  overriding package protected methods?
 
 That's a rule of Java - not anything special about Solr/Lucene.
 
 package protected means it's only visible to classes in the same package 
 -- just because you (a human) can read that method doesn't mean your code 
 has any knowledge of its existence -- you can't override it in a 
 subclass, nor can you call it.  The compiler will fail hard if you try the 
 latter, and warn you of the former if you use the @Override annotation 
 (it can't warn you of anything w/o the annotation, because it can't even 
 see the method to know that there is anything to warn you of - you have to 
 explicitly say you are expecting to override something for it to be able 
 to tell you Hey, wait a minute - no you aren't)
 
 package protected is generally used either as an alternative to 
 public (when we only want other solr code in the same package to access 
 it) or as an alternative to private (in cases where ideally no one 
 outside the class should call that method, but we still want to be able to 
 unit test it).  In either case the objective is to minimize the surface 
 area of the API and hide implementation details.
 
 : For a little context, I came across this issue because a client requires
 : some slightly modified merge logic within the DistributedUpdateProcessor.
 
 Off the top of my head, i'm not sure what the motivation was here in this 
 specific case, or if/why RequestReplicationTracker shouldn't be public -- 
 but my suggestion would be to open a jira proposing the specific access
 changes you think make sense (to the method in question, and/or 
 the new inner class) to make the code more re-usable and see what folks 
 think about exposing that bit of the API as supported
 
 First though, you may want to take a step back, and instead focus on a 
 discussion/jira of what the slightly modified merge logic is that you 
 are currently using, and propose new hooks for you to do that in the most 
 optimal way (not sure if the answer's are the same ... don't want to 
 assume this isn't the birth of an XY problem)
 
 
 
 -Hoss
 http://www.lucidworks.com/
 
 



Re: Solr Plugins Class Loader Issue / Custom DistributedUpdateProcessor

2014-09-03 Thread Steve Davids
Ah, that explains it - thanks Hoss! I wasn’t aware of this class loader 
subtlety, thanks for teaching me something new today :)

-Steve

On Sep 3, 2014, at 7:53 PM, Chris Hostetter hossman_luc...@fucit.org wrote:

 
 : Yes, I specifically made a package in my custom plugins code that was in 
 : the solr package space i.e.: org.apache.solr.update.processor
   ...
 : completely fine within my unit tests as it was loaded with the same 
 : class loader (junit test run). I believe the specific issue that I came 
   ...
 : something is a bit screwy with the external resource loader as this 
 : should work (as long as the custom class is within the same package name 
 : as Solr).
 
 Ah, ok - so yeah: what you were doing at compile time was valid, but by 
 then loading those org.apache.solr.* classes as a plugin (vs putting 
 them in the war with the other classes in that package) is definitely what 
 caused the problem -- at runtime class/package identity includes the 
 ClassLoader...
 
 http://www.cooljeff.co.uk/2009/05/03/the-subtleties-of-overriding-package-private-methods/
 
 ie: not a bug in Solr's plugin ClassLoader, just the devil-in-the-details 
 of how a package is defined as far as the runtime goes.  (And 
 unfortunately, the @Override check in java is source annotation only - 
 there's no way to ask the JVM to enforce that and treat it as an assertion 
 or anything like that)
 
 
 
 -Hoss
 http://www.lucidworks.com/
 
 





Re: Testing a custom distributed component

2014-08-30 Thread Steve Davids
If you don't want to use the BaseDistributedSearchTestCase you can utilize the 
newly introduced MiniSolrCloudCluster 
(http://lucene.apache.org/solr/4_9_0/solr-test-framework/org/apache/solr/cloud/MiniSolrCloudCluster.html)
 it works rather well. This class doesn't extend the base Solr test case, so 
helper methods aren't there, instead you can use that class to spin up a 
CloudSolrServer to index/query to your liking within the test.

Sent from my iPhone

 On Aug 30, 2014, at 5:17 PM, Yonatan Nakar snak...@gmail.com wrote:
 
 I'm trying to write unit tests for a search component of my own. My component 
 is intended to run in a distributed setting only. The problem is that it 
 seems like Solr's testing framework doesn't make it easy to write unit tests 
 for distributed test components. What is the right way to test such a 
 component?
 
 More details about my problem here: 
 http://stackoverflow.com/questions/25586021/testing-a-solr-distributed-component
 


[jira] [Commented] (SOLR-4406) RawResponseWriter - doesn't use the configured base writer.

2014-08-28 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114631#comment-14114631
 ] 

Steve Davids commented on SOLR-4406:


[~hossman] Do the supplied tests fit the bill?

 RawResponseWriter - doesn't use the configured base writer.
 -

 Key: SOLR-4406
 URL: https://issues.apache.org/jira/browse/SOLR-4406
 Project: Solr
  Issue Type: Bug
  Components: Response Writers
Affects Versions: 4.0
Reporter: Steve Davids
 Attachments: SOLR-4406.patch, SOLR-4406.patch, SOLR-4406.patch, 
 SOLR-4406.patch, SOLR-4406.patch


 The RawResponseWriter accepts a configuration value for a base 
 ResponseWriter if no ContentStream can be detected. The line of code is 
 commented out that would allow this secondary response writer to work. It 
 would be great to uncomment the line and provide an OutputStreamWriter as the 
 writer argument.



--
This message was sent by Atlassian JIRA
(v6.2#6252)




Solr Plugins Class Loader Issue / Custom DistributedUpdateProcessor

2014-08-28 Thread Steve Davids
After a lot of debugging I found out that you cannot extend Solr/Lucene
classes and override package protected methods; the superclass's method will
silently be used instead.

For a little context, I came across this issue because a client requires
some slightly modified merge logic within the DistributedUpdateProcessor.
In prior versions I was able to copy the entire processor and modify the
document merging section, unfortunately with the upgrade to 4.9 SOLR-5468
introduced an inner class (RequestReplicationTracker) which is required in
the SolrCmdDistributor - the inner class has a private constructor which
makes it impossible to build the required class from outside that specific
processor. So, I decided to just override the package protected
getUpdatedDocument method, which resulted in my unit tests passing (since
the custom plugin + Solr code are run from the same classpath), but it was
discovered that the overridden code wasn't actually being used when
deployed.

For a quick fix, I simply converted the document merging method from
package protected to a standard protected method declaration and patched it
into the solr-core jar and solr war.

Has anyone happened to come across this and know if there is a fix for
extending and overriding package protected methods?

As a side note, I would like to open a ticket to either make the document
merging method override-able or ideally make the merge logic pluggable.


-Steve


Re: Logging levels in Solr code

2014-08-25 Thread Steve Davids

 I am personally in favour of some record of any request sent to a server
 being logged by default to help trace activity


It seems as though that would be more of a TRACE level item.

I have used the Log4J/SLF4J MDC to provide a distributed transaction id
which makes life much easier to trace requests throughout the request
chain, then display that transaction id for every item in the various logs.
It is simple to implement: create an HttpClient HttpRequestInterceptor to
always append a header value to each request with the transaction UUID,
that transaction ID is either taken off of the request via a ServletFilter
or some other mechanism or automatically generated if it wasn’t present on
the incoming request. Then simply reference the MDC value in the Log4J
pattern via something like: %X{transactionId}.
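The generate-or-reuse step of that pattern can be sketched with stdlib Java alone (the header name and `resolve` helper are hypothetical; in a real setup the outgoing half lives in an Apache HttpClient HttpRequestInterceptor and the resolved id is put into the SLF4J MDC):

```java
import java.util.UUID;

public class TransactionId {
    // Hypothetical header name -- use whatever your request chain agrees on.
    static final String HEADER = "X-Transaction-Id";

    /** Reuse the incoming transaction id if present, otherwise mint a new one. */
    static String resolve(String incomingHeaderValue) {
        return (incomingHeaderValue != null && !incomingHeaderValue.isEmpty())
                ? incomingHeaderValue
                : UUID.randomUUID().toString();
    }

    public static void main(String[] args) {
        // Incoming request already carries an id: propagate it unchanged.
        System.out.println(resolve("abc-123"));
        // No id on the incoming request: generate a fresh 36-character UUID.
        System.out.println(resolve(null).length());
    }
}
```

The resolved value would then be stored via MDC.put("transactionId", id) so the %X{transactionId} pattern picks it up on every log line for that request.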

I have found it extremely difficult to try and debug some of the
distributed requests especially when I know that there are some servers
that are having issues with socket connection timeouts, though the
LBHttpSolrServer doesn’t log when exceptions are thrown or when retry attempts
are happening. I would love to see trace/debug level logging for initiating
requests and info/warn if a request was unsuccessful for any reason.

Just some thoughts, you really only come across these things when things
aren’t working right and it annoys you when the option isn’t there :)

-Steve

On Aug 25, 2014, at 6:04 PM, Mark Miller markrmil...@gmail.com wrote:



On Aug 25, 2014, at 5:21 PM, Ramkumar R. Aiyengar andyetitmo...@gmail.com
wrote:

I am personally in favour of some record of any request sent to a server
being logged by default to help trace activity


Certainly you should have the option to turn it on, but I don’t think it
makes a great default. I don’t think the standard user will find it that
useful and it will flood logs, making finding other useful information more
difficult and ballooning retention requirements so that you don’t lose
relevant logs. When you batch or stream, it also only logs a subset of the
adds by default.

- Mark

http://about.me/markrmiller


[jira] [Commented] (SOLR-6233) Provide basic command line tools for checking Solr status and health.

2014-08-24 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14108633#comment-14108633
 ] 

Steve Davids commented on SOLR-6233:


One other note, it looks like you created your own JSON parser to provide an 
XPath-like experience; there is a great library I have used before named 
JsonPath (https://code.google.com/p/json-path/). Not sure if this specific 
class warrants bringing in that package, but if we start seeing a higher need 
for similar mechanisms we may want to consider pulling it in since it does 
provide a much more readable experience.

 Provide basic command line tools for checking Solr status and health.
 -

 Key: SOLR-6233
 URL: https://issues.apache.org/jira/browse/SOLR-6233
 Project: Solr
  Issue Type: Improvement
Reporter: Timothy Potter
Assignee: Timothy Potter
Priority: Minor
 Fix For: 5.0, 4.10

 Attachments: SOLR-6233-minor-refactor.patch


 As part of the start script development work SOLR-3617, example restructuring 
 SOLR-3619, and the overall curb appeal work SOLR-4430, I'd like to have an 
 option on the SystemInfoHandler that gives a shorter, well formatted JSON 
 synopsis of essential information. I know essential is vague ;-) but right 
 now using curl to http://host:port/solr/admin/info/system?wt=json gives too 
 much information when I just want a synopsis of a Solr server. 
 Maybe something like overview=true?
 Result would be:
 {noformat}
  {
    "address": "http://localhost:8983/solr",
    "mode": "solrcloud",
    "zookeeper": "localhost:2181/foo",
    "uptime": "2 days, 3 hours, 4 minutes, 5 seconds",
    "version": "5.0-SNAPSHOT",
    "status": "healthy",
    "memory": "4.2g of 6g"
  }
 {noformat}
 Now of course, one may argue all this information can be easily parsed from 
 the JSON but consider cross-platform command-line tools that don't have 
 immediate access to a JSON parser, such as the bin/solr start script.






[jira] [Commented] (SOLR-6390) Remove unnecessary checked exception for CloudSolrServer constructor

2014-08-22 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106925#comment-14106925
 ] 

Steve Davids commented on SOLR-6390:


I can go ahead and update the patch later this evening.

 Remove unnecessary checked exception for CloudSolrServer constructor
 

 Key: SOLR-6390
 URL: https://issues.apache.org/jira/browse/SOLR-6390
 Project: Solr
  Issue Type: Improvement
Reporter: Steve Davids
Assignee: Shawn Heisey
Priority: Trivial
 Fix For: 5.0

 Attachments: SOLR-6390.patch


 The CloudSolrServer constructors can be simplified and can remove an 
 unnecessary checked exception for one of the 4 constructors.






[jira] [Updated] (SOLR-6390) Remove unnecessary checked exception for CloudSolrServer constructor

2014-08-22 Thread Steve Davids (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Davids updated SOLR-6390:
---

Attachment: SOLR-6390.patch

Updated patch to add more descriptive javadocs for the CloudSolrServer 
constructors found in issue SOLR-5852.

 Remove unnecessary checked exception for CloudSolrServer constructor
 

 Key: SOLR-6390
 URL: https://issues.apache.org/jira/browse/SOLR-6390
 Project: Solr
  Issue Type: Improvement
Reporter: Steve Davids
Assignee: Shawn Heisey
Priority: Trivial
 Fix For: 5.0

 Attachments: SOLR-6390.patch, SOLR-6390.patch


 The CloudSolrServer constructors can be simplified and can remove an 
 unnecessary checked exception for one of the 4 constructors.






[jira] [Updated] (SOLR-6233) Provide basic command line tools for checking Solr status and health.

2014-08-22 Thread Steve Davids (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Davids updated SOLR-6233:
---

Attachment: SOLR-6233-minor-refactor.patch

I was looking over some of the command line tools (extremely useful), and 
refactored some of the code to make it a little more readable.

 Provide basic command line tools for checking Solr status and health.
 -

 Key: SOLR-6233
 URL: https://issues.apache.org/jira/browse/SOLR-6233
 Project: Solr
  Issue Type: Improvement
Reporter: Timothy Potter
Assignee: Timothy Potter
Priority: Minor
 Fix For: 5.0, 4.10

 Attachments: SOLR-6233-minor-refactor.patch


 As part of the start script development work SOLR-3617, example restructuring 
 SOLR-3619, and the overall curb appeal work SOLR-4430, I'd like to have an 
 option on the SystemInfoHandler that gives a shorter, well formatted JSON 
 synopsis of essential information. I know essential is vague ;-) but right 
 now using curl to http://host:port/solr/admin/info/system?wt=json gives too 
 much information when I just want a synopsis of a Solr server. 
 Maybe something like overview=true?
 Result would be:
 {noformat}
  {
    "address": "http://localhost:8983/solr",
    "mode": "solrcloud",
    "zookeeper": "localhost:2181/foo",
    "uptime": "2 days, 3 hours, 4 minutes, 5 seconds",
    "version": "5.0-SNAPSHOT",
    "status": "healthy",
    "memory": "4.2g of 6g"
  }
 {noformat}
 Now of course, one may argue all this information can be easily parsed from 
 the JSON but consider cross-platform command-line tools that don't have 
 immediate access to a JSON parser, such as the bin/solr start script.






[jira] [Updated] (SOLR-4406) RawResponseWriter - doesn't use the configured base writer.

2014-08-21 Thread Steve Davids (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Davids updated SOLR-4406:
---

Attachment: SOLR-4406.patch

Added tests which provide complete code coverage of the RawResponseWriter. I 
didn't go the mocking route; instead it is an integration test that spins up a 
core to assert 3 different cases:

1) If a content stream is provided, send that back in the writer and output stream
2) If no content stream is provided and no base writer is specified verify the 
response is serialized with the default writer (XML)
3) If no content stream is provided and a base writer is specified serialize 
with the specified writer
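The three cases boil down to a simple fallback rule. A hypothetical sketch of that rule (the real writer dispatches on an actual ContentStream and a configured QueryResponseWriter, not strings):

```java
public class WriterFallback {
    /** Hypothetical reduction of the three test cases to a selection rule. */
    static String chooseWriter(boolean hasContentStream, String configuredBaseWriter) {
        if (hasContentStream) return "raw";             // case 1: stream bytes straight through
        if (configuredBaseWriter == null) return "xml"; // case 2: fall back to the default writer
        return configuredBaseWriter;                    // case 3: use the configured base writer
    }

    public static void main(String[] args) {
        System.out.println(chooseWriter(true, "json"));
        System.out.println(chooseWriter(false, null));
        System.out.println(chooseWriter(false, "json"));
    }
}
```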

 RawResponseWriter - doesn't use the configured base writer.
 -

 Key: SOLR-4406
 URL: https://issues.apache.org/jira/browse/SOLR-4406
 Project: Solr
  Issue Type: Bug
  Components: Response Writers
Affects Versions: 4.0
Reporter: Steve Davids
 Attachments: SOLR-4406.patch, SOLR-4406.patch, SOLR-4406.patch


 The RawResponseWriter accepts a configuration value for a base 
 ResponseWriter if no ContentStream can be detected. The line of code is 
 commented out that would allow this secondary response writer to work. It 
 would be great to uncomment the line and provide an OutputStreamWriter as the 
 writer argument.






[jira] [Updated] (SOLR-4406) RawResponseWriter - doesn't use the configured base writer.

2014-08-21 Thread Steve Davids (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Davids updated SOLR-4406:
---

Attachment: SOLR-4406.patch

Made a small tweak to use the Java 7 auto-closeable for the Writer/Output 
streams.
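For illustration, the Java 7 auto-closeable (try-with-resources) form guarantees the writer is closed even if writing throws. A minimal stdlib sketch, not the actual patch:

```java
import java.io.IOException;
import java.io.StringWriter;

public class AutoCloseSketch {
    public static void main(String[] args) throws IOException {
        String out;
        // The writer is closed automatically when the block exits,
        // whether normally or via an exception.
        try (StringWriter writer = new StringWriter()) {
            writer.write("response body");
            out = writer.toString();
        }
        System.out.println(out);
    }
}
```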

 RawResponseWriter - doesn't use the configured base writer.
 -

 Key: SOLR-4406
 URL: https://issues.apache.org/jira/browse/SOLR-4406
 Project: Solr
  Issue Type: Bug
  Components: Response Writers
Affects Versions: 4.0
Reporter: Steve Davids
 Attachments: SOLR-4406.patch, SOLR-4406.patch, SOLR-4406.patch, 
 SOLR-4406.patch


 The RawResponseWriter accepts a configuration value for a base 
 ResponseWriter if no ContentStream can be detected. The line of code is 
 commented out that would allow this secondary response writer to work. It 
 would be great to uncomment the line and provide an OutputStreamWriter as the 
 writer argument.






[jira] [Updated] (SOLR-4406) RawResponseWriter - doesn't use the configured base writer.

2014-08-21 Thread Steve Davids (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Davids updated SOLR-4406:
---

Attachment: SOLR-4406.patch

 RawResponseWriter - doesn't use the configured base writer.
 -

 Key: SOLR-4406
 URL: https://issues.apache.org/jira/browse/SOLR-4406
 Project: Solr
  Issue Type: Bug
  Components: Response Writers
Affects Versions: 4.0
Reporter: Steve Davids
 Attachments: SOLR-4406.patch, SOLR-4406.patch, SOLR-4406.patch, 
 SOLR-4406.patch, SOLR-4406.patch


 The RawResponseWriter accepts a configuration value for a base 
 ResponseWriter if no ContentStream can be detected. The line of code is 
 commented out that would allow this secondary response writer to work. It 
 would be great to uncomment the line and provide an OutputStreamWriter as the 
 writer argument.






[jira] [Commented] (SOLR-5986) Don't allow runaway queries from harming Solr cluster health or search performance

2014-08-19 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102854#comment-14102854
 ] 

Steve Davids commented on SOLR-5986:


Correct, I was just providing additional insight into the issues we have been 
seeing.

 Don't allow runaway queries from harming Solr cluster health or search 
 performance
 --

 Key: SOLR-5986
 URL: https://issues.apache.org/jira/browse/SOLR-5986
 Project: Solr
  Issue Type: Improvement
  Components: search
Reporter: Steve Davids
Assignee: Anshum Gupta
Priority: Critical
 Fix For: 4.10

 Attachments: SOLR-5986.patch


 The intent of this ticket is to have all distributed search requests stop 
 wasting CPU cycles on requests that have already timed out or are so 
 complicated that they won't be able to execute. We have come across a case 
 where a nasty wildcard query within a proximity clause was causing the 
 cluster to enumerate terms for hours even though the query timeout was set to 
 minutes. This caused a noticeable slowdown within the system which made us 
 restart the replicas that happened to service that one request; the worst 
 case scenario is that users with a relatively low zk timeout value will have 
 nodes start dropping from the cluster due to long GC pauses.
 [~amccurry] Built a mechanism into Apache Blur to help with the issue in 
 BLUR-142 (see commit comment for code, though look at the latest code on the 
 trunk for newer bug fixes).
 Solr should be able to either prevent these problematic queries from running 
 by some heuristic (possibly estimated size of heap usage) or be able to 
 execute a thread interrupt on all query threads once the time threshold is 
 met. This issue mirrors what others have discussed on the mailing list: 
 http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200903.mbox/%3c856ac15f0903272054q2dbdbd19kea3c5ba9e105b...@mail.gmail.com%3E






[jira] [Commented] (SOLR-5986) Don't allow runaway queries from harming Solr cluster health or search performance

2014-08-18 Thread Steve Davids (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100991#comment-14100991
 ] 

Steve Davids commented on SOLR-5986:


We came across the issue again and added a lot more probes to get a grasp on 
what exactly is happening; I believe further tickets might be necessary to 
address various pieces.

#1) We are setting the timeout request parameter which tells the 
TimeLimitingCollector to throw a TimeExceededException, though in our logs we 
see the error messages thrown after about an hour for one of the queries we 
tried, even though the timeout is set for a couple of minutes. This is 
presumably due to the query parsing taking about an hour and once the query is 
finally parsed and handed to the collector the TimeLimitingCollector 
immediately throws an exception. We should have something similar throw the 
same exception while in the query building phase (this way the partial results 
warnings will continue to just work). It looks like the current work is more in 
the realm of solving this issue which may fix the problems we saw described in 
#2.

#2) We set socket read timeouts on HTTPClient which causes the same query to be 
sent into the cluster multiple times giving it a slow, painful death. This is 
even more problematic while using the SolrJ API; what ends up happening from 
SolrJ's LBHttpSolrServer is that it will loop through *every* host in the 
cluster and if a socket read timeout happens it tries the next item in the 
list. Internally every single request made to the cluster from an outside SolrJ 
client will try to gather the results for all shards in the cluster, once a 
socket read timeout happens internal to the cluster the same retry logic will 
attempt to gather results from the next replica in the list. So, if we 
hypothetically had 10 shards with 3 replicas, and made a request from an 
outside client it would make 30 (external SolrJ call to each host to request a 
distributed search) * 30 (each host will be called at least once for the 
internal distributed request) = 900 overall requests (each individual search 
host will handle 30 requests). This should probably become its own ticket to 
track: either a) don't retry on a socket read timeout, or b) specify a retry 
timeout of some sort in the LBHttpSolrServer (this is something we did 
internally for simplicity sake).
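The worst-case arithmetic above can be checked directly (the numbers are the hypothetical 10-shard, 3-replica example from this comment):

```java
public class RetryFanOut {
    public static void main(String[] args) {
        int shards = 10, replicasPerShard = 3;
        int hosts = shards * replicasPerShard;   // 30 hosts in the cluster
        // Outside SolrJ client retries every host on socket read timeout...
        int externalAttempts = hosts;
        // ...and each attempt fans out internally, retrying every host again.
        int internalAttemptsPerExternal = hosts;
        int total = externalAttempts * internalAttemptsPerExternal;
        System.out.println(total);          // overall requests in the worst case
        System.out.println(total / hosts);  // requests handled per search host
    }
}
```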

 Don't allow runaway queries from harming Solr cluster health or search 
 performance
 --

 Key: SOLR-5986
 URL: https://issues.apache.org/jira/browse/SOLR-5986
 Project: Solr
  Issue Type: Improvement
  Components: search
Reporter: Steve Davids
Assignee: Anshum Gupta
Priority: Critical
 Fix For: 4.10

 Attachments: SOLR-5986.patch


 The intent of this ticket is to have all distributed search requests stop 
 wasting CPU cycles on requests that have already timed out or are so 
 complicated that they won't be able to execute. We have come across a case 
 where a nasty wildcard query within a proximity clause was causing the 
 cluster to enumerate terms for hours even though the query timeout was set to 
 minutes. This caused a noticeable slowdown within the system which made us 
 restart the replicas that happened to service that one request; the worst 
 case scenario is that users with a relatively low zk timeout value will have 
 nodes start dropping from the cluster due to long GC pauses.
 [~amccurry] Built a mechanism into Apache Blur to help with the issue in 
 BLUR-142 (see commit comment for code, though look at the latest code on the 
 trunk for newer bug fixes).
 Solr should be able to either prevent these problematic queries from running 
 by some heuristic (possibly estimated size of heap usage) or be able to 
 execute a thread interrupt on all query threads once the time threshold is 
 met. This issue mirrors what others have discussed on the mailing list: 
 http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200903.mbox/%3c856ac15f0903272054q2dbdbd19kea3c5ba9e105b...@mail.gmail.com%3E






[jira] [Updated] (SOLR-4406) RawResponseWriter - doesn't use the configured base writer.

2014-08-18 Thread Steve Davids (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Davids updated SOLR-4406:
---

Attachment: SOLR-4406.patch

Attached a patch which allows the RawResponseWriter to honor its contract:

--snip--
..if no such ContentStream has been added, then a base QueryResponseWriter 
will be used to write the response according to the usual contract...
--snip--

Performed some minor refactoring to provide a single method to write query 
responses to an output stream.

 RawResponseWriter - doesn't use the configured base writer.
 -

 Key: SOLR-4406
 URL: https://issues.apache.org/jira/browse/SOLR-4406
 Project: Solr
  Issue Type: Bug
  Components: Response Writers
Affects Versions: 4.0
Reporter: Steve Davids
 Attachments: SOLR-4406.patch


 The RawResponseWriter accepts a configuration value for a base 
 ResponseWriter if no ContentStream can be detected. The line of code is 
 commented out that would allow this secondary response writer to work. It 
 would be great to uncomment the line and provide an OutputStreamWriter as the 
 writer argument.






[jira] [Updated] (SOLR-4406) RawResponseWriter - doesn't use the configured base writer.

2014-08-18 Thread Steve Davids (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Davids updated SOLR-4406:
---

Attachment: SOLR-4406.patch

Oops, didn't add the new utility class to SVN - patch updated.

 RawResponseWriter - doesn't use the configured base writer.
 -

 Key: SOLR-4406
 URL: https://issues.apache.org/jira/browse/SOLR-4406
 Project: Solr
  Issue Type: Bug
  Components: Response Writers
Affects Versions: 4.0
Reporter: Steve Davids
 Attachments: SOLR-4406.patch, SOLR-4406.patch


 The RawResponseWriter accepts a configuration value for a base 
 ResponseWriter if no ContentStream can be detected. The line of code is 
 commented out that would allow this secondary response writer to work. It 
 would be great to uncomment the line and provide an OutputStreamWriter as the 
 writer argument.





