[jira] [Commented] (LUCENE-3982) regex support in queryparser needs documented, and called out in CHANGES.txt

2012-04-13 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253908#comment-13253908
 ] 

Hoss Man commented on LUCENE-3982:
--

Note: set to blocker so we don't release 4.0 with this change in syntax w/o 
documenting it

> regex support in queryparser needs documented, and called out in CHANGES.txt
> 
>
> Key: LUCENE-3982
> URL: https://issues.apache.org/jira/browse/LUCENE-3982
> Project: Lucene - Java
>  Issue Type: Sub-task
>  Components: core/queryparser
>Reporter: Hoss Man
>Priority: Blocker
> Fix For: 4.0
>
>
> Spun off of LUCENE-2604, where everyone agreed this needed to be done, but no 
> one has done it yet, and rmuir didn't want to leave the issue open...
> {quote}
> some issues were pointed out in a recent mailing list thread that definitely 
> seem like they should be addressed before this is officially released...
> * queryparsersyntax.xml doesn't mention this feature at all -- as major new 
> syntax it should really get its own section with an example showing the 
> syntax
> * queryparsersyntax.xml's section on "Escaping Special Characters" needs to 
> mention that '/' is a special character
> Also: Given that Yury encountered some real world situations in which the new 
> syntax caused problems with existing queries, it seems like we should 
> definitely make the note about this possibility more prominent ... i'm not 
> sure if it makes sense in MIGRATE.txt but at a minimum it seems like the 
> existing CHANGES.txt entry should mention it, maybe something like...
> {noformat}
> * LUCENE-2604: Added RegexpQuery support to QueryParser. Regular expressions
>   are now directly supported by the standard queryparser using the syntax... 
>  fieldName:/expression/ OR /expression against default field/
>   Users who wish to search for literal "/" characters are advised to 
>   backslash-escape or quote those characters as needed. 
>   (Simon Willnauer, Robert Muir)
> {noformat}
> {quote}
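[Editor's note: the escaping advice in the proposed CHANGES.txt entry can be
sketched as a tiny helper. This is a hypothetical illustration, not part of
Lucene -- the function name and approach are invented; the underlying fact
(that '/' delimits regular expressions in the 4.0 query parser and must be
backslash-escaped when literal) comes from the entry above.]

```python
def escape_slashes(user_query: str) -> str:
    """Backslash-escape literal '/' characters so a 4.0-style query parser
    does not treat them as the start of a regular expression.
    Hypothetical helper for illustration only."""
    return user_query.replace("/", "\\/")

# A user searching for a literal path would otherwise trigger regex parsing:
print(escape_slashes("path/to/file"))
```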

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3330) Show changes in plugin statistics across multiple requests

2012-04-13 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253754#comment-13253754
 ] 

Hoss Man commented on SOLR-3330:


Sorry .. i was looking at the solr logging (SolrCore.execute) ... because it's 
using "Content-Type: application/x-www-form-urlencoded" and the stream.body 
param, it's all being included in the list of SolrParams that get logged.

So my concern about extra long URLs breaking is a non-issue, but it's still 
kind of noisy as far as solr logging goes.

If it were changed to use "Content-Type: application/xml" and send the xml 
directly, it wouldn't be counted as a solr param, but the handler would 
still get it as a ContentStream.
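[Editor's note: the two request styles being contrasted can be sketched with
the Python standard library. The endpoint URL and XML payload are invented
placeholders; only the Content-Type / stream.body distinction is from the
comment above.]

```python
from urllib.request import Request
from urllib.parse import urlencode

url = "http://localhost:8983/solr/admin/plugins"  # hypothetical endpoint

# Current approach: the XML rides inside a urlencoded "stream.body" param,
# so Solr counts it as an ordinary SolrParam and logs it.
as_param = Request(
    url,
    data=urlencode({"stream.body": "<stats/>"}).encode("utf-8"),
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)

# Suggested approach: send the XML directly as the POST body; the handler
# would then receive it as a ContentStream rather than a logged param.
as_stream = Request(
    url,
    data=b"<stats/>",
    headers={"Content-Type": "application/xml"},
)
```

Neither request is actually sent here; the objects just show how the payload
is packaged differently in each case.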

---

as for how it looks: in my initial impression i didn't realize that it was 
recording values for all the categories of plugins (ie: i was looking at 
"Query Handlers" and didn't notice the little grey numbers indicating that 
"Caches" also had some changes) ... the #BBA500 color used to make the plugin 
names with changes stand out is great (even if you remove the "new" icon 
completely) so maybe just using that same color on the category names (or at 
least the little numbers indicating that items in that category have changed) 
would be helpful to draw attention to them?



> Show changes in plugin statistics across multiple requests
> --
>
> Key: SOLR-3330
> URL: https://issues.apache.org/jira/browse/SOLR-3330
> Project: Solr
>  Issue Type: New Feature
>  Components: web gui
>Reporter: Ryan McKinley
> Fix For: 4.0
>
> Attachments: SOLR-3330-pluggins-diff.patch, 
> SOLR-3330-pluggins-diff.patch, SOLR-3330-plugins.png, 
> SOLR-3330-record-changes-ui.patch, SOLR-3330-record-changes-ui.patch
>
>
> When debugging configuration and performance, I often:
>  1. Look at stats values
>  2. run some queries
>  3. See how the stats values changed
> This is fine, but is often a bit clunky and you have to really know what you 
> are looking for to see any changes.
> It would be great if the 'plugins' page had a button that would make this 
> easier




[jira] [Commented] (SOLR-2605) CoreAdminHandler, different Output while 'defaultCoreName' is specified

2012-04-13 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253736#comment-13253736
 ] 

Hoss Man commented on SOLR-2605:


Stefan: are you sure you had a clean build with my patch applied?

when i run...

{noformat}
java -DzkRun -Dcollection.configName=myconf -Dbootstrap_confdir=./solr/conf 
-Dsolr.environment=dev -Duser.timezone=UTC -DhostPort=8983 -Djetty.port=8983 
-jar start.jar
{noformat}

I get...

{noformat}
hossman@bester:~/lucene/dev/solr$ curl 
"http://localhost:8983/solr/zookeeper?detail=true&path=%2Fclusterstate.json";
{"znode":{
"path":"/clusterstate.json","prop":{
  "version":5,
  "aversion":0,
  "children_count":0,
  "ctime":"Fri Apr 13 20:27:46 UTC 2012 (1334348866331)",
  "cversion":0,
  "czxid":12,
  "dataLength":290,
  "ephemeralOwner":0,
  "mtime":"Fri Apr 13 20:45:41 UTC 2012 (1334349941866)",
  "mzxid":207,
  "pzxid":12},
"data":"{\"collection1\":{\"shard1\":{\"bester:8983_solr_collection1\":{\n  
  \"shard\":\"shard1\",\n\"leader\":\"true\",\n
\"state\":\"active\",\n\"core\":\"collection1\",\n
\"collection\":\"collection1\",\n\"node_name\":\"bester:8983_solr\",\n  
  \"base_url\":\"http://bester:8983/solr\""},"tree":[{"data":{
"title":"/clusterstate.json","attr":{
  "href":"zookeeper?detail=true&path=%2Fclusterstate.json"}}}]}
{noformat}

> CoreAdminHandler, different Output while 'defaultCoreName' is specified
> ---
>
> Key: SOLR-2605
> URL: https://issues.apache.org/jira/browse/SOLR-2605
> Project: Solr
>  Issue Type: Improvement
>  Components: web gui
>Reporter: Stefan Matheis (steffkes)
>Priority: Minor
> Fix For: 4.0
>
> Attachments: SOLR-2399-admin-cores-default.xml, 
> SOLR-2399-admin-cores.xml, SOLR-2605.patch, SOLR-2605.patch
>
>
> The attached XML-Files show the little difference between a defined 
> {{defaultCoreName}}-Attribute and a non existing one.
> Actually the new admin ui checks for a core with empty name to set single- / 
> multicore-settings .. it's a quick change to count the number of defined 
> cores instead.
> But, will it be possible, to get the core-name (again)? One of both 
> attributes would be enough, if that makes a difference :)




[jira] [Commented] (SOLR-3330) Show changes in plugin statistics across multiple requests

2012-04-13 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253703#comment-13253703
 ] 

Hoss Man commented on SOLR-3330:


can we change this to use an HTTP POST body instead of the stream.body request 
param? ... it's sending some really long request URLs that might not work 
if a servlet container is configured to limit the URL length.

> Show changes in plugin statistics across multiple requests
> --
>
> Key: SOLR-3330
> URL: https://issues.apache.org/jira/browse/SOLR-3330
> Project: Solr
>  Issue Type: New Feature
>  Components: web gui
>Reporter: Ryan McKinley
> Fix For: 4.0
>
> Attachments: SOLR-3330-pluggins-diff.patch, 
> SOLR-3330-pluggins-diff.patch, SOLR-3330-plugins.png, 
> SOLR-3330-record-changes-ui.patch, SOLR-3330-record-changes-ui.patch
>
>
> When debugging configuration and performance, I often:
>  1. Look at stats values
>  2. run some queries
>  3. See how the stats values changed
> This is fine, but is often a bit clunky and you have to really know what you 
> are looking for to see any changes.
> It would be great if the 'plugins' page had a button that would make this 
> easier




[jira] [Commented] (LUCENE-3978) redo how our download redirect pages work

2012-04-12 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253126#comment-13253126
 ] 

Hoss Man commented on LUCENE-3978:
--

Uwe: if i'm understanding that page correctly, this would only be possible for 
links where:
 a) link html is on our site
 b) we can control the html used to generate them
...which is fine for the bug buttons on lucene.apache.org, and any other 
download links we might want to include on those CMS pages, but not for things 
like links from wiki.apache.org, or the URLs we include in our plain text 
release announcement emails (that users just cut/paste) or that we submit to 
any other site to promote the release.


> redo how our download redirect pages work
> -
>
> Key: LUCENE-3978
> URL: https://issues.apache.org/jira/browse/LUCENE-3978
> Project: Lucene - Java
>  Issue Type: Improvement
>Reporter: Hoss Man
> Fix For: 4.0
>
>
> the download "latest" redirect pages are kind of a pain to change when we 
> release a new version...
> http://lucene.apache.org/core/mirrors-core-latest-redir.html
> http://lucene.apache.org/solr/mirrors-solr-latest-redir.html




[jira] [Commented] (SOLR-3327) Logging UI should indicate which loggers are set vs implicit

2012-04-12 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252954#comment-13252954
 ] 

Hoss Man commented on SOLR-3327:


bq. It should state on top that these are the JDK logging levels. If people 
switch logging through SLF4J it won't work

i wonder if there is a way for the LoggingServlet (request handler?) to detect 
which SLF4J binding is in use, and spit out a warning if it's not JDK, so the UI 
can conditionally display that warning if it exists.

> Logging UI should indicate which loggers are set vs implicit
> 
>
> Key: SOLR-3327
> URL: https://issues.apache.org/jira/browse/SOLR-3327
> Project: Solr
>  Issue Type: Improvement
>  Components: web gui
>Reporter: Ryan McKinley
>Priority: Trivial
> Fix For: 4.0
>
> Attachments: SOLR-3327.patch, logging.png
>
>
> The new logging UI looks great!
> http://localhost:8983/solr/#/~logging
> It would be nice to indicate which ones are set explicitly vs implicit -- 
> perhaps making the line bold when set=true




[jira] [Commented] (LUCENE-3977) generated/duplicated javadocs are wasteful and bloat the release

2012-04-12 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252944#comment-13252944
 ] 

Hoss Man commented on LUCENE-3977:
--

bq. Really if we have different modules like contrib-analyzers, why can't they 
link to the things they depend on (e.g. lucene-core) just like the solr 
javadocs do?

i think the original argument in favor of having both styles was:

* the all version makes it easy to see (in the left pane) all the classes that 
are available when people are working with the entire code base
* the individual module versions, even when cross-linked with each other, make 
it easy to see exactly what is included in a single module (via the left pane)

at this point in my life, i don't really have an opinion, as long as we include 
at least one copy in the bin release.

bq. We can save 10MB with this patch, which nukes the 'index

oh god yes, i didn't even realize we were building that useless pile of crap

> generated/duplicated javadocs are wasteful and bloat the release
> 
>
> Key: LUCENE-3977
> URL: https://issues.apache.org/jira/browse/LUCENE-3977
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: general/javadocs
>Reporter: Robert Muir
>Priority: Blocker
> Fix For: 4.0
>
>
> Some stats for the generated javadocs of 3.6:
> * 9,146 files
> * 161,872 KB uncompressed
> * 25MB compressed (this is responsible for nearly half of our binary release)
> The fact we intentionally double our javadocs size with the 'javadocs-all' 
> thing
> is truly wasteful and compression doesn't help at all. Just testing, i nuked 
> 'all'
> and found:
> * 4,944 files
> * 81,084 KB uncompressed
> * 12.8MB compressed
> We need to clean this up for 4.0. We only need to ship javadocs 'one way'.




[jira] [Commented] (LUCENE-3978) redo how our download redirect pages work

2012-04-12 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252940#comment-13252940
 ] 

Hoss Man commented on LUCENE-3978:
--

when we released 3.6, we ran into a few annoyances...

* these pages require that you edit the template (not available in the 
bookmarklet) to change the 3.5.0 to 3.6.0 in the final URL
* these pages were in browser caches, so they weren't seeing the changes in the 
javascript redirect (rmuir added some no-cache metadata headers, so hopefully 
this won't be a problem again)

My suggestion for the future...

* eliminate these templates and their mdtext pages entirely
* replace them with a .htaccess redirect rule that looks like: 
{{/([^/]*)/(.*)-latest-redir.html /$1/$2-redir.html?3.6.0}}
* update the templates for mirrors-solr-redir.mdtext and 
mirrors-core-redir.mdtext so that the javascript will use the query string when 
building the final URL

...that way whenever we release a new version, we can just tweak the .htaccess 
rule, and the only "html pages" that might ever show up in HTTP or browser 
caches will have unique URLs per version.
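[Editor's note: a minimal sketch of what such a rule might look like, assuming
mod_alias's RedirectMatch is available in .htaccess -- the exact directive,
anchoring, and version string here are illustrative, not a tested
configuration.]

```apache
# Hypothetical .htaccess sketch: one rule maps every "-latest-redir" page
# to the per-version redirect page, passing the version as a query string.
# Only the "3.6.0" at the end would change at release time.
RedirectMatch ^/([^/]*)/(.*)-latest-redir\.html$ /$1/$2-redir.html?3.6.0
```

The javascript in the mirrors-*-redir templates would then read the query
string (e.g. location.search) when building the final mirror URL.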


> redo how our download redirect pages work
> -
>
> Key: LUCENE-3978
> URL: https://issues.apache.org/jira/browse/LUCENE-3978
> Project: Lucene - Java
>  Issue Type: Improvement
>Reporter: Hoss Man
> Fix For: 4.0
>
>
> the download "latest" redirect pages are kind of a pain to change when we 
> release a new version...
> http://lucene.apache.org/core/mirrors-core-latest-redir.html
> http://lucene.apache.org/solr/mirrors-solr-latest-redir.html




[jira] [Commented] (LUCENE-3973) Incorporate PMD / FindBugs

2012-04-12 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252851#comment-13252851
 ] 

Hoss Man commented on LUCENE-3973:
--

bq. I believe both pmd and findbugs are on maven repos so one could use ivy to 
fetch them automatically. One thing less to think about.

Unless you run into the same taskdef/classloader/sub-build/permgen-OOM problem 
we had with clover, the maven-ant-tasks, and ivy, which has prevented us 
from doing the same thing with them.



> Incorporate PMD / FindBugs
> --
>
> Key: LUCENE-3973
> URL: https://issues.apache.org/jira/browse/LUCENE-3973
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: general/build
>Reporter: Chris Male
>
> This has been touched on a few times over the years.  Having static analysis 
> as part of our build seems like a big win.  For example, we could use PMD to 
> look at {{System.out.println}} statements like discussed in LUCENE-3877 and 
> we could possibly incorporate the nocommit / @author checks as well.
> There are a few things to work out as part of this:
> - Should we use both PMD and FindBugs or just one of them? They look at code 
> from different perspectives (bytecode vs source code) and target different 
> issues.  At the moment I'm in favour of trying both but that might be too 
> heavy handed for our needs.
> - What checks should we use? There's no point having the analysis if it's 
> going to raise too many false-positives or problems we don't deem 
> problematic.  
> - How should the analysis be integrated in our build? Need to work out when 
> the analysis should run, how it should be incorporated in Ant and/or Maven, 
> what impact errors should have.




[jira] [Commented] (LUCENE-3973) Incorporate PMD / FindBugs

2012-04-12 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252735#comment-13252735
 ] 

Hoss Man commented on LUCENE-3973:
--

bq. How should the analysis be integrated in our build? Need to work out when 
the analysis should run, how it should be incorporated in Ant and/or Maven, 
what impact errors should have.

i would suggest going about it incrementally...

* hook into build.xml as optional targets that can be run if you have the 
necessary libs installed; don't fail the build, just generate the XML report 
files
* put the needed libs on builds.apache.org, hook it into the jenkins 
nightly target, and configure jenkins to display its pretty version of the xml 
reports so people can at least see what's going on.
* start adding/tweaking custom rule sets in dev-tools to eliminate rules we 
don't care about, add rules we want that don't exist, or change the severity of 
rules we think are more/less important
* tweak the build.xml to fail if anything above some arbitrary severity is 
tripped
* worry about maven
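[Editor's note: the first step above -- an optional target that only runs when
the libs are present and never fails the build -- might look something like
this hypothetical build.xml fragment. The target name, property name, ruleset
path, and fileset are invented; the PMD Ant task class name is from PMD's own
documentation.]

```xml
<!-- Sketch only: run PMD if its jars are on the classpath, emitting an
     XML report without failing the build. -->
<available classname="net.sourceforge.pmd.ant.PMDTask"
           property="pmd.present"/>

<target name="pmd" if="pmd.present"
        description="Optional static analysis report (requires PMD libs)">
  <taskdef name="pmd" classname="net.sourceforge.pmd.ant.PMDTask"/>
  <pmd rulesetfiles="dev-tools/pmd/rules.xml">
    <formatter type="xml" toFile="build/pmd-report.xml"/>
    <fileset dir="lucene/core/src/java" includes="**/*.java"/>
  </pmd>
</target>
```

Because the target is guarded by the `pmd.present` property, developers
without the libs installed see no change in behavior.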





> Incorporate PMD / FindBugs
> --
>
> Key: LUCENE-3973
> URL: https://issues.apache.org/jira/browse/LUCENE-3973
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: general/build
>Reporter: Chris Male
>
> This has been touched on a few times over the years.  Having static analysis 
> as part of our build seems like a big win.  For example, we could use PMD to 
> look at {{System.out.println}} statements like discussed in LUCENE-3877 and 
> we could possibly incorporate the nocommit / @author checks as well.
> There are a few things to work out as part of this:
> - Should we use both PMD and FindBugs or just one of them? They look at code 
> from different perspectives (bytecode vs source code) and target different 
> issues.  At the moment I'm in favour of trying both but that might be too 
> heavy handed for our needs.
> - What checks should we use? There's no point having the analysis if it's 
> going to raise too many false-positives or problems we don't deem 
> problematic.  
> - How should the analysis be integrated in our build? Need to work out when 
> the analysis should run, how it should be incorporated in Ant and/or Maven, 
> what impact errors should have.




[jira] [Commented] (SOLR-3076) Solr should support block joins

2012-04-10 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250910#comment-13250910
 ] 

Hoss Man commented on SOLR-3076:



As i said before...

bq. ...perhaps we should focus on the more user explicit, direct mapping type 
QParser type approach Mikhail has already started on for now, and consider that 
(_schema driven implicit block joining_) as an enhancement later? (especially 
since it's not clear how the indexing side will be managed/enforced...)

what Mikhail's fleshed out here seems like a good starting point for users who 
are willing to deal with this at the "low" level (similar in required expertise 
to the "raw" QParser), and would be usable *today* for people who take 
responsibility for indexing the blocks themselves.

if/when/how we decide to drive the indexing side, we can think about 
if/where/how to automagically hook blockjoin queries into "higher" level 
parsers like LuceneQParser, DismaxQueryParser

> Solr should support block joins
> ---
>
> Key: SOLR-3076
> URL: https://issues.apache.org/jira/browse/SOLR-3076
> Project: Solr
>  Issue Type: New Feature
>Reporter: Grant Ingersoll
> Attachments: SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, 
> SOLR-3076.patch, SOLR-3076.patch, bjq-vs-filters-backward-disi.patch, 
> bjq-vs-filters-illegal-state.patch, child-bjqparser.patch, 
> parent-bjq-qparser.patch, parent-bjq-qparser.patch, 
> solrconf-bjq-erschema-snippet.xml, tochild-bjq-filtered-search-fix.patch
>
>
> Lucene has the ability to do block joins, we should add it to Solr.




[jira] [Commented] (SOLR-3335) testDistribSearch failure

2012-04-09 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250196#comment-13250196
 ] 

Hoss Man commented on SOLR-3335:


ignoring the seed, and just trying the test with "-Dtests.nightly=true" i've 
only seen this test pass once (and i might have had a typo in that nightly 
param -- it was the first time i tried it and i didn't have a shell log).

Unless i'm missing something...

* BaseDistributedSearchTestCase.createServers initializes the following 
pairwise...
** protected List jettys
** protected List clients
* TestDistributedSearch.doTest then...
** copies those lists into local upJettys and upClients instances and maintains 
a list of "upShards"
** iteratively shuts down some number of jetty instances, removing from 
upJettys, upShards, and upClients
** passes upShards and upClients to queryPartialResults
* TestDistributedSearch.queryPartialResults ...
** does some random querying of upShards and upClients
** if stress is non-zero (which it is if it's nightly) then it also spins up a 
bunch of threads using a client from the original "clients" list

...which seems fundamentally flawed to me ... because each "client" knows about 
a specific jetty instance, and the test has explicitly shut down some jetty 
instances.

Is this just a typo?  are the refs to "clients" in queryPartialResults all just 
supposed to be "upClients" ?


> testDistribSearch failure
> -
>
> Key: SOLR-3335
> URL: https://issues.apache.org/jira/browse/SOLR-3335
> Project: Solr
>  Issue Type: Bug
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Trivial
> Fix For: 4.0
>
>
> Happened on my test machine. Is there a way to disable these tests if we 
> cannot fix them? There are two or three tests that fail most of the time and 
> that apparently nobody knows how to fix (including me).
> There is also a typo in the error message (I'm away from home for Easter, 
> can't do it now).
> {noformat}
> build 06-Apr-2012 16:11:54[junit] Testsuite: 
> org.apache.solr.cloud.RecoveryZkTest
> build 06-Apr-2012 16:11:54[junit] Testcase: 
> testDistribSearch(org.apache.solr.cloud.RecoveryZkTest):  FAILED
> build 06-Apr-2012 16:11:54[junit] There are still nodes recoverying
> build 06-Apr-2012 16:11:54[junit] 
> junit.framework.AssertionFailedError: There are still nodes recoverying
> build 06-Apr-2012 16:11:54[junit] at 
> org.junit.Assert.fail(Assert.java:93)
> build 06-Apr-2012 16:11:54[junit] at 
> org.apache.solr.cloud.AbstractDistributedZkTestCase.waitForRecoveriesToFinish(AbstractDistributedZkTestCase.java:132)
> build 06-Apr-2012 16:11:54[junit] at 
> org.apache.solr.cloud.RecoveryZkTest.doTest(RecoveryZkTest.java:84)
> build 06-Apr-2012 16:11:54[junit] at 
> org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:670)
> build 06-Apr-2012 16:11:54[junit] at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> build 06-Apr-2012 16:11:54[junit] at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> build 06-Apr-2012 16:11:54[junit] at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> build 06-Apr-2012 16:11:54[junit] at 
> java.lang.reflect.Method.invoke(Method.java:597)
> build 06-Apr-2012 16:11:54[junit] at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
> build 06-Apr-2012 16:11:54[junit] at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
> build 06-Apr-2012 16:11:54[junit] at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
> build 06-Apr-2012 16:11:54[junit] at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
> build 06-Apr-2012 16:11:54[junit] at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
> build 06-Apr-2012 16:11:54[junit] at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
> build 06-Apr-2012 16:11:54[junit] at 
> org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:63)
> build 06-Apr-2012 16:11:54[junit] at 
> org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:754)
> build 06-Apr-2012 16:11:54[junit] at 
> org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:670)
> build 06-Apr-2012 16:11:54[junit] at 
> org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(Syst

[jira] [Commented] (SOLR-3335) testDistribSearch failure

2012-04-09 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250177#comment-13250177
 ] 

Hoss Man commented on SOLR-3335:


* nightly builds seem to fail almost every time
* test only builds seem to pass almost every time

...are folks remembering to use "-Dtests.nightly=true" when trying to reproduce 
this?

I tried the reproduce line from nightly build #1819 and got the same 
ConnectException as jenkins three times in a row...

{noformat}
hossman@bester:~/lucene/dev/solr$ ant test -Dtestcase=TestDistributedSearch 
-Dtestmethod=testDistribSearch 
-Dtests.seed=-64cffe89df6d3a71:-2543436b41d480f3:21aa64ce023d4a8a 
-Dtests.nightly=true -Dargs="-Dfile.encoding=ISO8859-1"
{noformat}

> testDistribSearch failure
> -
>
> Key: SOLR-3335
> URL: https://issues.apache.org/jira/browse/SOLR-3335
> Project: Solr
>  Issue Type: Bug
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Trivial
> Fix For: 4.0
>
>
> Happened on my test machine. Is there a way to disable these tests if we 
> cannot fix them? There are two or three tests that fail most of the time and 
> that apparently nobody knows how to fix (including me).
> There is also a typo in the error message (I'm away from home for Easter, 
> can't do it now).
> {noformat}
> build 06-Apr-2012 16:11:54[junit] Testsuite: 
> org.apache.solr.cloud.RecoveryZkTest
> build 06-Apr-2012 16:11:54[junit] Testcase: 
> testDistribSearch(org.apache.solr.cloud.RecoveryZkTest):  FAILED
> build 06-Apr-2012 16:11:54[junit] There are still nodes recoverying
> build 06-Apr-2012 16:11:54[junit] 
> junit.framework.AssertionFailedError: There are still nodes recoverying
> build 06-Apr-2012 16:11:54[junit] at 
> org.junit.Assert.fail(Assert.java:93)
> build 06-Apr-2012 16:11:54[junit] at 
> org.apache.solr.cloud.AbstractDistributedZkTestCase.waitForRecoveriesToFinish(AbstractDistributedZkTestCase.java:132)
> build 06-Apr-2012 16:11:54[junit] at 
> org.apache.solr.cloud.RecoveryZkTest.doTest(RecoveryZkTest.java:84)
> build 06-Apr-2012 16:11:54[junit] at 
> org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:670)
> build 06-Apr-2012 16:11:54[junit] at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> build 06-Apr-2012 16:11:54[junit] at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> build 06-Apr-2012 16:11:54[junit] at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> build 06-Apr-2012 16:11:54[junit] at 
> java.lang.reflect.Method.invoke(Method.java:597)
> build 06-Apr-2012 16:11:54[junit] at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
> build 06-Apr-2012 16:11:54[junit] at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
> build 06-Apr-2012 16:11:54[junit] at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
> build 06-Apr-2012 16:11:54[junit] at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
> build 06-Apr-2012 16:11:54[junit] at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
> build 06-Apr-2012 16:11:54[junit] at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
> build 06-Apr-2012 16:11:54[junit] at 
> org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:63)
> build 06-Apr-2012 16:11:54[junit] at 
> org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:754)
> build 06-Apr-2012 16:11:54[junit] at 
> org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:670)
> build 06-Apr-2012 16:11:54[junit] at 
> org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:69)
> build 06-Apr-2012 16:11:54[junit] at 
> org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:591)
> build 06-Apr-2012 16:11:54[junit] at 
> org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:75)
> build 06-Apr-2012 16:11:54[junit] at 
> org.apache.lucene.util.LuceneTestCase$SaveThreadAndTestNameRule$1.evaluate(LuceneTestCase.java:642)
> build 06-Apr-2012 16:11:54[junit] at 
> org.junit.rules.RunRules.evaluate(RunRules.java:18)
> build 06-Apr-2012 16:11:54[junit] at 
> org.junit.runners.P

[jira] [Commented] (SOLR-3329) Use consistent svn:keywords

2012-04-05 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13247959#comment-13247959
 ] 

Hoss Man commented on SOLR-3329:


These were all really designed originally for people writing plugins, so they 
can expose more information to their consumers that might not be obvious from 
the global info about the Solr install.

As for stuff in the solr source tree, i would suggest..
* getSource() - keep using $URL$, it doesn't really hurt anything.
* getVersion() - we should just start returning the implementation version from 
the package metadata
* getSourceId() - $Id$ is the most problematic svn keyword i've ever seen, let's 
just drop it and leave this blank in all the core mbeans ... plugin writers can 
use it however they want
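The getVersion() idea above -- reading the implementation version from package metadata instead of an svn keyword -- can be sketched in plain Java. This is an illustrative snippet, not the actual Solr code; the class and method names are hypothetical, and it assumes the jar manifest carries an Implementation-Version attribute (which Ant's jar task can write):

```java
// Hypothetical sketch, not Solr source: return the implementation
// version from package metadata instead of the $Id$/$Revision$ keywords.
public class VersionInfo {

    public static String getVersion(Class<?> clazz) {
        Package p = clazz.getPackage();
        String v = (p == null) ? null : p.getImplementationVersion();
        // Blank when no manifest metadata is present (e.g. running from
        // a classes/ dir), mirroring the "leave it blank" suggestion.
        return (v == null) ? "" : v;
    }

    public static void main(String[] args) {
        // Outside a packaged jar this prints an empty version.
        System.out.println("version: [" + getVersion(VersionInfo.class) + "]");
    }
}
```

When the build writes Implementation-Version into MANIFEST.MF, the same call returns the released version string with no svn keyword expansion involved.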

> Use consistent svn:keywords
> ---
>
> Key: SOLR-3329
> URL: https://issues.apache.org/jira/browse/SOLR-3329
> Project: Solr
>  Issue Type: Improvement
>Reporter: Ryan McKinley
> Fix For: 4.0
>
>
> In solr, we use svn:keywords haphazardly
> We have lots of places with:
> {code}
> svn propset svn:keywords "Date Author Id Revision HeadURL" *.java
> {code}
> In LUCENE-3923, there is a suggestion to get rid of many of these.
> The MBeans interface often exposes HeadURL, but we likely want to get rid of 
> the rest

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3961) don't build and rebuild jar files for dependencies in tests

2012-04-05 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13247801#comment-13247801
 ] 

Hoss Man commented on LUCENE-3961:
--

bq. We currently don't generally use jars as the actual classpath for testing 
though

understood, #1 is just an argument i've seen as to why it would be better to do 
so -- otherwise we never actually know when testing that our jars are useful -- 
someone could accidentally put excludes="*.class" on a jar task and you'd 
never notice because all the tests would still pass.

bq. by never creating a jar in the first place your #2 doesn't happen at all 
really.

note step #a ... the point is if someone does whatever officially blessed step 
there is to build the jars ("ant", "ant jar", "ant whatever") and then decides 
they want to change the behavior of those jars -- they may never run "ant 
clean" and it may not occur to them to re-run whatever that official way to 
build jars is, and they may not notice that the jars aren't rebuilt when they 
do "ant test" -- because they can already see the new code was "compiled" and 
running based on the test output.

bq. Also, if we were to go with your logic, really we should be rebuilding the 
solr.war everytime

correct, a war is just a jar with a special structure

bq. (I'm just pointing out why i think its infeasible). ... I think we need to 
keep this stuff fast so that compile-test-debug lifecycle is as fast as possible

agreed ... like i said, i don't have a strong opinion about it, but since we're 
discussing it i just wanted to point out the arguments i've heard over and over 
when having this discussion in the past on other projects.

I think in an ideal world, devs could run fast tests against ../*/classes/ 
directories, but jenkins would run all those same tests against fully built 
jars to ensure they aren't missing anything ... but that would probably be an 
annoying build.xml to maintain


> don't build and rebuild jar files for dependencies in tests
> ---
>
> Key: LUCENE-3961
> URL: https://issues.apache.org/jira/browse/LUCENE-3961
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: general/build
>Reporter: Robert Muir
> Fix For: 4.0
>
>
> Hossman's comments about when jars are built had me thinking,
> its not really great how dependencies are managed currently.
> say i have contrib/hamburger that depends on contrib/cheese
> if I do 'ant test' in contrib/hamburger, you end out with a situation
> where you have no hamburger.jar but you have a cheese.jar.
> The reason for this: i think is how we implement the contrib-uptodate,
> via .jar files. I think instead contrib-uptodate shouldnt use actual
> jar files (cheese.jar) but a simple file we 'touch' like cheese.compiled.
> This will make the build faster, especially I think the solr tests
> which uses these dependencies across a lot of lucene modules. we won't
> constantly jar their stuff.
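The touch-file idea in the description boils down to a timestamp comparison against an empty marker. A minimal sketch, with invented names (only "cheese.compiled" comes from the proposal above):

```java
import java.io.File;

// Sketch of a marker-file up-to-date check: after a successful compile
// the build would "touch" an empty marker (e.g. cheese.compiled), and
// later runs compare source mtimes against it, instead of re-jarring
// cheese.jar just to get a timestamp to compare with.
public class UpToDateCheck {

    public static boolean isUpToDate(File marker, File... sources) {
        if (!marker.exists()) return false;          // never compiled
        for (File src : sources) {
            if (src.lastModified() > marker.lastModified()) {
                return false;                        // source changed since
            }
        }
        return true;
    }
}
```

Ant's own &lt;uptodate&gt; task can express the same check declaratively, with the marker file as the target.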




[jira] [Commented] (SOLR-3328) executable bits of shellscripts in solr source release

2012-04-05 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13247783#comment-13247783
 ] 

Hoss Man commented on SOLR-3328:


bq. I don't believe that.

and to be clear:
* these *.sh files are executable if you "unzip" the solr.zip on a unix box
* these *.sh files are executable if you "tar -xzf" the solr.tgz on a unix box
* it is only if you "tar -xzf" the solr-src.tgz that these files are not 
executable

bq. I don't know if we can improve this? Maybe its an svn prop?

these files already have the svn:executable property set...

{noformat}
hossman@bester:~/lucene/3x_dev/solr/example/exampledocs$ svn propget 
svn:executable post.sh test_utf8.sh
post.sh - *
test_utf8.sh - *
{noformat}

...so it must either be something about how we do the export, or we are not 
telling the tar task to track the perms properly (i'm guessing the latter)
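If the cause is the tar task dropping permissions, Ant's tar task can be told the mode explicitly: tar of an svn export won't see svn:executable, so the executable bit has to be granted via the filemode attribute of a nested tarfileset. A hedged sketch only -- the property names and layout below are illustrative, not the real release target:

```xml
<!-- Illustrative fragment, not the actual src-release build file. -->
<tar destfile="${dist.dir}/solr-src.tgz" compression="gzip" longfile="gnu">
  <!-- shell scripts: keep the executable bit -->
  <tarfileset dir="${src.export.dir}" filemode="755">
    <include name="**/*.sh"/>
  </tarfileset>
  <!-- everything else with default permissions -->
  <tarfileset dir="${src.export.dir}">
    <exclude name="**/*.sh"/>
  </tarfileset>
</tar>
```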

> executable bits of shellscripts in solr source release
> --
>
> Key: SOLR-3328
> URL: https://issues.apache.org/jira/browse/SOLR-3328
> Project: Solr
>  Issue Type: Improvement
>  Components: Build
>Reporter: Robert Muir
> Fix For: 4.0
>
>
> HossmanSays: in the solr src releases, some shell scripts are not executable 
> by default.
> I don't know if we can improve this? Maybe its an svn prop?
> Maybe something needs to be specified to the tar/zip process?
> Currently the 'source release' is really an svn export...
> Personally i always do 'sh foo.sh' rather than './foo.sh',
> but if it makes it more user-friendly we should figure it out
> Just opening the issue since we don't forget about it, I think solr cloud
> adds some more shell scripts so we should at least figure out what we want to 
> do.




[jira] [Commented] (LUCENE-3961) don't build and rebuild jar files for dependencies in tests

2012-04-05 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13247778#comment-13247778
 ] 

Hoss Man commented on LUCENE-3961:
--

I don't have a strong opinion about this, but there are two counter-arguments 
i've heard over the years in favor of *always* building the jar(s), 
even though it's a bit slower for the tests...

1) it means you always test against the same jars that you ship -- so there is 
no risk that the classpath you build for testing is subtly different from the 
files that make it into the jar (ie: maybe contrib/cheeseburger/build.xml 
copies a cheese_types.xml file into its classes dir, but it accidentally gets 
excluded from contrib-cheeses.jar)

2) it means less risk that someone accidentally uses an older jar than they 
think...

a) "ant something" ... builds contrib-hamburger.jar and contrib-cheese.jar
b) you realize it doesn't work the way you want, so you apply a patch (with 
tests!)
c) "ant test" rebuilds contrib/*/classes and you see your new hamburger test 
passes
d) you copy contrib-hamburger.jar and contrib-cheese.jar not realizing they are 
still left over from #a above, and don't have your patch.


> don't build and rebuild jar files for dependencies in tests
> ---
>
> Key: LUCENE-3961
> URL: https://issues.apache.org/jira/browse/LUCENE-3961
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: general/build
>Reporter: Robert Muir
> Fix For: 4.0
>
>
> Hossman's comments about when jars are built had me thinking,
> its not really great how dependencies are managed currently.
> say i have contrib/hamburger that depends on contrib/cheese
> if I do 'ant test' in contrib/hamburger, you end out with a situation
> where you have no hamburger.jar but you have a cheese.jar.
> The reason for this: i think is how we implement the contrib-uptodate,
> via .jar files. I think instead contrib-uptodate shouldnt use actual
> jar files (cheese.jar) but a simple file we 'touch' like cheese.compiled.
> This will make the build faster, especially I think the solr tests
> which uses these dependencies across a lot of lucene modules. we won't
> constantly jar their stuff.




[jira] [Commented] (LUCENE-3946) improve docs & ivy verification output to explain classpath problems and mention "--noconfig"

2012-04-05 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13247348#comment-13247348
 ] 

Hoss Man commented on LUCENE-3946:
--

thanks shawn, i added that as a suggestion on the wiki...

http://wiki.apache.org/lucene-java/HowToContribute#antivy

moving forward if we get questions from users about ivy problems we just need 
to iterate and update the wiki with what works best

> improve docs & ivy verification output to explain classpath problems and 
> mention "--noconfig"
> -
>
> Key: LUCENE-3946
> URL: https://issues.apache.org/jira/browse/LUCENE-3946
> Project: Lucene - Java
>  Issue Type: Task
>Reporter: Hoss Man
>Assignee: Hoss Man
> Fix For: 3.6, 4.0
>
> Attachments: LUCENE-3946.patch
>
>
> offshoot of LUCENE-3930, where shawn reported...
> {quote}
> I can't get either branch_3x or trunk to build now, on a system that used to 
> build branch_3x without complaint.  It
> says that ivy is not available, even after doing "ant ivy-bootstrap" to 
> download ivy into the home directory.
> Specifically I am trying to build solrj from trunk, but I can't even get 
> "ant" in the root directory of the checkout
> to work.  I'm on CentOS 6 with oracle jdk7 built using the city-fan.org 
> SRPMs.  Ant (1.7.1) and junit are installed
> from package repositories.  Building a checkout of lucene_solr_3_5 on the 
> same machine works fine.
> {quote}
> The root cause is that ant's global configs can be set up to ignore the user's 
> personal lib dir.  The suggested workaround is to run "ant --noconfig" but we 
> should also try to give the user feedback in our failure about exactly what 
> classpath ant is currently using (because apparently ${java.class.path} is 
> not actually it)




[jira] [Commented] (LUCENE-3952) validate depends on compile-tools, which does too much

2012-04-04 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246685#comment-13246685
 ] 

Hoss Man commented on LUCENE-3952:
--

with commit r1309556 you can no longer run "ant clean compile" from the top level 
of the checkout...

{noformat}
hossman@bester:~/lucene/dev$ ant clean compile
...
validate:
 [echo] Building spatial...

validate:
 [echo] Building suggest...

validate:
  [taskdef] Could not load definitions from resource lucene-solr.antlib.xml. It 
could not be found.
 [echo] License check under: /home/hossman/lucene/dev/modules

BUILD FAILED
/home/hossman/lucene/dev/build.xml:68: The following error occurred while 
executing this line:
/home/hossman/lucene/dev/modules/build.xml:68: The following error occurred 
while executing this line:
/home/hossman/lucene/dev/lucene/tools/custom-tasks.xml:22: Problem: failed to 
create task or type licenses
Cause: The name is undefined.
Action: Check the spelling.
Action: Check that any custom tasks/types have been declared.
Action: Check that any <presetdef>/<macrodef> declarations have taken place.


Total time: 14 seconds
{noformat}

> validate depends on compile-tools, which does too much
> --
>
> Key: LUCENE-3952
> URL: https://issues.apache.org/jira/browse/LUCENE-3952
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: general/build
>Reporter: Robert Muir
> Fix For: 3.6, 4.0
>
> Attachments: LUCENE-3952.patch
>
>
> lucene's common-build.xml 'validate' depends on compile-tools, but some
> modules like icu, kuromoji, etc have a compile-tools target (for other 
> reasons).
> I think it should explicitly depend on common.compile-tools instead.




[jira] [Commented] (LUCENE-3946) improve docs & ivy verification output to explain classpath problems and mention "--noconfig"

2012-04-04 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246534#comment-13246534
 ] 

Hoss Man commented on LUCENE-3946:
--

I added some starter text to 
http://wiki.apache.org/lucene-java/HowToContribute#antivy

I also went ahead and committed the existing patch as is, minus the classpath 
stuff, to the trunk: r1309511.

Anyone object to merging this back to 3.6?

--

bq. Commenting out "rpm_mode=true" in ant.conf made it work with just "ant 
test" as the command.

Shawn: can you try reverting your change to /etc/ant.conf and instead add 
"rpm_mode=false" to a new $HOME/.ant/ant.conf file and see if that works just 
as well? ... if so we should add it to the wiki as a suggestion
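For reference, the per-user override being asked about would look like this (a sketch only -- rpm_mode is the setting read by the ant wrapper script on Fedora/CentOS-style installs):

```sh
# $HOME/.ant/ant.conf -- overrides /etc/ant.conf without editing it
rpm_mode=false
```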


> improve docs & ivy verification output to explain classpath problems and 
> mention "--noconfig"
> -
>
> Key: LUCENE-3946
> URL: https://issues.apache.org/jira/browse/LUCENE-3946
> Project: Lucene - Java
>  Issue Type: Task
>Reporter: Hoss Man
>Assignee: Hoss Man
> Fix For: 3.6, 4.0
>
> Attachments: LUCENE-3946.patch
>
>
> offshoot of LUCENE-3930, where shawn reported...
> {quote}
> I can't get either branch_3x or trunk to build now, on a system that used to 
> build branch_3x without complaint.  It
> says that ivy is not available, even after doing "ant ivy-bootstrap" to 
> download ivy into the home directory.
> Specifically I am trying to build solrj from trunk, but I can't even get 
> "ant" in the root directory of the checkout
> to work.  I'm on CentOS 6 with oracle jdk7 built using the city-fan.org 
> SRPMs.  Ant (1.7.1) and junit are installed
> from package repositories.  Building a checkout of lucene_solr_3_5 on the 
> same machine works fine.
> {quote}
> The root cause is that ant's global configs can be set up to ignore the user's 
> personal lib dir.  The suggested workaround is to run "ant --noconfig" but we 
> should also try to give the user feedback in our failure about exactly what 
> classpath ant is currently using (because apparently ${java.class.path} is 
> not actually it)




[jira] [Commented] (LUCENE-3945) we should include checksums for every jar ivy fetches in svn & src releases to verify the jars are the ones we expect

2012-04-04 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246516#comment-13246516
 ] 

Hoss Man commented on LUCENE-3945:
--

Committed revision 1309503. - trunk

rmuir said on irc that he'd work on backporting to 3x for me (going to grab 
some lunch soon and then get on a plane)

> we should include checksums for every jar ivy fetches in svn & src releases 
> to verify the jars are the ones we expect
> -
>
> Key: LUCENE-3945
> URL: https://issues.apache.org/jira/browse/LUCENE-3945
> Project: Lucene - Java
>  Issue Type: Task
>Reporter: Hoss Man
> Fix For: 3.6, 4.0
>
> Attachments: LUCENE-3945.patch, LUCENE-3945.patch, LUCENE-3945.patch, 
> LUCENE-3945_trunk_jar_sha1.patch, LUCENE-3945_trunk_jar_sha1.patch, 
> LUCENE-3945_trunk_jar_sha1.patch
>
>
> Conversation with rmuir last night got me thinking about the fact that one 
> thing we lose by using ivy is confidence that every user of a release is 
> compiling against (and likely using at run time) the same dependencies as 
> every other user.
> Up to 3.5, users of src and binary releases could be confident that the jars 
> included in the release were the same jars the lucene devs vetted and tested 
> against when voting on the release candidate, but with ivy there is now the 
> possibility that after the source release is published, the owner of a domain 
> where these dependencies are hosted might change the jars in some way w/o 
> anyone knowing.  Likewise: we as developers could commit an ivy.xml file 
> pointing to a specific URL which we then use for and test for months, and 
> just prior to a release, the contents of the remote URL could change such 
> that a JAR included in the binary artifacts might not match the ones we've 
> vetted and tested leading up to that RC.
> So i propose that we include checksum files in svn and in our source releases 
> that can be used by users to verify that the jars they get from ivy match the 
> jars we tested against.
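The verification step this proposes could be as simple as hex-encoding a SHA-1 digest and comparing it to a checked-in .sha1 file. A minimal sketch -- the class name and file layout are assumptions, not from the attached patches:

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;

public class JarChecksum {

    // Hex-encode the SHA-1 of a stream -- the format a *.jar.sha1
    // checksum file would store.
    public static String sha1Hex(InputStream in) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-1");
        byte[] buf = new byte[8192];
        for (int n; (n = in.read(buf)) != -1; ) md.update(buf, 0, n);
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest()) sb.append(String.format("%02x", b));
        return sb.toString();
    }

    // Compare a fetched jar against its expected checksum file.
    public static boolean matches(Path jar, Path sha1File) throws Exception {
        String expected = new String(Files.readAllBytes(sha1File),
                                     StandardCharsets.UTF_8).trim();
        try (InputStream in = Files.newInputStream(jar)) {
            return sha1Hex(in).equalsIgnoreCase(expected);
        }
    }

    public static void main(String[] args) throws Exception {
        // Standard SHA-1 test vector for "abc":
        // prints a9993e364706816aba3e25717850c26c9cd0d89d
        System.out.println(sha1Hex(new ByteArrayInputStream(
                "abc".getBytes(StandardCharsets.UTF_8))));
    }
}
```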




[jira] [Commented] (LUCENE-3946) improve docs & ivy verification output to explain classpath problems and mention "--noconfig"

2012-04-04 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246494#comment-13246494
 ] 

Hoss Man commented on LUCENE-3946:
--

bq. I think rather than suggesting the --noconfig option in the patch, we should
just reword the text to suggest instead installing your own ant (which worked 
for both you and Mike) rather than using any system-installed one on Linux 
systems.

Given that "--noconfig" or editing ant.conf to remove rpm_mode _may_ solve the 
problem, and that many people are likely to consider either of those things 
simpler to do than installing a clean version of ant (even though you and i 
would probably disagree) i think we should still suggest them as possible fixes.


> improve docs & ivy verification output to explain classpath problems and 
> mention "--noconfig"
> -
>
> Key: LUCENE-3946
> URL: https://issues.apache.org/jira/browse/LUCENE-3946
> Project: Lucene - Java
>  Issue Type: Task
>Affects Versions: 3.6
>Reporter: Hoss Man
>Assignee: Hoss Man
> Fix For: 4.0
>
> Attachments: LUCENE-3946.patch
>
>
> offshoot of LUCENE-3930, where shawn reported...
> {quote}
> I can't get either branch_3x or trunk to build now, on a system that used to 
> build branch_3x without complaint.  It
> says that ivy is not available, even after doing "ant ivy-bootstrap" to 
> download ivy into the home directory.
> Specifically I am trying to build solrj from trunk, but I can't even get 
> "ant" in the root directory of the checkout
> to work.  I'm on CentOS 6 with oracle jdk7 built using the city-fan.org 
> SRPMs.  Ant (1.7.1) and junit are installed
> from package repositories.  Building a checkout of lucene_solr_3_5 on the 
> same machine works fine.
> {quote}
> The root cause is that ant's global configs can be set up to ignore the user's 
> personal lib dir.  The suggested workaround is to run "ant --noconfig" but we 
> should also try to give the user feedback in our failure about exactly what 
> classpath ant is currently using (because apparently ${java.class.path} is 
> not actually it)




[jira] [Commented] (LUCENE-3946) improve docs & ivy verification output to explain classpath problems and mention "--noconfig"

2012-04-03 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245911#comment-13245911
 ] 

Hoss Man commented on LUCENE-3946:
--


bq. I passed --execdebug to ant, and when it fails (w/ the builtin Fedora ant) 
I get this:

the interesting thing being that neither of those lines actually seem to 
contain your ivy.jar -- but when it fails for you, the java.class.path echoing 
that my patch adds to the ivy-check target does show ivy in _that_ classpath 
(even though it's clearly not the one being used to load the taskdef) ... so 
something in the actual Launcher class is deciding when/if to add that ivy jar 
to that java.class.path (which again: is clearly not the classpath that 
actually seems to matter)

bq. So forget about loading ivy, I think these ants shipped with linux 
distributions are hopelessly broken and I don't think there is a lot we can do.

that's not really fair ... many distros split things up into multiple packages; 
you probably have the core one but not some optional ones.

as mike has shown it's clearly possible to get a functional ant with a fedora 
install, but you do have to override/edit a config setting

bq. Maybe this 'compiled-on-date' is available via an ant property we can early 
detect?

that *REALLY* smells bad ... and would go out of its way to break things for 
people who might have already fixed their ant install (using "--noconfig" or 
edited /etc/ant.conf)

I think it's enough to make the failure message say "we did our best, try 
--noconfig and see the URL below for more info about how your ant install may 
be fucked up" ... if we can show them the _correct_ classpath ant is trying to 
use to load ivy, to make the point clear, then great -- if not, then we rip it 
out of the error message


> improve docs & ivy verification output to explain classpath problems and 
> mention "--noconfig"
> -
>
> Key: LUCENE-3946
> URL: https://issues.apache.org/jira/browse/LUCENE-3946
> Project: Lucene - Java
>  Issue Type: Task
>Affects Versions: 3.6
>Reporter: Hoss Man
>Assignee: Hoss Man
> Fix For: 4.0
>
> Attachments: LUCENE-3946.patch
>
>
> offshoot of LUCENE-3930, where shawn reported...
> {quote}
> I can't get either branch_3x or trunk to build now, on a system that used to 
> build branch_3x without complaint.  It
> says that ivy is not available, even after doing "ant ivy-bootstrap" to 
> download ivy into the home directory.
> Specifically I am trying to build solrj from trunk, but I can't even get 
> "ant" in the root directory of the checkout
> to work.  I'm on CentOS 6 with oracle jdk7 built using the city-fan.org 
> SRPMs.  Ant (1.7.1) and junit are installed
> from package repositories.  Building a checkout of lucene_solr_3_5 on the 
> same machine works fine.
> {quote}
> The root cause is that ant's global configs can be set up to ignore the user's 
> personal lib dir.  The suggested workaround is to run "ant --noconfig" but we 
> should also try to give the user feedback in our failure about exactly what 
> classpath ant is currently using (because apparently ${java.class.path} is 
> not actually it)




[jira] [Commented] (LUCENE-3946) improve docs & ivy verification output to explain classpath problems and mention "--noconfig"

2012-04-03 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245836#comment-13245836
 ] 

Hoss Man commented on LUCENE-3946:
--

Related...
http://ant.1045680.n5.nabble.com/Ant-and-rpm-mode-td1353437.html
http://stackoverflow.com/questions/1909634/why-does-ant-ignore-task-jars-in-home-ant-lib


> improve docs & ivy verification output to explain classpath problems and 
> mention "--noconfig"
> -
>
> Key: LUCENE-3946
> URL: https://issues.apache.org/jira/browse/LUCENE-3946
> Project: Lucene - Java
>  Issue Type: Task
>Affects Versions: 3.6
>Reporter: Hoss Man
>Assignee: Hoss Man
> Fix For: 4.0
>
>
> offshoot of LUCENE-3930, where shawn reported...
> {quote}
> I can't get either branch_3x or trunk to build now, on a system that used to 
> build branch_3x without complaint.  It
> says that ivy is not available, even after doing "ant ivy-bootstrap" to 
> download ivy into the home directory.
> Specifically I am trying to build solrj from trunk, but I can't even get 
> "ant" in the root directory of the checkout
> to work.  I'm on CentOS 6 with oracle jdk7 built using the city-fan.org 
> SRPMs.  Ant (1.7.1) and junit are installed
> from package repositories.  Building a checkout of lucene_solr_3_5 on the 
> same machine works fine.
> {quote}
> The root cause is that ant's global configs can be set up to ignore the user's 
> personal lib dir.  The suggested workaround is to run "ant --noconfig" but we 
> should also try to give the user feedback in our failure about exactly what 
> classpath ant is currently using (because apparently ${java.class.path} is 
> not actually it)




[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy

2012-04-03 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245828#comment-13245828
 ] 

Hoss Man commented on LUCENE-3930:
--

Shawn: please see LUCENE-3946 and comment there about the suggested work around

> nuke jars from source tree and use ivy
> --
>
> Key: LUCENE-3930
> URL: https://issues.apache.org/jira/browse/LUCENE-3930
> Project: Lucene - Java
>  Issue Type: Task
>  Components: general/build
>Reporter: Robert Muir
>Assignee: Robert Muir
>Priority: Blocker
> Fix For: 3.6, 4.0
>
> Attachments: LUCENE-3930-skip-sources-javadoc.patch, 
> LUCENE-3930-solr-example.patch, LUCENE-3930-solr-example.patch, 
> LUCENE-3930.patch, LUCENE-3930.patch, LUCENE-3930.patch, 
> LUCENE-3930__ivy_bootstrap_target.patch, 
> LUCENE-3930_includetestlibs_excludeexamplexml.patch, 
> ant_-verbose_clean_test.out.txt, langdetect-1.1.jar, 
> noggit-commons-csv.patch, patch-jetty-build.patch, pom.xml
>
>
> As mentioned on the ML thread: "switch jars to ivy mechanism?".




[jira] [Commented] (LUCENE-3943) Use ivy cachepath and cachefileset instead of ivy retrieve

2012-04-03 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245788#comment-13245788
 ] 

Hoss Man commented on LUCENE-3943:
--

"ant example" doesn't run the example, so it can't use any ephemeral classpaths 
that Ant creates on the fly.  "ant example" currently sets up the example files 
(i.e. copying the war to where jetty will look for it), hence my point that it 
could copy the jars as needed in order for jetty & solr to find them

(the example has to work in the binary build via "java -jar start.jar" even if 
users don't have any of the original build.xml files)

> Use ivy cachepath and cachefileset instead of ivy retrieve
> --
>
> Key: LUCENE-3943
> URL: https://issues.apache.org/jira/browse/LUCENE-3943
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: general/build
>Reporter: Chris Male
>
> In LUCENE-3930 we moved to resolving all external dependencies using 
> ivy:retrieve.  This process places the dependencies into the lib/ folder of 
> the respective modules which was ideal since it replicated the existing build 
> process and limited the number of changes to be made to the build.
> However it can lead to multiple jars for the same dependency in the lib 
> folder when the dependency is upgraded, and just isn't the most efficient way 
> to use Ivy.
> Uwe pointed out that _when working from svn or in using src releases_ we can 
> remove the ivy:retrieve calls and make use of ivy:cachepath and 
> ivy:cachefileset to build our classpaths and packages respectively, which 
> will go some way to addressing these limitations -- however we still need the 
> build system capable of putting the actual jars into specific lib folders 
> when assembling the binary artifacts
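The proposed switch could look roughly like this in a build file (target, id, and property names here are assumptions, not the actual Lucene build):

```xml
<!-- Hypothetical sketch: resolve into the ivy cache and build the compile
     classpath directly from it, instead of ivy:retrieve copying jars into lib/. -->
<ivy:cachepath pathid="compile.classpath" conf="default"/>
<javac srcdir="src/java" destdir="build/classes"
       classpathref="compile.classpath"/>

<!-- Only when assembling binary artifacts are the jars materialized: -->
<ivy:cachefileset setid="dist.jars" conf="default"/>
<copy todir="${dist.dir}/lib">
  <fileset refid="dist.jars"/>
</copy>
```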




[jira] [Commented] (LUCENE-3943) Use ivy cachepath and cachefileset instead of ivy retrieve

2012-04-03 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245613#comment-13245613
 ] 

Hoss Man commented on LUCENE-3943:
--

bq. In my opinion, the ideal situation would be that we pass these filesets 
directly to the zip/tar/gz whatever in the binary release targets

the one catch that occurs to me is the solr example: start.jar, the libraries 
jetty looks for, and the optional jars solr loads by path based on its 
configuration ... we just have to make sure "ant example" takes care of putting 
all those jars where they need to be 

> Use ivy cachepath and cachefileset instead of ivy retrieve
> --
>
> Key: LUCENE-3943
> URL: https://issues.apache.org/jira/browse/LUCENE-3943
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: general/build
>Reporter: Chris Male
>
> In LUCENE-3930 we moved to resolving all external dependencies using 
> ivy:retrieve.  This process places the dependencies into the lib/ folder of 
> the respective modules which was ideal since it replicated the existing build 
> process and limited the number of changes to be made to the build.
> However it can lead to multiple jars for the same dependency in the lib 
> folder when the dependency is upgraded, and just isn't the most efficient way 
> to use Ivy.
> Uwe pointed out that _when working from svn or in using src releases_ we can 
> remove the ivy:retrieve calls and make use of ivy:cachepath and 
> ivy:cachefileset to build our classpaths and packages respectively, which 
> will go some way to addressing these limitations -- however we still need the 
> build system capable of putting the actual jars into specific lib folders 
> when assembling the binary artifacts




[jira] [Commented] (LUCENE-3945) we should include checksums for every jar ivy fetches in svn & src releases to verify the jars are the ones we expect

2012-04-03 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245603#comment-13245603
 ] 

Hoss Man commented on LUCENE-3945:
--

#1: I know that Ivy attempts MD5 & SHA1 verification by default -- but it does 
that verification against checksum files located on the server, so it only 
offers protection against corruption in transit, not against files deliberately 
modified on the server.

#2 i realize that the maintainers of maven repos say "all files are immutable" 
and that this potential risk of malicious or accidental file changes exists for 
all maven users -- but that's the choice all maven users make in accepting it as 
a way of life.  I'm raising this issue only to point out a discrepancy between 
the "confidence" we used to be able to give people who download src releases, vs 
what we have currently with ivy.

> we should include checksums for every jar ivy fetches in svn & src releases 
> to verify the jars are the ones we expect
> -
>
> Key: LUCENE-3945
> URL: https://issues.apache.org/jira/browse/LUCENE-3945
> Project: Lucene - Java
>  Issue Type: Task
>Reporter: Hoss Man
> Fix For: 3.6, 4.0
>
>
> Conversation with rmuir last night got me thinking about the fact that one 
> thing we lose by using ivy is confidence that every user of a release is 
> compiling against (and likely using at run time) the same dependencies as 
> every other user.
> Up to 3.5, users of src and binary releases could be confident that the jars 
> included in the release were the same jars the lucene devs vetted and tested 
> against when voting on the release candidate, but with ivy there is now the 
> possibility that after the source release is published, the owner of a domain 
> where these dependencies are hosted might change the jars in some way w/o 
> anyone knowing.  Likewise: we as developers could commit an ivy.xml file 
> pointing to a specific URL which we then use for and test for months, and 
> just prior to a release, the contents of the remote URL could change such 
> that a JAR included in the binary artifacts might not match the ones we've 
> vetted and tested leading up to that RC.
> So i propose that we include checksum files in svn and in our source releases 
> that can be used by users to verify that the jars they get from ivy match the 
> jars we tested against.




[jira] [Commented] (LUCENE-3943) Use ivy cachepath and cachefileset instead of ivy retrieve

2012-04-03 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245447#comment-13245447
 ] 

Hoss Man commented on LUCENE-3943:
--

bq. In my opinion, the ideal situation would be that we pass these filesets 
directly to the zip/tar/gz whatever in the binary release targets

Ah... ok didn't think of that... +1

> Use ivy cachepath and cachefileset instead of ivy retrieve
> --
>
> Key: LUCENE-3943
> URL: https://issues.apache.org/jira/browse/LUCENE-3943
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: general/build
>Reporter: Chris Male
>
> In LUCENE-3930 we moved to resolving all external dependencies using 
> ivy:retrieve.  This process places the dependencies into the lib/ folder of 
> the respective modules which was ideal since it replicated the existing build 
> process and limited the number of changes to be made to the build.
> However it can lead to multiple jars for the same dependency in the lib 
> folder when the dependency is upgraded, and just isn't the most efficient way 
> to use Ivy.
> Uwe pointed out that _when working from svn or in using src releases_ we can 
> remove the ivy:retrieve calls and make use of ivy:cachepath and 
> ivy:cachefileset to build our classpaths and packages respectively, which 
> will go some way to addressing these limitations -- however we still need the 
> build system capable of putting the actual jars into specific lib folders 
> when assembling the binary artifacts




[jira] [Commented] (SOLR-3200) When using SignatureUpdateProcessor with "all fields" configuration, it will assume only the fields present on the very first document only, ignoring any optional fields

2012-04-02 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244800#comment-13244800
 ] 

Hoss Man commented on SOLR-3200:


Committed revision 1308604. - trunk

still testing backport to 3x

> When using SignatureUpdateProcessor with "all fields" configuration, it will 
> assume only the fields present on the very first document only, ignoring any 
> optional fields in subsequent documents in the signature generation.
> --
>
> Key: SOLR-3200
> URL: https://issues.apache.org/jira/browse/SOLR-3200
> Project: Solr
>  Issue Type: Bug
>  Components: update
>Affects Versions: 1.4, 3.1, 3.2, 3.3, 3.4, 3.5, 4.0
>Reporter: Spyros Kapnissis
>Assignee: Hoss Man
> Fix For: 3.6
>
> Attachments: SOLR-3200.patch
>
>
> This can result in non-duplicate documents being left out of the index. A 
> solution would be that the fields to be used in the signature generation are 
> recalculated with every document inserted.
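A minimal illustration of the recalculation the description asks for (names here are hypothetical, not SignatureUpdateProcessor's actual code): deriving the field set from each document, rather than caching it from the first one, means an optional field present only in a later document still changes that document's signature input.

```java
// Illustrative sketch only: build the signature input from the fields of
// the document at hand, so optional fields in later documents are not
// silently dropped.
import java.util.Map;
import java.util.TreeMap;

public class AllFieldsSignature {
    // doc maps field name -> value; TreeMap gives a stable field order.
    public static String signatureInput(Map<String, String> doc) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : new TreeMap<>(doc).entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append(';');
        }
        return sb.toString();
    }
}
```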




[jira] [Commented] (SOLR-3226) SignatureUpdateProcessor ignores non-string field values from the signature generation

2012-04-02 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244799#comment-13244799
 ] 

Hoss Man commented on SOLR-3226:


Committed revision 1308604. - trunk

...had to make a tweak to schema-luceneMatchVersion.xml to get all tests 
working, however (TestMatchVersions uses the same solrconfig.xml but a different 
schema.xml, so it freaked out about "id" not existing)

still testing the backport to 3x ... there were some other subtle tweaks needed 
there to the test because of branch drift


> SignatureUpdateProcessor ignores non-string field values from the signature 
> generation
> --
>
> Key: SOLR-3226
> URL: https://issues.apache.org/jira/browse/SOLR-3226
> Project: Solr
>  Issue Type: Bug
>  Components: update
>Affects Versions: 1.4, 3.1, 3.2, 3.3, 3.4, 3.5, 4.0
>Reporter: Spyros Kapnissis
>Assignee: Hoss Man
> Fix For: 3.6
>
> Attachments: SOLR-3226.patch, SOLR-3226.patch
>
>
> When using for example XMLUpdateRequestProcessor, the signature is calculated 
> correctly since all field values are strings. But when one uses 
> DataImportHandler or BinaryUpdateRequestHandler, the signature generation 
> will ignore any field values that are ints, longs, dates etc. 
> This might result in overwriting non-similar documents, as it happened in my 
> case while importing some db data through DIH.
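A minimal illustration of the kind of fix implied above (this is a sketch, not the actual SOLR-3226 patch): coerce every value to its string form before it is fed into the signature, instead of skipping anything that isn't already a String.

```java
// Illustrative sketch only: ints, longs, dates, etc. from DIH or the binary
// update handler contribute to the signature via their string form rather
// than being ignored by an instanceof-String check.
public class FieldValueCoercion {
    public static String toSignatureToken(Object value) {
        // String.valueOf(null) yields "null" -- adjust if nulls must be skipped.
        return String.valueOf(value);
    }
}
```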




[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy

2012-03-31 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243266#comment-13243266
 ] 

Hoss Man commented on LUCENE-3930:
--

I did some testing of the "packages" built using trunk (circa r1307608)...

* we don't ship solr's build.xml (or any of the sub-build.xml files) in the 
"binary" artifacts, and with these changes most of the new ivy.xml files are 
also excluded -- but for some reason these newly added files are showing up; we 
should probably figure out why and exclude them as well, since they aren't 
usable and could easily confuse people...
** ./example/example-DIH/ivy.xml
** ./example/example-DIH/build.xml
** ./example/ivy.xml
** ./example/build.xml
* the libs for test-framework (ant, ant-junit, and junit) aren't being 
included in the lucene "binary" artifacts ... for the ant jars this might be 
fine (test-framework doesn't actually have any run-time deps on anything in ant, 
does it?) but it seems like the junit jar should be included, since including 
lucene-test-framework.jar in your classpath is useless w/o also including junit
* "ant ivy-bootstrap" followed by "ant test" using the lucene "source" package 
(lucene-4.0-SNAPSHOT-src.tgz) produces a build failure -- but this may have 
been a problem even before ivy (note the working dir and the final error)...

{noformat}
hossman@bester:~/tmp/ivy-pck-testing/lu/src/lucene-4.0-SNAPSHOT$ ant test
...
[junit] Testsuite: org.apache.lucene.util.junitcompat.TestReproduceMessage
[junit] Tests run: 12, Failures: 0, Errors: 0, Time elapsed: 0.114 sec
[junit] 

test:

compile-lucene-core:

jflex-uptodate-check:

jflex-notice:

javacc-uptodate-check:

javacc-notice:

ivy-availability-check:

ivy-fail:

resolve:
[ivy:retrieve] :: loading settings :: url = 
jar:file:/home/hossman/.ant/lib/ivy-2.2.0.jar!/org/apache/ivy/core/settings/ivysettings.xml

init:

clover.setup:

clover.info:
 [echo] 
 [echo]   Clover not found. Code coverage reports disabled.
 [echo] 

clover:

common.compile-core:
[javac] Compiling 1 source file to 
/home/hossman/tmp/ivy-pck-testing/lu/src/lucene-4.0-SNAPSHOT/build/core/classes/java

compile-core:

compile-test-framework:

ivy-availability-check:

ivy-fail:

resolve:
[ivy:retrieve] :: loading settings :: url = 
jar:file:/home/hossman/.ant/lib/ivy-2.2.0.jar!/org/apache/ivy/core/settings/ivysettings.xml

init:

compile-lucene-core:

compile-core:

compile-test:
 [echo] Building demo...

ivy-availability-check:

ivy-fail:

resolve:
[ivy:retrieve] :: loading settings :: url = 
jar:file:/home/hossman/.ant/lib/ivy-2.2.0.jar!/org/apache/ivy/core/settings/ivysettings.xml

common.init:

compile-lucene-core:

contrib-build.init:

check-lucene-core-uptodate:

jar-lucene-core:

jflex-uptodate-check:

jflex-notice:

javacc-uptodate-check:

javacc-notice:

ivy-availability-check:

ivy-fail:

resolve:
[ivy:retrieve] :: loading settings :: url = 
jar:file:/home/hossman/.ant/lib/ivy-2.2.0.jar!/org/apache/ivy/core/settings/ivysettings.xml

init:

clover.setup:

clover.info:
 [echo] 
 [echo]   Clover not found. Code coverage reports disabled.
 [echo] 

clover:

common.compile-core:
[javac] Compiling 1 source file to 
/home/hossman/tmp/ivy-pck-testing/lu/src/lucene-4.0-SNAPSHOT/build/core/classes/java

compile-core:

jar-core:
  [jar] Building jar: 
/home/hossman/tmp/ivy-pck-testing/lu/src/lucene-4.0-SNAPSHOT/build/core/lucene-core-4.0-SNAPSHOT.jar

init:

compile-test:
 [echo] Building demo...

check-analyzers-common-uptodate:

jar-analyzers-common:

BUILD FAILED
/home/hossman/tmp/ivy-pck-testing/lu/src/lucene-4.0-SNAPSHOT/build.xml:487: The 
following error occurred while executing this line:
/home/hossman/tmp/ivy-pck-testing/lu/src/lucene-4.0-SNAPSHOT/common-build.xml:1026:
 The following error occurred while executing this line:
/home/hossman/tmp/ivy-pck-testing/lu/src/lucene-4.0-SNAPSHOT/contrib/contrib-build.xml:58:
 The following error occurred while executing this line:
/home/hossman/tmp/ivy-pck-testing/lu/src/lucene-4.0-SNAPSHOT/common-build.xml:551:
 Basedir /home/hossman/tmp/ivy-pck-testing/lu/src/modules/analysis/common does 
not exist

Total time: 5 minutes 10 seconds
{noformat}

...it's trying to reach back up out of the working directory into "../modules"

> nuke jars from source tree and use ivy
> --
>
> Key: LUCENE-3930
> URL: https://issues.apache.org/jira/browse/LUCENE-3930
> Project: Lucene - Java
>  Issue Type: Task
>  Components: general/build
>Reporter: Robert Muir
>Assignee: Robert Muir
>Priority: Blocker
> Fix For: 3.6
>
> Attachments: LUCENE-3930-skip-sources-javadoc.patch, 
> LUCENE-3930-solr-example.patch, LUCENE-3930-solr-example.patch, 
> LUCENE-3930.patch, LUCENE-3930.patch, LUCENE-39

[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy

2012-03-28 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240943#comment-13240943
 ] 

Hoss Man commented on LUCENE-3930:
--

as far as the PermGen OOM goes, the -verbose logs show repeated instances of 
"Trying to override old definition of task antlib:org.apache.ivy.ant:..." 
suggesting that rmuir's {{unless="ivy.uptodate"}} isn't working the way we 
think it should (possibly because of the way the various build files are 
included in one another?)

If we can't keep the tasks from being redefined, the Ant typedef manual may be 
relevant: https://ant.apache.org/manual/Tasks/typedef.html
{quote}
If you are defining tasks or types that share the same classpath with multiple 
taskdef or typedef tasks, the corresponding classes will be loaded by different 
Java ClassLoaders. Two classes with the same name loaded via different 
ClassLoaders are not the same class from the point of view of the Java VM, they 
don't share static variables and instances of these classes can't access 
private methods or attributes of instances defined by "the other class" of the 
same name. They don't even belong to the same Java package and can't access 
package private code, either.

The best way to load several tasks/types that are supposed to cooperate with 
each other via shared Java code is to use the resource attribute and an antlib 
descriptor. If this is not possible, the second best option is to use the 
loaderref attribute and specify the same name for each and every 
typedef/taskdef - this way the classes will share the same ClassLoader. Note 
that the typedef/taskdef tasks must use identical classpath defintions (this 
includes the order of path components) for the loaderref attribute to work.
{quote}
...it appears loaderref is just some unique string key to name the classloader? 
(no idea whether whatever is causing the current problem will still plague us 
by creating multiple loaders with the same name)...
https://svn.apache.org/viewvc/ant/core/tags/ANT_171/src/etc/testcases/core/loaderref/
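If the loaderref route were tried, it might look something like this (a sketch against the Ant typedef docs quoted above, not a tested change to the Lucene build; the loaderref name is arbitrary):

```xml
<!-- Hypothetical sketch: give every inclusion of the ivy antlib the same
     loaderref and an identical classpath, so the classes are loaded by a
     single ClassLoader instead of once per included build file. -->
<taskdef uri="antlib:org.apache.ivy.ant"
         resource="org/apache/ivy/ant/antlib.xml"
         classpath="${ivy.jar.file}"
         loaderref="ivy.loader"/>
```

Per the quoted docs, the classpath definitions (including ordering) must be identical everywhere the loaderref is used.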

> nuke jars from source tree and use ivy
> --
>
> Key: LUCENE-3930
> URL: https://issues.apache.org/jira/browse/LUCENE-3930
> Project: Lucene - Java
>  Issue Type: Task
>  Components: general/build
>Reporter: Robert Muir
>Assignee: Robert Muir
>Priority: Blocker
> Fix For: 3.6
>
> Attachments: LUCENE-3930-solr-example.patch, 
> LUCENE-3930-solr-example.patch, LUCENE-3930.patch, LUCENE-3930.patch, 
> LUCENE-3930.patch, ant_-verbose_clean_test.out.txt
>
>
> As mentioned on the ML thread: "switch jars to ivy mechanism?".




[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy

2012-03-28 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240918#comment-13240918
 ] 

Hoss Man commented on LUCENE-3930:
--

here's the tail end of "ant -verbose clean test"...

{noformat}
hossman@bester:~/lucene/branch_lucene3930$ ant -verbose clean test

...

compile-test-framework:
Skipped because property 'lucene.test.framework.compiled' set.

common.compile-test:
[mkdir] Skipping 
/home/hossman/lucene/branch_lucene3930/lucene/build/contrib/demo/classes/test 
because it already exists.
Property "run.clover" has not been set
[javac] org/apache/lucene/demo/TestDemo.java omitted as 
/home/hossman/lucene/branch_lucene3930/lucene/build/contrib/demo/classes/test/org/apache/lucene/demo/TestDemo.class
 is up to date.
[javac] 
/home/hossman/lucene/branch_lucene3930/lucene/contrib/demo/src/test/org/apache/lucene/demo/test-files/docs/apache1.0.txt
 skipped - don't know how to handle it
[javac] 
/home/hossman/lucene/branch_lucene3930/lucene/contrib/demo/src/test/org/apache/lucene/demo/test-files/docs/apache1.1.txt
 skipped - don't know how to handle it
[javac] 
/home/hossman/lucene/branch_lucene3930/lucene/contrib/demo/src/test/org/apache/lucene/demo/test-files/docs/apache2.0.txt
 skipped - don't know how to handle it
[javac] 
/home/hossman/lucene/branch_lucene3930/lucene/contrib/demo/src/test/org/apache/lucene/demo/test-files/docs/cpl1.0.txt
 skipped - don't know how to handle it
[javac] 
/home/hossman/lucene/branch_lucene3930/lucene/contrib/demo/src/test/org/apache/lucene/demo/test-files/docs/epl1.0.txt
 skipped - don't know how to handle it
[javac] 
/home/hossman/lucene/branch_lucene3930/lucene/contrib/demo/src/test/org/apache/lucene/demo/test-files/docs/freebsd.txt
 skipped - don't know how to handle it
[javac] 
/home/hossman/lucene/branch_lucene3930/lucene/contrib/demo/src/test/org/apache/lucene/demo/test-files/docs/gpl1.0.txt
 skipped - don't know how to handle it
[javac] 
/home/hossman/lucene/branch_lucene3930/lucene/contrib/demo/src/test/org/apache/lucene/demo/test-files/docs/gpl2.0.txt
 skipped - don't know how to handle it
[javac] 
/home/hossman/lucene/branch_lucene3930/lucene/contrib/demo/src/test/org/apache/lucene/demo/test-files/docs/gpl3.0.txt
 skipped - don't know how to handle it
[javac] 
/home/hossman/lucene/branch_lucene3930/lucene/contrib/demo/src/test/org/apache/lucene/demo/test-files/docs/lgpl2.1.txt
 skipped - don't know how to handle it
[javac] 
/home/hossman/lucene/branch_lucene3930/lucene/contrib/demo/src/test/org/apache/lucene/demo/test-files/docs/lgpl3.txt
 skipped - don't know how to handle it
[javac] 
/home/hossman/lucene/branch_lucene3930/lucene/contrib/demo/src/test/org/apache/lucene/demo/test-files/docs/lpgl2.0.txt
 skipped - don't know how to handle it
[javac] 
/home/hossman/lucene/branch_lucene3930/lucene/contrib/demo/src/test/org/apache/lucene/demo/test-files/docs/mit.txt
 skipped - don't know how to handle it
[javac] 
/home/hossman/lucene/branch_lucene3930/lucene/contrib/demo/src/test/org/apache/lucene/demo/test-files/docs/mozilla1.1.txt
 skipped - don't know how to handle it
[javac] 
/home/hossman/lucene/branch_lucene3930/lucene/contrib/demo/src/test/org/apache/lucene/demo/test-files/docs/mozilla_eula_firefox3.txt
 skipped - don't know how to handle it
[javac] 
/home/hossman/lucene/branch_lucene3930/lucene/contrib/demo/src/test/org/apache/lucene/demo/test-files/docs/mozilla_eula_thunderbird2.txt
 skipped - don't know how to handle it
 [copy] org/apache/lucene/demo/test-files/docs/apache1.0.txt omitted as 
/home/hossman/lucene/branch_lucene3930/lucene/build/contrib/demo/classes/test/org/apache/lucene/demo/test-files/docs/apache1.0.txt
 is up to date.
 [copy] org/apache/lucene/demo/test-files/docs/apache1.1.txt omitted as 
/home/hossman/lucene/branch_lucene3930/lucene/build/contrib/demo/classes/test/org/apache/lucene/demo/test-files/docs/apache1.1.txt
 is up to date.
 [copy] org/apache/lucene/demo/test-files/docs/apache2.0.txt omitted as 
/home/hossman/lucene/branch_lucene3930/lucene/build/contrib/demo/classes/test/org/apache/lucene/demo/test-files/docs/apache2.0.txt
 is up to date.
 [copy] org/apache/lucene/demo/test-files/docs/cpl1.0.txt omitted as 
/home/hossman/lucene/branch_lucene3930/lucene/build/contrib/demo/classes/test/org/apache/lucene/demo/test-files/docs/cpl1.0.txt
 is up to date.
 [copy] org/apache/lucene/demo/test-files/docs/epl1.0.txt omitted as 
/home/hossman/lucene/branch_lucene3930/lucene/build/contrib/demo/classes/test/org/apache/lucene/demo/test-files/docs/epl1.0.txt
 is up to date.
 [copy] org/apache/lucene/demo/test-files/docs/freebsd.txt omitted as 
/home/hossman/lucene/branch_lucene3930/lucene/build/contrib/demo/classes/test/org/apache/lucene/demo/test-files/docs/freebsd.txt
 is up to date.
 [copy] org/apache/lucene/demo/test-files/docs/g

[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy

2012-03-28 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240912#comment-13240912
 ] 

Hoss Man commented on LUCENE-3930:
--

I did a {{touch 
/home/hossman/lucene/branch_lucene3930/lucene/ivy/ivy-LICENSE-FAKE.txt}} to 
work around the license checker and this time i wound up with an OOM...

{noformat}
hossman@bester:~/lucene/branch_lucene3930$ ant clean test

common.compile-test:
[mkdir] Created dir: 
/home/hossman/lucene/branch_lucene3930/lucene/build/contrib/sandbox/classes/test
[javac] Compiling 6 source files to 
/home/hossman/lucene/branch_lucene3930/lucene/build/contrib/sandbox/classes/test
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
 [copy] Copied 1 empty directory to 1 empty directory under 
/home/hossman/lucene/branch_lucene3930/lucene/build/contrib/sandbox/classes/test

test-contrib:
 [echo] Building demo...

download-ivy:

install-ivy:

resolve:
[ivy:retrieve] :: Ivy 2.2.0 - 20100923230623 :: http://ant.apache.org/ivy/ ::
[ivy:retrieve] :: loading settings :: url = 
jar:file:/home/hossman/lucene/branch_lucene3930/lucene/ivy/ivy-2.2.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
java.lang.OutOfMemoryError: PermGen space
java.lang.OutOfMemoryError: PermGen space
at java.lang.Throwable.getStackTraceElement(Native Method)
at java.lang.Throwable.getOurStackTrace(Throwable.java:591)
at java.lang.Throwable.printStackTrace(Throwable.java:462)
at java.lang.Throwable.printStackTrace(Throwable.java:451)
at org.apache.tools.ant.Main.startAnt(Main.java:230)
at org.apache.tools.ant.launch.Launcher.run(Launcher.java:257)
at org.apache.tools.ant.launch.Launcher.main(Launcher.java:104)
{noformat}

running with "-verbose" to see if i can get more details on exactly where/why 
the OOM is happening


> nuke jars from source tree and use ivy
> --
>
> Key: LUCENE-3930
> URL: https://issues.apache.org/jira/browse/LUCENE-3930
> Project: Lucene - Java
>  Issue Type: Task
>  Components: general/build
>Reporter: Robert Muir
>Assignee: Robert Muir
>Priority: Blocker
> Fix For: 3.6
>
> Attachments: LUCENE-3930-solr-example.patch, 
> LUCENE-3930-solr-example.patch, LUCENE-3930.patch, LUCENE-3930.patch, 
> LUCENE-3930.patch
>
>
> As mentioned on the ML thread: "switch jars to ivy mechanism?".




[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy

2012-03-28 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240864#comment-13240864
 ] 

Hoss Man commented on LUCENE-3930:
--

FWIW i did a completely clean checkout of the lucene3930 branch (r1306662) and 
got the following build failure trying to run "ant clean test" from the top level.

the ivy bootstrapping doesn't seem to play nicely with the license checker...

{noformat}
hossman@bester:~/lucene/branch_lucene3930$ ant clean test
Buildfile: build.xml

clean:

clean:

clean:

clean:
 [echo] Building analyzers-common...

clean:
 [echo] Building analyzers-icu...

clean:
 [echo] Building analyzers-kuromoji...

clean:
 [echo] Building analyzers-morfologik...

clean:
 [echo] Building analyzers-phonetic...

clean:
 [echo] Building analyzers-smartcn...

clean:
 [echo] Building analyzers-stempel...

clean:
 [echo] Building analyzers-uima...

clean:
 [echo] Building benchmark...

clean:
 [echo] Building facet...

clean:
 [echo] Building grouping...

clean:
 [echo] Building join...

clean:
 [echo] Building queries...

clean:
 [echo] Building queryparser...

clean:
 [echo] Building spatial...

clean:
 [echo] Building suggest...

clean:
 [echo] Building solr...

clean:

validate:

compile-tools:

download-ivy:
[mkdir] Created dir: /home/hossman/lucene/branch_lucene3930/lucene/ivy
 [echo] 
 [echo]   NOTE: You do not have ivy installed, so downloading it...
 [echo]   If you make lots of checkouts, download ivy-2.2.0.jar 
yourself 
 [echo]   and set ivy.jar.file to this in your ~/build.properties
 [echo]   
  [get] Getting: 
http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.2.0/ivy-2.2.0.jar
  [get] To: /home/hossman/lucene/branch_lucene3930/lucene/ivy/ivy-2.2.0.jar

install-ivy:

resolve:
[ivy:retrieve] :: Ivy 2.2.0 - 20100923230623 :: http://ant.apache.org/ivy/ ::
[ivy:retrieve] :: loading settings :: url = 
jar:file:/home/hossman/lucene/branch_lucene3930/lucene/ivy/ivy-2.2.0.jar!/org/apache/ivy/core/settings/ivysettings.xml

init:

compile-core:
[mkdir] Created dir: 
/home/hossman/lucene/branch_lucene3930/lucene/build/tools/classes/java
[javac] Compiling 2 source files to 
/home/hossman/lucene/branch_lucene3930/lucene/build/tools/classes/java
 [copy] Copying 1 file to 
/home/hossman/lucene/branch_lucene3930/lucene/build/tools/classes/java

validate:
 [echo] License check under: /home/hossman/lucene/branch_lucene3930/lucene
 [licenses] MISSING LICENSE for the following file:
 [licenses]   /home/hossman/lucene/branch_lucene3930/lucene/ivy/ivy-2.2.0.jar
 [licenses]   Expected locations below:
 [licenses]   => 
/home/hossman/lucene/branch_lucene3930/lucene/ivy/ivy-LICENSE-ASL.txt
 [licenses]   => 
/home/hossman/lucene/branch_lucene3930/lucene/ivy/ivy-LICENSE-BSD.txt
 [licenses]   => 
/home/hossman/lucene/branch_lucene3930/lucene/ivy/ivy-LICENSE-BSD_LIKE.txt
 [licenses]   => 
/home/hossman/lucene/branch_lucene3930/lucene/ivy/ivy-LICENSE-CDDL.txt
 [licenses]   => 
/home/hossman/lucene/branch_lucene3930/lucene/ivy/ivy-LICENSE-CPL.txt
 [licenses]   => 
/home/hossman/lucene/branch_lucene3930/lucene/ivy/ivy-LICENSE-EPL.txt
 [licenses]   => 
/home/hossman/lucene/branch_lucene3930/lucene/ivy/ivy-LICENSE-MIT.txt
 [licenses]   => 
/home/hossman/lucene/branch_lucene3930/lucene/ivy/ivy-LICENSE-MPL.txt
 [licenses]   => 
/home/hossman/lucene/branch_lucene3930/lucene/ivy/ivy-LICENSE-PD.txt
 [licenses]   => 
/home/hossman/lucene/branch_lucene3930/lucene/ivy/ivy-LICENSE-SUN.txt
 [licenses]   => 
/home/hossman/lucene/branch_lucene3930/lucene/ivy/ivy-LICENSE-COMPOUND.txt
 [licenses]   => 
/home/hossman/lucene/branch_lucene3930/lucene/ivy/ivy-LICENSE-FAKE.txt
 [licenses] Scanned 1 JAR file(s) for licenses (in 0.31s.), 1 error(s).

BUILD FAILED
/home/hossman/lucene/branch_lucene3930/build.xml:42: The following error 
occurred while executing this line:
/home/hossman/lucene/branch_lucene3930/lucene/build.xml:178: The following 
error occurred while executing this line:
/home/hossman/lucene/branch_lucene3930/lucene/tools/custom-tasks.xml:22: 
License check failed. Check the logs.

Total time: 5 seconds
{noformat}

> nuke jars from source tree and use ivy
> --
>
> Key: LUCENE-3930
> URL: https://issues.apache.org/jira/browse/LUCENE-3930
> Project: Lucene - Java
>  Issue Type: Task
>  Components: general/build
>Reporter: Robert Muir
>Assignee: Robert Muir
>Priority: Blocker
> Fix For: 3.6
>
> Attachments: LUCENE-3930-solr-example.patch, 
> LUCENE-3930-solr-example.patch, LUCENE-3930.patch, LUCENE-3930.patch, 
> LUCENE-3930.patch
>
>
> As mentioned on the ML thread: "switch jars to ivy mechanism?".

--
This message is automatically generated by JIRA.

[jira] [Commented] (SOLR-2724) Deprecate defaultSearchField and defaultOperator defined in schema.xml

2012-03-28 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240847#comment-13240847
 ] 

Hoss Man commented on SOLR-2724:


Whatever happens with this issue, please note SOLR-3292 and commit r1306642 
that was needed on the 3x branch to keep all aspects of the example working.

(i didn't make changes on trunk for SOLR-3292 since miller had already reverted 
SOLR-2724 there; SOLR-3292's changes should be included if/when SOLR-2724 is 
re-applied)

> Deprecate defaultSearchField and defaultOperator defined in schema.xml
> --
>
> Key: SOLR-2724
> URL: https://issues.apache.org/jira/browse/SOLR-2724
> Project: Solr
>  Issue Type: Improvement
>  Components: Schema and Analysis, search
>Reporter: David Smiley
>Assignee: David Smiley
>Priority: Minor
> Fix For: 3.6, 4.0
>
> Attachments: 
> SOLR-2724_deprecateDefaultSearchField_and_defaultOperator.patch
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> I've always been surprised to see the defaultSearchField element and 
> defaultOperator defined in the schema.xml file since 
> the first time I saw them.  They just seem out of place to me since they are 
> more query parser related than schema related. But not only are they 
> misplaced, I feel they shouldn't exist. For query parsers, we already have a 
> "df" parameter that works just fine, and explicit field references. And the 
> default lucene query operator should stay at OR -- if a particular query 
> wants different behavior then use q.op or simply use "OR".
>  Seems like something better placed in solrconfig.xml than in the 
> schema. 
> In my opinion, defaultSearchField and defaultOperator configuration elements 
> should be deprecated in Solr 3.x and removed in Solr 4.  And  
> should move to solrconfig.xml. I am willing to do it, provided there is 
> consensus on it of course.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3287) 3x tutorial tries to demo schema features that don't work with 3x schema

2012-03-27 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240092#comment-13240092
 ] 

Hoss Man commented on SOLR-3287:


I don't have a "great" suggestion for dealing with this.  Fundamentally it 
comes down to a conflict between trying to make the field types used by the 
example fields general and generic enough to be useful for any language so 
people can re-use them, vs having fields in the example that let us show off 
some features that aren't necessarily things all users will want in all of 
their text fields if they copy the schema.

we could use copyField to create "_en" versions of all these fields, but this 
type of solution has also led to confusion/problems in the past, with people 
leaving those copyFields in the schema.xml when they copy it, and winding up 
with indexes that are twice as big as they need to be.

My best suggestions are:

* For the search links in #1:
** leave the verbiage as is, but maybe put this line in bold: *Go ahead and edit 
the schema.xml under the solr/example/solr/conf directory, and change the type 
for fields text and features from text_general to text_en_splitting* ... i 
would also suggest changing it to: *Go ahead and edit the schema.xml under the 
solr/example/solr/conf directory to use type="text_en_splitting" for the fields 
"text" and "features"*
** include a code box showing an example of what the field declarations 
will look like in XML if the user makes these changes
** i think we should also change the example queries so they aren't actually 
links -- just show the query syntax.  my thinking being that this will act as a 
mental cue that these are examples of valid queries, but they don't work "out 
of the box"
* For the analysis.jsp link in #2: i think we should switch from using the 
"name=name" and "name=text" params to using "type=text_en" (with a tweak in 
verbiage to make it clear what the URLs are showing) so these work even if the 
user doesn't edit the schema.

Anyone have any better ideas?
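For the "code box" bullet above, one possible sketch of the edited field declarations (the name and type values come from the tutorial text; the indexed/stored/multiValued attributes are guesses at the example schema's settings, not verified against it):

```xml
<!-- schema.xml: the two tutorial fields switched to the English-specific type.
     Attribute values other than name/type are illustrative guesses. -->
<field name="text" type="text_en_splitting" indexed="true" stored="false" multiValued="true"/>
<field name="features" type="text_en_splitting" indexed="true" stored="true" multiValued="true"/>
```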


> 3x tutorial tries to demo schema features that don't work with 3x schema
> 
>
> Key: SOLR-3287
> URL: https://issues.apache.org/jira/browse/SOLR-3287
> Project: Solr
>  Issue Type: Bug
>Reporter: Hoss Man
>Priority: Blocker
> Fix For: 3.6
>
>
> I just audited the tutorial on the 3x branch to ensure everything would work 
> for the 3.6 release, and ran into two sections where things were very 
> confusing and seemed broken to me (even as a solr expert)
> https://svn.apache.org/repos/asf/lucene/dev/branches/branch_3x/solr/core/src/java/doc-files/tutorial.html
> 1) "Text Analysis" of the 5 queries in this section, only the "pixima" 
> example works (power-shot matches documents but not the ones the tutorial 
> suggests it should, and for different reasons).  The lead in para does 
> explain that you have to edit your schema.xml in order for these links to 
> work -- but it's confusing, and i honestly read it 3 times before i realized 
> what it was saying (the first two times i thought it was saying that 
> _because_ the content is in english, english specific field types are used, 
> and you can change those to text_general if you don't use english)
> Bottom line: the links are confusing since they don't work "out of the box" 
> with the simple commands shown so far
> {panel}
> If you know your textual content is English, as is the case for the example 
> documents in this tutorial, and you'd like to apply English-specific stemming 
> and stop word removal, as well as split compound words, you can use the 
> text_en_splitting fieldType instead. Go ahead and edit the schema.xml under 
> the solr/example/solr/conf directory, and change the type for fields text and 
> features from text_general to text_en_splitting. Restart the server and then 
> re-post all of the documents, and then these queries will show the 
> English-specific transformations: 
> * A search for power-shot matches PowerShot, and adata matches A-DATA due to 
> the use of WordDelimiterFilter and LowerCaseFilter.
> * A search for features:recharging matches Rechargeable due to stemming with 
> the EnglishPorterFilter.
> * A search for "1 gigabyte" matches things with GB, and the misspelled pixima 
> matches Pixma due to use of a SynonymFilter.
> {panel}
> * http://localhost:8983/solr/select/?indent=on&q=power-shot&fl=name
> * http://localhost:8983/solr/select/?indent=on&q=adata&fl=name
> * 
> http://localhost:8983/solr/select/?indent=on&q=features:recharging&fl=name,features
> * http://localhost:8983/solr/select/?indent=on&q=%221%20gigabyte%22&fl=name
> * http://localhost:8983/solr/select/?indent=on&q=pixima&fl=name
> 2) "Analysis Debugging"
> Likewise, all of the analysis.jsp example URLs attempt to sh

[jira] [Commented] (SOLR-435) QParser must validate existence/absence of "q" parameter

2012-03-27 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239605#comment-13239605
 ] 

Hoss Man commented on SOLR-435:
---

bq. Why not simply apply the SOLR-2001 patch for consistent behavior?

good question ... if you're cool with that then it seems okay to me (although 
off the top of my head i think when i was looking at trunk one of those 4 
"main" parsers still needed to be "fixed" to return null).

in general my suggestion for 3.6 was largely based on the fact that there was 
still active discussion about what the best long term behavior was, which might 
contradict what was discussed in SOLR-2001, so better to play it safe and just 
clean up the error reporting: "I'd rather leave things the way they are than 
make a bad decision in a hurry"

if you want to backport SOLR-2001 and sanity check that 
lucene/dismax/edismax/lucenePlusSort all return null on null/blank query 
strings i'm +1 to that (that seems consistent with what ryan/male/me were 
advocating here, so as long as you're on board i think we're good)


> QParser must validate existence/absence of "q" parameter
> 
>
> Key: SOLR-435
> URL: https://issues.apache.org/jira/browse/SOLR-435
> Project: Solr
>  Issue Type: Bug
>  Components: search
>Affects Versions: 1.3
>Reporter: Ryan McKinley
>Assignee: David Smiley
> Fix For: 3.6, 4.0
>
> Attachments: SOLR-435.patch, SOLR-435_3x_consistent_errors.patch, 
> SOLR-435_q_defaults_to_all-docs.patch
>
>
> Each QParser should check if "q" exists or not.  For some it will be required 
> others not.
> currently it throws a null pointer:
> {code}
> java.lang.NullPointerException
>   at org.apache.solr.common.util.StrUtils.splitSmart(StrUtils.java:36)
>   at 
> org.apache.solr.search.OldLuceneQParser.parse(LuceneQParserPlugin.java:104)
>   at org.apache.solr.search.QParser.getQuery(QParser.java:80)
>   at 
> org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:67)
>   at 
> org.apache.solr.handler.SearchHandler.handleRequestBody(SearchHandler.java:150)
> ...
> {code}
> see:
> http://www.nabble.com/query-parsing-error-to14124285.html#a14140108




[jira] [Commented] (SOLR-435) QParser must validate existence/absence of "q" parameter

2012-03-26 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239087#comment-13239087
 ] 

Hoss Man commented on SOLR-435:
---

bq. I agree but I also think we should commit the improved error message 
suggested by David so that we avoid the unhelpful NPE. Any broader changes will 
be in 4.0 so we don't have a backwards compat problem.

Grrr... yes, i see ... SOLR-2001 is only on trunk, somehow i overlooked that 
and it contributed to my confusion as to some of the comments in this issue.

So instead of the NPEs or whatnot that you get in 3.5 from various parsers, we 
switch to a consistent 'new ParseException("missing query string");' in 3.6, and 
address whether there can be better default handling in 4.0 (continuing what 
SOLR-2001 started)

+1

> QParser must validate existence/absence of "q" parameter
> 
>
> Key: SOLR-435
> URL: https://issues.apache.org/jira/browse/SOLR-435
> Project: Solr
>  Issue Type: Bug
>  Components: search
>Affects Versions: 1.3
>Reporter: Ryan McKinley
>Assignee: David Smiley
> Fix For: 3.6, 4.0
>
> Attachments: SOLR-435.patch, SOLR-435_q_defaults_to_all-docs.patch
>
>
> Each QParser should check if "q" exists or not.  For some it will be required 
> others not.
> currently it throws a null pointer:
> {code}
> java.lang.NullPointerException
>   at org.apache.solr.common.util.StrUtils.splitSmart(StrUtils.java:36)
>   at 
> org.apache.solr.search.OldLuceneQParser.parse(LuceneQParserPlugin.java:104)
>   at org.apache.solr.search.QParser.getQuery(QParser.java:80)
>   at 
> org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:67)
>   at 
> org.apache.solr.handler.SearchHandler.handleRequestBody(SearchHandler.java:150)
> ...
> {code}
> see:
> http://www.nabble.com/query-parsing-error-to14124285.html#a14140108




[jira] [Commented] (SOLR-435) QParser must validate existence/absence of "q" parameter

2012-03-26 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239023#comment-13239023
 ] 

Hoss Man commented on SOLR-435:
---

bq. Again I agree. But I'm just not sure if that validation / error checking 
should involve checking alternative parameters. That feels like its defeating 
the goal of QParsers working in all situations.

Not sure i see the problem ... part of the advantage in how q.alt is 
implemented now is that you can put things like...
{noformat}
{!dismax q.alt=*:* v=$keywords}
{noformat}
...into "appends" params in your solrconfig.  by default nothing is filtered, 
but if the client provides a "keywords" param then it's used.
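As a sketch of that pattern (the handler name and qf list here are hypothetical, not taken from this issue):

```xml
<!-- solrconfig.xml: an "appends" fq that is a no-op unless the client sends
     a keywords param -- q.alt=*:* matches all documents whenever the
     dereferenced query string ($keywords) is missing or blank -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="appends">
    <str name="fq">{!dismax qf=title q.alt=*:* v=$keywords}</str>
  </lst>
</requestHandler>
```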

bq. I just also wonder whether down the line we want better error messages here 
too. David's suggestion for "missing query string" aligns with other such 
messages

It wouldn't have to ... parse() can throw ParseExceptions and QueryComponent 
(or whatever delegated to the QParser) can wrap that in a user error 
(QueryComponent already does that)


> QParser must validate existence/absence of "q" parameter
> 
>
> Key: SOLR-435
> URL: https://issues.apache.org/jira/browse/SOLR-435
> Project: Solr
>  Issue Type: Bug
>  Components: search
>Affects Versions: 1.3
>Reporter: Ryan McKinley
>Assignee: David Smiley
> Fix For: 3.6, 4.0
>
> Attachments: SOLR-435_q_defaults_to_all-docs.patch
>
>
> Each QParser should check if "q" exists or not.  For some it will be required 
> others not.
> currently it throws a null pointer:
> {code}
> java.lang.NullPointerException
>   at org.apache.solr.common.util.StrUtils.splitSmart(StrUtils.java:36)
>   at 
> org.apache.solr.search.OldLuceneQParser.parse(LuceneQParserPlugin.java:104)
>   at org.apache.solr.search.QParser.getQuery(QParser.java:80)
>   at 
> org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:67)
>   at 
> org.apache.solr.handler.SearchHandler.handleRequestBody(SearchHandler.java:150)
> ...
> {code}
> see:
> http://www.nabble.com/query-parsing-error-to14124285.html#a14140108




[jira] [Commented] (SOLR-3268) remove write acess to source tree (chmod 555) when running tests in jenkins

2012-03-26 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238710#comment-13238710
 ] 

Hoss Man commented on SOLR-3268:


i think you're talking about SOLR-3112 which was dealt with ... but even if 
there are others, we can start by adding this check now, and then file issues 
to fix & remove whatever is left.

this isn't a silver bullet, it's certainly not as good as actually locking down 
src to fail on writes, but it will at least force people to be aware if/when 
they add a test that pollutes src/

> remove write acess to source tree (chmod 555) when running tests in jenkins
> ---
>
> Key: SOLR-3268
> URL: https://issues.apache.org/jira/browse/SOLR-3268
> Project: Solr
>  Issue Type: Bug
>Reporter: Robert Muir
>Assignee: Robert Muir
> Fix For: 4.0
>
> Attachments: SOLR-3268_sync.patch
>
>
> Some tests are currently creating files under the source tree.
> This causes a lot of problems, it makes my checkout look dirty after running 
> 'ant test' and i have to cleanup.
> I opened an issue for this a month and a half ago for 
> solrj/src/test-files/solrj/solr/shared/test-solr.xml (SOLR-3112), 
> but now we have a second file 
> (core/src/test-files/solr/conf/elevate-data-distrib.xml).
> So I think hudson needs to chmod these src directories to 555, so that solr 
> tests that do this will fail.




[jira] [Commented] (SOLR-3268) remove write acess to source tree (chmod 555) when running tests in jenkins

2012-03-26 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238685#comment-13238685
 ] 

Hoss Man commented on SOLR-3268:


if locking down src/ so tests can't make changes is tricky to do safely, 
perhaps we could do something simpler to get us part way towards the ultimate 
goal? ... add a final step to the jenkins build script that fails if "svn 
status | wc -l" returns non-zero?

it won't ensure no changes are made to src/, but it should ensure no changes 
are made to src/ unless explicitly allowed by an svn:ignore ... then we just 
have to (remove any existing svn:ignore under /src and) make sure we publicly 
shame anyone who adds svn:ignores to src/ because they wrote a sloppy test.
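A minimal sketch of that final jenkins step (the guard logic only; it assumes it runs at the root of the svn checkout after 'ant test'):

```shell
# Fail the build when `svn status` reports anything: some test wrote into
# the source tree (or the pollution was hidden behind an svn:ignore).
fail_if_dirty() {
  # $1 is the output of `svn status` -- one line per changed/unversioned file
  if [ -n "$1" ]; then
    echo "source tree modified during tests"
    return 1
  fi
  echo "source tree clean"
}

# In the real jenkins script this would be: fail_if_dirty "$(svn status)"
fail_if_dirty ""
```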

> remove write acess to source tree (chmod 555) when running tests in jenkins
> ---
>
> Key: SOLR-3268
> URL: https://issues.apache.org/jira/browse/SOLR-3268
> Project: Solr
>  Issue Type: Bug
>Reporter: Robert Muir
>Assignee: Robert Muir
> Fix For: 4.0
>
> Attachments: SOLR-3268_sync.patch
>
>
> Some tests are currently creating files under the source tree.
> This causes a lot of problems, it makes my checkout look dirty after running 
> 'ant test' and i have to cleanup.
> I opened an issue for this a month and a half ago for 
> solrj/src/test-files/solrj/solr/shared/test-solr.xml (SOLR-3112), 
> but now we have a second file 
> (core/src/test-files/solr/conf/elevate-data-distrib.xml).
> So I think hudson needs to chmod these src directories to 555, so that solr 
> tests that do this will fail.




[jira] [Commented] (SOLR-435) QParser must validate existance/absense of "q" parameter

2012-03-26 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238538#comment-13238538
 ] 

Hoss Man commented on SOLR-435:
---

bq. If the purpose of the QueryComponent is to be QParser agnostic and 
consequently unable to know if the 'q' parameter is even relevant, shouldn't it 
be up to the QParser to retrieve what it believes the query string to be from 
the request parameters?

Sorry ... i chose my words carelessly and wound up saying almost the exact 
opposite of what i meant.

What i should have said...

* QueryComponent is responsible for determining the QParser to use for the main 
query and passing the value of the q query-string param to the 
QParser.getParser(...) method
* QParser.getParser passes that query-string on to whatever QParserPlugin was 
selected, as the "qstr" param to createParser
* The QParser that gets created by the createParser call should do whatever 
validation it needs to do (including a null check) in its parse() method

In answer to your questions...

* QueryComponent can not do any validation of the q param, because it can't 
make any assumptions about what values are legal for the defType QParser -- 
not even a null check, because in the case of things like dismax null is 
perfectly fine
* QParsers (and QParserPlugins) can't be made responsible for fetching the q 
param because they don't know if/when they are being used to parse the main 
query param, vs fq params, vs some other nested subquery
* by putting this kind of validation/error checking in the QParser.parse 
method, we ensure that it is used properly even when the QParser(s) are used 
for things like 'fq' params or in nested subqueries
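The division of labor in the bullets above can be sketched in plain Java (the class and method names are stand-ins for illustration, not Solr's actual API):

```java
// Sketch: the parser that requires a query string validates it inside its own
// parse(), so the check also fires for fq params and nested subqueries.
public class QParserSketch {
    abstract static class QParser {
        protected final String qstr;  // may legitimately be null for some parsers
        QParser(String qstr) { this.qstr = qstr; }
        abstract String parse();
    }

    static class LuceneLikeQParser extends QParser {
        LuceneLikeQParser(String qstr) { super(qstr); }
        @Override
        String parse() {
            if (qstr == null || qstr.trim().isEmpty()) {
                throw new IllegalArgumentException("missing query string");
            }
            return "parsed:" + qstr;
        }
    }

    public static void main(String[] args) {
        // The caller (QueryComponent's analogue) does no validation at all.
        System.out.println(new LuceneLikeQParser("solr").parse());
    }
}
```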

bq. Hoss: I don't agree with your reasoning on the developer-user typo-ing the 
'q' parameter. If you mistype basically any parameter then clearly it is as if 
you didn't even specify that parameter and you get the default behavior of the 
parameter you were trying to type correctly but didn't. 

understood ... but in most other situations the "default" behavior is either "do 
nothing" or "error" ... we don't have a lot of default behaviors which are 
"give me tons of stuff" ... if you use {{facet=true&faceet.field=foo}} (note 
the extra character) you don't silently get faceting on every field as a 
default -- you get no field faceting at all. if you mistype the q param name 
and get an error on your first attempt you immediately understand you did 
something wrong.  likewise if we made the default a "matches nothing" query, 
then you'd get no results and (hopefully) be suspicious enough to realize you 
made a mistake -- but if we give you a bunch of results by default you may not 
realize at all that you're looking at all results, not just the results of what 
you thought the query was.  the only situations i can think of where forgetting 
or mistyping a param name doesn't default to error or nothing are things with 
fixed expectations: start, rows, fl, etc...  Those have defaults that (if they 
don't match what you tried to specify) are immediately obvious ... the 'start' 
attribute on the docList returned is wrong, you get more results than you 
expected, you get field names you know you didn't specify, etc...  it's less 
obvious when you are looking at the results of a query that it's a match-all 
query instead of the query you thought you were specifying.

like i said ... i'm -0 to having a hardcoded default query for 
lucene/dismax/edismax ... if you feel strongly about it that's fine, although 
i would try to convince you "match none" is a better hardcoded default than 
'match all' (so that it's easier to recognize mistakes quickly) and i really 
don't think we should do it w/o also adding q.alt support to the LuceneQParser 
so people can override it.



> QParser must validate existance/absense of "q" parameter
> 
>
> Key: SOLR-435
> URL: https://issues.apache.org/jira/browse/SOLR-435
> Project: Solr
>  Issue Type: Bug
>  Components: search
>Affects Versions: 1.3
>Reporter: Ryan McKinley
>Assignee: Ryan McKinley
> Fix For: 3.6, 4.0
>
> Attachments: SOLR-435_q_defaults_to_all-docs.patch
>
>
> Each QParser should check if "q" exists or not.  For some it will be required 
> others not.
> currently it throws a null pointer:
> {code}
> java.lang.NullPointerException
>   at org.apache.solr.common.util.StrUtils.splitSmart(StrUtils.java:36)
>   at 
> org.apache.solr.search.OldLuceneQParser.parse(LuceneQParserPlugin.java:104)
>   at org.apache.solr.search.QParser.getQuery(QParser.java:80)
>   at 
> org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:67)
>   at 
> org.apache.solr.handler.SearchHandler.handleRequestBody(SearchHandler.java:150)
> 

[jira] [Commented] (SOLR-435) QParser must validate existance/absense of "q" parameter

2012-03-23 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13237366#comment-13237366
 ] 

Hoss Man commented on SOLR-435:
---

bq. if no query string is supplied, or if its blank or just whitespace, then 
the default is to match all documents. 

-0 ... the risk with this approach is that (new) users who make typos in 
queries or are misinformed about the name of the "q" param (ie: {{/search?Q=foo}} or 
{{/search?query=foo}}) will be really confused when the query they specify is 
completely ignored w/o explanation and all docs are returned in its place.  I 
think it's much better to throw a clear error "q param is not specified", but i 
certainly see the value in adding q.alt support to the LuceneQParser with the 
same semantics as dismax (used if q is missing or all whitespace) ... not sure 
why we've never considered that before.  (obviously it wouldn't make sense for 
all QParsers, like "field" or "term", since all-whitespace and/or empty strings 
are totally valid input for them)

bq. I could have modified QueryComponent, or just QParser, or just the actual 
QParser subclasses. A universal choice couldn't be made for all qparsers...

we definitely shouldn't modify QueryComponent ... the entire point of the issue 
is that QueryComponent can't attempt to validate the q param, because it 
doesn't know if/when the defType QParser requires it to exist -- the individual 
QParsers all need to throw clear errors if they require it and it's not 
specified, that's really the whole reason this issue was opened in the first 
place

> QParser must validate existance/absense of "q" parameter
> 
>
> Key: SOLR-435
> URL: https://issues.apache.org/jira/browse/SOLR-435
> Project: Solr
>  Issue Type: Bug
>  Components: search
>Affects Versions: 1.3
>Reporter: Ryan McKinley
>Assignee: Ryan McKinley
> Fix For: 3.6, 4.0
>
> Attachments: SOLR-435_q_defaults_to_all-docs.patch
>
>
> Each QParser should check if "q" exists or not.  For some it will be required 
> others not.
> currently it throws a null pointer:
> {code}
> java.lang.NullPointerException
>   at org.apache.solr.common.util.StrUtils.splitSmart(StrUtils.java:36)
>   at 
> org.apache.solr.search.OldLuceneQParser.parse(LuceneQParserPlugin.java:104)
>   at org.apache.solr.search.QParser.getQuery(QParser.java:80)
>   at 
> org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:67)
>   at 
> org.apache.solr.handler.SearchHandler.handleRequestBody(SearchHandler.java:150)
> ...
> {code}
> see:
> http://www.nabble.com/query-parsing-error-to14124285.html#a14140108




[jira] [Commented] (SOLR-3255) OpenExchangeRates.Org Exchange Rate Provider

2012-03-22 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13236260#comment-13236260
 ] 

Hoss Man commented on SOLR-3255:


looks pretty cool, but that listAvailableCurrencies smells kind of fishy in 
general, and with this patch smells even fishier (depending on the arg, it 
either returns a list of string codes, or a list of string code permutations 
with a comma separator?)

If we're seeing now, with multiple Provider impls, that the API doesn't make 
sense -- we should bite the bullet and change it before it's public.

perhaps two methods: listAvailableCurrencies() that returns a Set, and 
listAvailableConversions() that returns a Map?
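A sketch of what that two-method split might look like (only the method names come from the comment above; the generic types, the interface shape, and the demo values are guesses, not Solr's actual ExchangeRateProvider API):

```java
import java.util.Map;
import java.util.Set;

// Hypothetical provider API: one method for the plain currency codes, one for
// the conversions, instead of a single method returning comma-joined strings.
public class RateApiSketch {
    interface ExchangeRateProvider {
        Set<String> listAvailableCurrencies();
        Map<String, Set<String>> listAvailableConversions();
    }

    static class FixedRates implements ExchangeRateProvider {
        public Set<String> listAvailableCurrencies() {
            return Set.of("USD", "EUR", "NOK");
        }
        public Map<String, Set<String>> listAvailableConversions() {
            // each source currency maps to the targets it can convert to
            return Map.of("USD", Set.of("EUR", "NOK"),
                          "EUR", Set.of("USD"));
        }
    }

    public static void main(String[] args) {
        System.out.println(new FixedRates().listAvailableCurrencies().size());
    }
}
```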


> OpenExchangeRates.Org Exchange Rate Provider
> 
>
> Key: SOLR-3255
> URL: https://issues.apache.org/jira/browse/SOLR-3255
> Project: Solr
>  Issue Type: New Feature
>  Components: Schema and Analysis
>Affects Versions: 3.6, 4.0
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>  Labels: CurrencyField
> Fix For: 3.6, 4.0
>
> Attachments: SOLR-3255.patch, SOLR-3255.patch, SOLR-3255.patch
>
>
> An exchange rate provider for CurrencyField using the freely available feed 
> from http://openexchangerates.org/




[jira] [Commented] (SOLR-3127) Dismax to honor the KeywordTokenizerFactory when querying with multi word strings

2012-03-22 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235940#comment-13235940
 ] 

Hoss Man commented on SOLR-3127:


whitespace is a significant meta character to dismax (and for that matter, the 
main lucene QUeryParser as well) ... it indicates the seperation betwen 
optional clauses.

the query parsing structure is independent of the analyzer used, so the fact 
that a KeywordTokenizerFactory is used on the field in question is irrelevant; 
you might have another qf that doesn't use KeywordTokenizerFactory, so even if 
dismax tried to guess that it should treat the entire input as one string, 
it couldn't do that for other fields.

if you want your entire input to be treated as a literal, without treating 
whitespace as a meta-character, it needs to be quoted -- or consider using an 
alternative parser (ie: the "field" QParser is designed for this type of "i 
want to query a single field for a specific value" situation).
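
For illustration, the two workarounds amount to building query strings like 
these -- a sketch only (the field name is taken from the issue's example; the 
helper names are invented):

```java
// Two ways to make dismax/lucene parsers treat "chicken stock" as one
// literal value instead of two optional clauses.
public class LiteralQuerySketch {

    /** Quote the whole input so whitespace is not a clause separator. */
    public static String quoted(String input) {
        return "\"" + input.replace("\"", "\\\"") + "\"";
    }

    /** Use the "field" QParser, which queries one field for one value. */
    public static String fieldQParser(String field, String input) {
        return "{!field f=" + field + "}" + input;
    }

    public static void main(String[] args) {
        System.out.println(quoted("chicken stock"));
        System.out.println(fieldQParser("ingredient_synonyms", "chicken stock"));
    }
}
```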

> Dismax to honor the KeywordTokenizerFactory when querying with multi word 
> strings
> -
>
> Key: SOLR-3127
> URL: https://issues.apache.org/jira/browse/SOLR-3127
> Project: Solr
>  Issue Type: Improvement
>  Components: query parsers
>Affects Versions: 3.5
>Reporter: Zac Smith
>Priority: Minor
>  Labels: dismax
>
> When using the KeywordTokenizerFactory with a multi word search string, the 
> dismax query created is not very useful. Although the query analyzer doesn't 
> tokenize the search input, each word of the input is included in the search.
> e.g. if searching for 'chicken stock' the dismax query created would be:
> +(DisjunctionMaxQuery((ingredient_synonyms:chicken^0.6)~0.01) 
> DisjunctionMaxQuery((ingredient_synonyms:stock^0.6)~0.01)) 
> DisjunctionMaxQuery((ingredient_synonyms:chicken stock^0.6)~0.01)
> Note that although the query analyzer does not tokenize the term 'chicken 
> stock' into 'chicken' and 'stock', they are still included and required in 
> the search term.
> I think the query created should be just:
> DisjunctionMaxQuery((ingredient_synonyms:chicken stock)~0.01)
> (or at least have the individual terms be should-match rather than 
> must-match, so you could configure it with mm.)
> Example field type:
>  positionIncrementGap="100" autoGeneratePhraseQueries="false">
>   
>   
>   
>   
>   
>   
> 




[jira] [Commented] (SOLR-3259) Solr 4 aesthetics

2012-03-20 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13234012#comment-13234012
 ] 

Hoss Man commented on SOLR-3259:


bq. the concept of an "example" server that you must configure yourself has 
become less than ideal... perhaps we should just create a "server" directory 
(but leave things like exampledocs under example)
bq. Would also be nice to remove the "multicore" directory in there (since the 
normal server is already multi-core enabled). Of course if we moved just the 
essential parts to "server", then "multicore", "example-DIH" and "exampledocs" 
would all be left behind in "example", as they should be.

once upon a time, i argued heavily that we should rename "example" to 
"examples" and have many more of them ("minimal", "tech-products-from-the-90s", 
"books", "kitchen-sink", "multicore", "dih", etc...) .. and the only reason i 
never pushed harder for this was because that kind of directory structure would 
have made running the "main" example (whatever we might have called it) much 
harder than "java -jar start.jar", because it would have required specifying 
solr.solr.home.

Thinking about it some more now that you've brought this issue up, it occurs to 
me that in the intervening time, multicore solr setups have become the 
norm, not the exception ... so i'm now much more on board with the idea of 
having a single example setup -- even calling it "server" if people think 
that's a good idea -- provided we move all of the various examples we have into 
that example setup as multiple cores.

they don't have to all be enabled, they don't even have to be commented out in 
solr.xml; it would just be nice if they lived in the same directory and could 
easily be added as new cores with simple CREATE commands and relative paths.

So by default if you did "cd server && java -jar start.jar" you might only get 
one collection (or maybe two if we also want to show off a "minimal" 
collection) but the docs for features like DIH might say "to see an example of 
this, hit the following URL to create a new collection using the DIH configs: 
http://localhost...CREATE";

bq. If anything I'd vote for making the distro closer to what people would want 
in production. You could then have a pure "solr/jetty" folder with ONLY jetty, 
a "solr/example-home" folder which holds today's "example/solr" ... 
start-solr.[cmd|sh], which copies the war from dist to jetty/webapps, sets 
-Dsolr.solr.home and starts Jetty

while i like the idea of creating clearer/cleaner separation between what files 
are "jetty", what files are "solr", and what files are "config", i'm not a fan 
of your specific suggestions here because they move away from the really clean 
simplicity of "cd something && java -jar start.jar" -- which makes it *really* 
trivial for anyone anywhere to try solr out regardless of their OS, or what 
weird eccentricities are in their shell, or whether the file perms on some 
scripts are correct, etc...

if we want to start shipping init.d scripts and whatnot for "production usage" 
(something we've typically avoided in the past because there are so many 
different ways people like to do these things, not to mention that many people 
like to use tomcat or some other servlet container) then that seems like it 
could/should really be distinct from how people run the example for the 
tutorial ... it wouldn't have to be completely orthogonal; we should be able to 
say something like: "if you copy this ./server/ directory to some place on your 
production server, this directory of ./service-tools/ can be used to 
automatically start/stop when your machine comes up or down, just configure the 
path where you copied ./server/" ... but people shouldn't have to use some 
start.sh just to try out the tutorial, especially compared to how easy "java 
-jar start.jar" is today.

> Solr 4 aesthetics
> -
>
> Key: SOLR-3259
> URL: https://issues.apache.org/jira/browse/SOLR-3259
> Project: Solr
>  Issue Type: New Feature
>Reporter: Yonik Seeley
> Fix For: 4.0
>
>
> Solr 4 will be a huge new release... we should take this opportunity to 
> improve the out-of-the-box experience.




[jira] [Commented] (SOLR-3207) Add field name validation

2012-03-15 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230689#comment-13230689
 ] 

Hoss Man commented on SOLR-3207:


the giant elephant in the room that doesn't seem to have been discussed is that 
trying to validate that field names meet some strict criteria when loading 
schema.xml doesn't really address dynamic fields -- the patch ensures that 
<dynamicField/> configurations have names which are validated, but i 
don't see anything that would consider the actual field names people use 
with those dynamic fields -- ie: "*_i" might be a perfectly valid dynamicField 
at startup, but that startup validation isn't going to help me if i index a 
document containing the field name "{{$ - foo_i}}"

In general, i'm opposed to the idea of "locking down" what field names can be 
used across the board.  My preference would be to let people use any field name 
their heart desires, but provide better documentation on which field name 
restrictions exist for which features (ie: "using a field name in a 
function requires that the field name match ValidatorX; using a field name in 
fl requires field names that conform to ValidatorX and ValidatorY; etc...").

If we want to provide automated "validation" of these things for people, then 
let's make it part of the LukeRequestHandler: for any field name returned by 
the LukeRequestHandler, let's include a warnings section advising them which 
validation rules that field name doesn't match, and what features depend on 
that validation rule -- this info could then easily be exposed in the admin UI.

We could also provide an optional UpdateProcessor people could configure with a 
list of individual Validators, which could reject any document containing a 
field that didn't match the validators (or optionally: translate the field name 
to something that did conform) to help people enforce these rules even on 
dynamic fields.
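
A sketch of what such a validator might check, based on the StrParser-derived 
rules quoted in this issue's description (first char: 
Character.isJavaIdentifierStart minus '$'; remaining chars: 
Character.isJavaIdentifierPart plus '.' and '-'). The method names and the 
underscore-substitution "translate" behavior are invented for illustration:

```java
public class FieldNameValidatorSketch {

    /** Returns true iff the name satisfies the StrParser-style rules. */
    public static boolean isLegalFieldName(String name) {
        if (name == null || name.isEmpty()) return false;
        char first = name.charAt(0);
        if (first == '$' || !Character.isJavaIdentifierStart(first)) return false;
        for (int i = 1; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c != '.' && c != '-' && !Character.isJavaIdentifierPart(c)) return false;
        }
        return true;
    }

    /** Hypothetical "translate" option: replace each illegal char with '_'. */
    public static String sanitize(String name) {
        StringBuilder sb = new StringBuilder(name);
        for (int i = 0; i < sb.length(); i++) {
            char c = sb.charAt(i);
            boolean ok = (i == 0)
                ? (c != '$' && Character.isJavaIdentifierStart(c))
                : (c == '.' || c == '-' || Character.isJavaIdentifierPart(c));
            if (!ok) sb.setCharAt(i, '_');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(isLegalFieldName("foo_i"));      // true
        System.out.println(isLegalFieldName("$ - foo_i"));  // false
        System.out.println(sanitize("$ - foo_i"));          // __-_foo_i
    }
}
```

An UpdateProcessor wired to these two methods could either reject or rewrite 
offending dynamic-field names at index time.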

So by default: any field is allowed, but if i create one with a funky name 
(either explicitly or as a result of loading data using a dynamicField) the 
admin UI starts warning me that feature XYZ won't work with fields A, B, C; and 
if i want to ensure feature D works with all of my fields i add an update 
processor to ensure it.


> Add field name validation
> -
>
> Key: SOLR-3207
> URL: https://issues.apache.org/jira/browse/SOLR-3207
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 4.0
>Reporter: Luca Cavanna
> Fix For: 4.0
>
> Attachments: SOLR-3207.patch
>
>
> Given the SOLR-2444 updated fl syntax and the SOLR-2719 regression, it would 
> be useful to add some kind of validation regarding the field names you can 
> use on Solr.
> The objective would be adding consistency, allowing only field names that you 
> can then use within fl, sorting etc.
> The rules, taken from the actual StrParser behaviour, seem to be the 
> following: 
> - same used for java identifiers (Character#isJavaIdentifierPart), plus the 
> use of trailing '.' and '-'
> - for the first character the rule is Character#isJavaIdentifierStart minus 
> '$' (The dash can't be used as first character (SOLR-3191) for example)




[jira] [Commented] (SOLR-3251) dynamically add field to schema

2012-03-15 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230659#comment-13230659
 ] 

Hoss Man commented on SOLR-3251:


a low level implementation detail i would worry about is "snapshotting" the 
schema for the duration of a single request .. i suspect there are more than a 
few places in solr that would generate weird exceptions if multiple calls to 
"req.getSchema().getFields()" returned different things in the middle of 
processing a single request.
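
The concern can be shown with a toy sketch (all names invented): capture the 
schema reference once at the start of the request and use that snapshot 
throughout, so a concurrent schema swap cannot change what the request sees 
mid-flight:

```java
import java.util.List;

public class SchemaSnapshotSketch {

    /** Stand-in for a core whose live schema can be swapped at any time. */
    public static class Core {
        public volatile List<String> schemaFields = List.of("id", "name");
    }

    public static void main(String[] args) {
        Core core = new Core();

        // Snapshot once per request; every later use reads the same object.
        List<String> snapshot = core.schemaFields;

        // A concurrent schema change in the middle of the request...
        core.schemaFields = List.of("id", "name", "price_f");

        // ...does not affect what this request sees.
        System.out.println(snapshot.size());           // 2
        System.out.println(core.schemaFields.size());  // 3
    }
}
```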

> dynamically add field to schema
> ---
>
> Key: SOLR-3251
> URL: https://issues.apache.org/jira/browse/SOLR-3251
> Project: Solr
>  Issue Type: New Feature
>Reporter: Yonik Seeley
> Attachments: SOLR-3251.patch
>
>
> One related piece of functionality needed for SOLR-3250 is the ability to 
> dynamically add a field to the schema.




[jira] [Commented] (SOLR-3232) Admin UI: query form should have a menu to pick a request handler

2012-03-15 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230656#comment-13230656
 ] 

Hoss Man commented on SOLR-3232:


bq. Anyway, I'll let it go, and just roll my eyes at all the hacks and 
duplication that building an entirely Ajax UI using pure JSON responses entails.

I've said it before and i'll say it again: the single most important reason why 
i think the javascript based Admin UI is a great idea is because it *forces* us 
to make sure all of the info needed to build the admin UI is available via HTTP 
using request handlers and whatnot -- ensuring that we think about how other 
clients can programmatically access the same information. the old JSPs and the 
velocity engine generated pages had too much direct access to internals, making 
it too easy to overlook when external clients didn't have access to useful data.

bq. How about SolrInfoMBeanHandler.java adds a simple "searchHandler" attribute 
with true/false?

I think it would be just as easy, and far more generally useful, to add a 
request param to SolrInfoMBeanHandler that would let you filter the objects by 
what classes they are instances of, just like it can filter by "cat" and "key" 
right now (ie: "/admin/mbeans?class=solr.SearchHandler").
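
What such a filter might boil down to -- a sketch only, not the actual 
SolrInfoMBeanHandler code -- is a Class.isInstance pass over the registered 
objects, driven by the hypothetical ?class= param:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MBeanClassFilterSketch {

    /** Keep only registry entries whose value is an instance of clazz. */
    public static Map<String, Object> filterByClass(Map<String, Object> beans,
                                                    Class<?> clazz) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : beans.entrySet()) {
            if (clazz.isInstance(e.getValue())) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Strings and Integers stand in for handler instances here.
        Map<String, Object> beans = new LinkedHashMap<>();
        beans.put("/select", "pretend-SearchHandler");
        beans.put("/admin/ping", 42);
        System.out.println(filterByClass(beans, String.class).keySet()); // [/select]
    }
}
```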

--

As far as this issue in general: i think it's a good idea to add a pulldown to 
make it more friendly to folks and easier to use in the common case, and 
populating the pulldown with all the instances of SearchHandler makes a lot of 
sense, but we should try to use some UI element that allows people to type 
in their own handler name if they want (ie: http://jsfiddle.net/6QeXU/3/ but 
i'm sure there's a cleaner, more efficient way to do it) so we don't annoy 
people who have their own custom RequestHandlers that don't subclass 
SearchHandler, or want to use things like MoreLikeThisHandler, etc...

(Longer term, it would be nice to make querying the AdminHandler return all 
sorts of useful introspection info about what is currently running to drive the 
UI screen generation, with optional config on the handler to override things 
(maybe i don't want to advertise some search handler instance?), along the 
lines of this brainstorming doc i wrote a long, long time ago: 
http://wiki.apache.org/solr/MakeSolrMoreSelfService#Request_Handler_Param_Docs)

> Admin UI: query form should have a menu to pick a request handler
> -
>
> Key: SOLR-3232
> URL: https://issues.apache.org/jira/browse/SOLR-3232
> Project: Solr
>  Issue Type: Improvement
>  Components: web gui
>Reporter: David Smiley
>Assignee: Stefan Matheis (steffkes)
> Fix For: 4.0
>
> Attachments: SOLR-3232.patch
>
>
> The query form in the admin UI could use an improvement regarding how the 
> request handler is chosen; presently all there is is a text input for 'qt'.  
> The first choice to make in the form above the query should really be the 
> request handler since it actually handles the request before any other 
> parameters do anything.  It'd be great if it was a dynamically driven menu 
> defaulting to "/select".  Similar to how the DIH page finds DIH request 
> handlers, this page could find the request handlers with a class of 
> "SearchHandler".  Their names would be added to a list, and if the name 
> didn't start with a '/' then it would be prefixed with '/select?qt='.
> I did something similar (without the menu) to the old 3x UI in a patch to 
> SOLR-3161 which will hopefully get committed.




[jira] [Commented] (SOLR-3251) dynamically add field to schema

2012-03-15 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230627#comment-13230627
 ] 

Hoss Man commented on SOLR-3251:


bq. Any ideas for an external API?

I think the best way to support this externally is using the existing 
mechanisms for plugins...

* a RequestHandler people can register (if they want to support external 
clients programmatically modifying the schema) that accepts ContentStreams 
containing whatever payload structure makes sense given the functionality.
* an UpdateProcessor people can register (if they want to support stuff like 
SOLR-3250, where clients adding documents can submit any field name and a type 
is added based on the type of the value) which could be configured with 
mappings of java types to fieldTypes and rules about other field attributes -- 
ie: "if a client submits a new field=value with a java.lang.Integer value, 
create a new "tint" field with that name and set stored=true".
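
The java-type-to-fieldType mapping idea might look like this in miniature 
(the Integer-to-"tint" mapping mirrors the example above; every other mapping 
and all names are invented):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TypeMappingSketch {

    // Ordered map so more specific classes are checked before catch-alls.
    static final Map<Class<?>, String> MAPPINGS = new LinkedHashMap<>();
    static {
        MAPPINGS.put(Integer.class, "tint");
        MAPPINGS.put(Float.class, "tfloat");
        MAPPINGS.put(java.util.Date.class, "tdate");
        MAPPINGS.put(String.class, "string");
    }

    /** Pick a fieldType name from the first value seen for a new field. */
    public static String guessFieldType(Object value) {
        for (Map.Entry<Class<?>, String> e : MAPPINGS.entrySet()) {
            if (e.getKey().isInstance(value)) return e.getValue();
        }
        return "string"; // fallback when nothing matches
    }

    public static void main(String[] args) {
        System.out.println(guessFieldType(42));      // tint
        System.out.println(guessFieldType("hello")); // string
    }
}
```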

> dynamically add field to schema
> ---
>
> Key: SOLR-3251
> URL: https://issues.apache.org/jira/browse/SOLR-3251
> Project: Solr
>  Issue Type: New Feature
>Reporter: Yonik Seeley
> Attachments: SOLR-3251.patch
>
>
> One related piece of functionality needed for SOLR-3250 is the ability to 
> dynamically add a field to the schema.




[jira] [Commented] (SOLR-3241) Document boost fail if a field copy omit the norms

2012-03-14 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229298#comment-13229298
 ] 

Hoss Man commented on SOLR-3241:


patch looks fine ... i wish there was a way to make it easier for poly fields 
so they wouldn't have to do the check themselves, but when i tried the idea i 
had it didn't work, so better to go with this for now and maybe refactor a 
helper method later.

the few changes i would make:

1) make the new tests grab the IndexSchema object and assert that every field 
(that the test cares about) has the expected omitNorms value -- future proof 
ourselves against someone neutering the test without realizing it by tweaking 
the test schema, because they don't know that there is a specific reason for 
those omitNorms settings

2) add a test that explicitly verifies the failure case of someone setting a 
field boost on a field with omitNorms==true, and assert that we get the 
expected error message (doesn't look like this was added when LUCENE-3796 was 
committed, and we want to make sure we don't inadvertently break that error 
check)



> Document boost fail if a field copy omit the norms
> --
>
> Key: SOLR-3241
> URL: https://issues.apache.org/jira/browse/SOLR-3241
> Project: Solr
>  Issue Type: Bug
>Reporter: Tomás Fernández Löbbe
> Fix For: 3.6, 4.0
>
> Attachments: SOLR-3241.patch, SOLR-3241.patch, SOLR-3241.patch, 
> SOLR-3241.patch
>
>
> After https://issues.apache.org/jira/browse/LUCENE-3796, it is not possible 
> to set a boost to a field that has the "omitNorms" set to true. This is 
> making Solr's document index-time boost to fail when a field that doesn't 
> omit norms is copied (with copyField) to a field that does omit them and 
> document boost is used. For example:
>  omitNorms="false"/>
>  omitNorms="true"/>
> 
> I'm attaching a possible fix.




[jira] [Commented] (SOLR-3241) Document boost fail if a field copy omit the norms

2012-03-13 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228924#comment-13228924
 ] 

Hoss Man commented on SOLR-3241:


bq. The reason the logic was somewhat complicated in DocumentBuilder is 
because, from the lucene indexer its easy to detect this case, but:

sure ... but i think it's not actually just "Document Boost", is it?  if field 
"foo" is declared with omitNorms==false, and a client sends a doc with a field 
"foo" using a fieldBoost, then that should be totally fine -- but if the schema 
says to copyField from foo->bar where bar has omitNorms==true, then i think 
that will currently cause an error from the lucene low level check, correct? (i 
haven't tried it, i'm going based on tomas's patch) likewise if "foo" is a 
LatLonField (or any other polyfield) and the underlying dynamic field used has 
omitNorms==true then won't that same low level lucene code throw an error there?

so multiple error paths from totally sane usage, none of which has anything to 
do with doc boost, right?

(Truth be told, i didn't even notice the "Document boost" part of the summary, 
i was just looking at tomas's patch and skimming the summary)

> Document boost fail if a field copy omit the norms
> --
>
> Key: SOLR-3241
> URL: https://issues.apache.org/jira/browse/SOLR-3241
> Project: Solr
>  Issue Type: Bug
>Reporter: Tomás Fernández Löbbe
> Fix For: 3.6, 4.0
>
> Attachments: SOLR-3241.patch
>
>
> After https://issues.apache.org/jira/browse/LUCENE-3796, it is not possible 
> to set a boost to a field that has the "omitNorms" set to true. This is 
> making Solr's document index-time boost to fail when a field that doesn't 
> omit norms is copied (with copyField) to a field that does omit them and 
> document boost is used. For example:
>  omitNorms="false"/>
>  omitNorms="true"/>
> 
> I'm attaching a possible fix.




[jira] [Commented] (SOLR-3241) Document boost fail if a field copy omit the norms

2012-03-13 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228914#comment-13228914
 ] 

Hoss Man commented on SOLR-3241:


part of me thinks we should just remove the error checking for {{omitNorms && 
boost != 1.0F}} from DocumentBuilder.toDocument (added in LUCENE-3796) and 
just silently ignore any boost on a SolrInputField where omitNorms==true (ie: 
maybe log a warning, but don't throw an Exception).  This would be consistent 
with the behavior in past releases (except for the warning log, if we add 
that), and wouldn't cause any confusing errors for things like LatLonType 
(even if they come from third-party plugins we can't control/test).

On the other hand... that feels really dirty, and it would be nice to fail fast 
and loud if the client tries to set a boost on an omitNorms field

Perhaps a better fix would be to leave DocumentBuilder exactly as it is today, 
and instead change FieldType.createField to (silently) ignore the boost if 
omitNorms==true for that SchemaField.  if i'm thinking about this right, that 
would mean the error checking of the SolrInputDocument (and all its 
SolrInputFields) in DocumentBuilder.toDocument would still work as designed -- 
so you'd get an error if any client or "high level" plugin like an 
UpdateProcessor tried to use a field boost on an omitNorms field; but any 
fields added at a lower level (ie: by copyField or a poly field) would silently 
ignore those boosts if they were copied/cloned to a field where omitNorms==true.
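
The two layers of that proposal can be sketched like this (method names and 
signatures are invented; the real DocumentBuilder/FieldType code differs):

```java
public class BoostHandlingSketch {

    /** High level (DocumentBuilder-style): fail fast on client mistakes. */
    public static void checkClientBoost(boolean omitNorms, float boost) {
        if (omitNorms && boost != 1.0f) {
            throw new IllegalArgumentException(
                "cannot set an index-time boost on an omitNorms field");
        }
    }

    /** Low level (FieldType.createField-style): silently drop the boost. */
    public static float effectiveBoost(boolean omitNorms, float boost) {
        return omitNorms ? 1.0f : boost;
    }

    public static void main(String[] args) {
        // copyField target with omitNorms==true: boost silently ignored.
        System.out.println(effectiveBoost(true, 2.0f));  // 1.0
        // client-supplied boost on an omitNorms field: loud failure.
        try {
            checkClientBoost(true, 2.0f);
        } catch (IllegalArgumentException expected) {
            System.out.println("rejected");
        }
    }
}
```

The point of the split: clients and UpdateProcessors hit the loud check, while 
copyField/polyfield internals only ever go through the silent path.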







> Document boost fail if a field copy omit the norms
> --
>
> Key: SOLR-3241
> URL: https://issues.apache.org/jira/browse/SOLR-3241
> Project: Solr
>  Issue Type: Bug
>Reporter: Tomás Fernández Löbbe
> Fix For: 4.0
>
> Attachments: SOLR-3241.patch
>
>
> After https://issues.apache.org/jira/browse/LUCENE-3796, it is not possible 
> to set a boost to a field that has the "omitNorms" set to true. This is 
> making Solr's document index-time boost to fail when a field that doesn't 
> omit norms is copied (with copyField) to a field that does omit them and 
> document boost is used. For example:
>  omitNorms="false"/>
>  omitNorms="true"/>
> 
> I'm attaching a possible fix.




[jira] [Commented] (SOLR-3218) Range faceting support for CurrencyField

2012-03-13 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228644#comment-13228644
 ] 

Hoss Man commented on SOLR-3218:


bq. I updated CurrencyValue.toString() to return "3.14,USD" for $3.14 rather 
than "314,USD". My feeling is that it's more straight forward to return strings 
that look like the values that were passed in to parse(). 

that sounds right.

the most important thing is that in the response from range faceting, where it 
gives you a (str) lower bound along with a count, that lower bound should be a 
legal value when building a query against that field (ie: you can use it in a 
range query) ... i'm pretty sure (if i understand correctly) that for 
CurrencyField that means "3.14,USD"
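
The toString() convention being agreed on here -- fractional units in, 
"amount,CODE" out -- can be sketched as follows (assuming the value is stored 
in minor units like cents, which is how I read the "314" vs "3.14" 
distinction; the method name is invented):

```java
import java.math.BigDecimal;

public class CurrencyFormatSketch {

    /** Render minor units (e.g. cents) as the parse()-compatible "3.14,USD". */
    public static String format(long minorUnits, int fractionDigits, String code) {
        BigDecimal amount = BigDecimal.valueOf(minorUnits)
                                      .movePointLeft(fractionDigits);
        return amount.toPlainString() + "," + code;
    }

    public static void main(String[] args) {
        System.out.println(format(314, 2, "USD")); // 3.14,USD
        System.out.println(format(500, 0, "JPY")); // 500,JPY
    }
}
```

Emitting the same shape that parse() accepts is what makes the range-facet 
bounds directly usable in follow-up range queries.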

bq. I worry that relaxing the restriction on the gap may just be confusing 
without adding any real value. We may want to consider forcing gap to be the 
same as start and end so that things are more conceptually straight forward.

I believe you -- i've got no objection to locking that down, i just want to 
make sure that if we doc "you can't do this" that: a) the code actually fails 
if you try; and b) we have a test proving that the code will fail if you try.

(and if we decide later that it makes sense, we can relax things and change the 
test & docs)


> Range faceting support for CurrencyField
> 
>
> Key: SOLR-3218
> URL: https://issues.apache.org/jira/browse/SOLR-3218
> Project: Solr
>  Issue Type: Improvement
>  Components: Schema and Analysis
>Reporter: Jan Høydahl
> Fix For: 4.0
>
> Attachments: SOLR-3218-1.patch, SOLR-3218-2.patch, SOLR-3218.patch, 
> SOLR-3218.patch
>
>
> Spinoff from SOLR-2202. Need to add range faceting capabilities for 
> CurrencyField




[jira] [Commented] (SOLR-3159) Upgrade to Jetty 8

2012-03-09 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226349#comment-13226349
 ] 

Hoss Man commented on SOLR-3159:


bq. Wierdly, I don't see an annoucement for v8.1.2

Sorry .. poor wording on my part: the issue is marked fixed in 8.1.2, but the 
jetty Jira system lists 8.1.2 as unreleased (ie: fixed on jetty's 8.1 branch 
for the next release, i guess)

---

Another little glitch i just noticed is that apparently, with the new jetty 
configs, JSP support isn't enabled?  loading http://localhost:8983/solr/ works 
fine, but http://localhost:8983/solr/admin/dataimport.jsp gives a 500 error 
"JSP support not configured" 



> Upgrade to Jetty 8
> --
>
> Key: SOLR-3159
> URL: https://issues.apache.org/jira/browse/SOLR-3159
> Project: Solr
>  Issue Type: Task
>Reporter: Ryan McKinley
>Priority: Minor
> Fix For: 4.0
>
> Attachments: SOLR-3159-maven.patch
>
>
> Solr is currently tested (and bundled) with a patched jetty-6 version.  
> Ideally we can release and test with a standard version.
> Jetty-6 (at codehaus) is just maintenance now.  New development and 
> improvements are now hosted at eclipse.  Assuming performance is equivalent, 
> I think we should switch to Jetty 8.




[jira] [Commented] (SOLR-3159) Upgrade to Jetty 8

2012-03-08 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225675#comment-13225675
 ] 

Hoss Man commented on SOLR-3159:


https://jira.codehaus.org/browse/JETTY-1489 - apparently fixed in 8.1.2 ?



> Upgrade to Jetty 8
> --
>
> Key: SOLR-3159
> URL: https://issues.apache.org/jira/browse/SOLR-3159
> Project: Solr
>  Issue Type: Task
>Reporter: Ryan McKinley
>Priority: Minor
> Fix For: 4.0
>
> Attachments: SOLR-3159-maven.patch
>
>
> Solr is currently tested (and bundled) with a patched jetty-6 version.  
> Ideally we can release and test with a standard version.
> Jetty-6 (at codehaus) is just maintenance now.  New development and 
> improvements are now hosted at eclipse.  Assuming performance is equivalent, 
> I think we should switch to Jetty 8.




[jira] [Commented] (SOLR-3159) Upgrade to Jetty 8

2012-03-08 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225674#comment-13225674
 ] 

Hoss Man commented on SOLR-3159:


FWIW: on trunk, using an svn checkout, and running "java -jar start.jar", i'm 
getting the following error in the jetty logging after solr starts up...

{noformat}
2012-03-08 15:16:09.382:WARN:oejw.WebAppContext:Failed startup of context 
o.e.j.w.WebAppContext{/.svn,file:/home/hossman/lucene/dev/solr/example/webapps/.svn/},/home/hossman/lucene/dev/solr/example/webapps/.svn
java.lang.StringIndexOutOfBoundsException: String index out of range: 0
at java.lang.String.charAt(String.java:686)
at org.eclipse.jetty.util.log.StdErrLog.condensePackageString(StdErrLog.java:210)
at org.eclipse.jetty.util.log.StdErrLog.<init>(StdErrLog.java:105)
at org.eclipse.jetty.util.log.StdErrLog.<init>(StdErrLog.java:97)
at org.eclipse.jetty.util.log.StdErrLog.newLogger(StdErrLog.java:569)
at org.eclipse.jetty.util.log.AbstractLogger.getLogger(AbstractLogger.java:21)
at org.eclipse.jetty.util.log.Log.getLogger(Log.java:438)
at org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:677)
at org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:454)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
at org.eclipse.jetty.deploy.bindings.StandardStarter.processBinding(StandardStarter.java:36)
at org.eclipse.jetty.deploy.AppLifeCycle.runBindings(AppLifeCycle.java:183)
at org.eclipse.jetty.deploy.DeploymentManager.requestAppGoal(DeploymentManager.java:491)
at org.eclipse.jetty.deploy.DeploymentManager.addApp(DeploymentManager.java:138)
at org.eclipse.jetty.deploy.providers.ScanningAppProvider.fileAdded(ScanningAppProvider.java:142)
at org.eclipse.jetty.deploy.providers.ScanningAppProvider$1.fileAdded(ScanningAppProvider.java:53)
at org.eclipse.jetty.util.Scanner.reportAddition(Scanner.java:604)
at org.eclipse.jetty.util.Scanner.reportDifferences(Scanner.java:535)
at org.eclipse.jetty.util.Scanner.scan(Scanner.java:398)
at org.eclipse.jetty.util.Scanner.doStart(Scanner.java:332)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
at org.eclipse.jetty.deploy.providers.ScanningAppProvider.doStart(ScanningAppProvider.java:118)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
at org.eclipse.jetty.deploy.DeploymentManager.startAppProvider(DeploymentManager.java:552)
at org.eclipse.jetty.deploy.DeploymentManager.doStart(DeploymentManager.java:227)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
at org.eclipse.jetty.util.component.AggregateLifeCycle.doStart(AggregateLifeCycle.java:58)
at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:53)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.doStart(HandlerWrapper.java:91)
at org.eclipse.jetty.server.Server.doStart(Server.java:263)
at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
at 
org.eclipse.jetty.xml.XmlConfiguration$1.run(XmlConfiguration.java:1215)
at java.security.AccessController.doPrivileged(Native Method)
at 
org.eclipse.jetty.xml.XmlConfiguration.main(XmlConfiguration.java:1138)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.eclipse.jetty.start.Main.invokeMain(Main.java:457)
at org.eclipse.jetty.start.Main.start(Main.java:602)
at org.eclipse.jetty.start.Main.main(Main.java:82)
{noformat}

...Solr is functioning just fine, but it seems like something has changed 
subtly in either how Jetty handles the webapps dir, or how we have it 
configured to handle the webapps dir, such that it is trying to load .svn as a 
webapp.
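For what it's worth, the stack trace points at StdErrLog.condensePackageString choking on the logger name derived from the ".svn" context. Below is a minimal sketch (illustrative only, not Jetty's actual code) of how a leading-dot name produces an empty segment and the reported exception:

```java
public class CondenseDemo {
    // Condense a dotted name by taking the first letter of each leading segment,
    // similar in spirit to Jetty's StdErrLog.condensePackageString.
    static String condense(String name) {
        String[] parts = name.split("\\.");
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < parts.length - 1; i++) {
            // An empty segment (e.g. from a leading '.') throws
            // StringIndexOutOfBoundsException here, matching the trace above.
            sb.append(parts[i].charAt(0));
        }
        return sb.append(parts[parts.length - 1]).toString();
    }

    public static void main(String[] args) {
        System.out.println(condense("org.eclipse.jetty.util.log.Log")); // oejulLog
        try {
            condense(".svn"); // ".svn".split("\\.") -> ["", "svn"]
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("failed on .svn");
        }
    }
}
```

This is only a guess at the failure mode based on the trace; the actual Jetty logic may differ.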



> Upgrade to Jetty 8
> --
>
> Key: SOLR-3159
> URL: https://issues.apache.org/jira/browse/SOLR-3159
> Project: Solr
>  Issue Type: Task
>Reporter: Ryan McKinley
>Priority: Minor
> Fix For: 4.0
>
> Attachments: SOLR-3159-maven.patch
>
>
> Solr is currently tested (and bundled) with a patched jetty-6 version.  
> Ideally we can release and test with a standard version.
> Jetty-6 (at codehaus) is just maintenance now.  New development and 
> improvements are now hosted at eclipse.  Assuming per

[jira] [Commented] (SOLR-3218) Range faceting support for CurrencyField

2012-03-08 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225639#comment-13225639
 ] 

Hoss Man commented on SOLR-3218:


bq. I believe start/end currency equality is enforced by MoneyType.compareTo 
which will throw an exception when end is compared to the first (start+gap).

Ah, OK. And then ultimately start+gap is compared to end (even if hardend is 
false), so you'll get an exception then. OK, fair enough.

bq. As far as enforcing currency equality being a good idea or not, it would 
make sense and I would prefer if start/end/gap currencies didn't need to be 
equal. This patch doesn't allow for that given the tradeoff of the utility of 
being able to use different currencies versus the annoyance of keeping a handle 
open to an ExchangeRateProvider in the places we'd need it.

I'm not completely understanding, but I don't need to: if it's easier/simpler 
(for now) to require that start/end/gap are all in the same currency, that's 
fine -- we should just test/document that clearly ... we can always relax that 
restriction later if you think of a clean/easy way to do it.

Like I said before: it's probably silly to do it anyway, I just didn't 
understand if/what the complication was.

> Range faceting support for CurrencyField
> 
>
> Key: SOLR-3218
> URL: https://issues.apache.org/jira/browse/SOLR-3218
> Project: Solr
>  Issue Type: Improvement
>  Components: Schema and Analysis
>Reporter: Jan Høydahl
> Fix For: 4.0
>
> Attachments: SOLR-3218-1.patch, SOLR-3218-2.patch
>
>
> Spinoff from SOLR-2202. Need to add range faceting capabilities for 
> CurrencyField




[jira] [Commented] (SOLR-2202) Money FieldType

2012-03-08 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225493#comment-13225493
 ] 

Hoss Man commented on SOLR-2202:


a) CurrencyField (and by extension "CurrencyValue") gets my vote

b) I really only reviewed the facet stuff in SOLR-2202-solr-10.patch (I know 
Jan has already been reviewing the more core stuff about the type) ... it makes 
me realize that we really need to refactor the range faceting code to be easier 
to do in custom FieldTypes, but that's certainly no fault of this issue and can 
be done later.

The facet code itself looks correct, but my one concern is that (if I'm 
understanding all of this MoneyValue conversion stuff correctly) it _should_ be 
possible to facet with start/end/gap values specified in any currency, as long 
as they are all consistent -- but there is no test of this situation.  The 
negative test only looks at using an inconsistent gap, and the positive tests 
only use USD, or the "default" which is also USD.  We should have at least one 
test that uses something like EUR for start/end/gap and verifies that the 
counts are correct given the conversion rates used in the test.

Incidentally: I don't see anything actually enforcing that start/end are in the 
same currency -- just that gap is in the same currency as the values it's being 
added to, so essentially that start and gap use the same currency.  But I'm 
actually not at all clear on why there is any attempt to enforce that the 
currencies used are the same, since the whole point of the type (as I 
understand it) is that you can do conversions on the fly -- it may seem silly 
for someone to say {{facet.range.start=0,USD & facet.range.gap=200,EUR & 
facet.range.end=1000,YEN}} but is there any technical reason why we can't let 
them do that?
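The technical feasibility here is just arithmetic: convert each bound to a common currency before bucketing. A hedged sketch with made-up rates (illustrative names, not the patch's actual API; the "YEN" in the example would be ISO code JPY):

```java
import java.util.Map;

public class MixedCurrencyRange {
    // Made-up exchange rates for illustration only.
    static final Map<String, Double> TO_USD = Map.of("USD", 1.0, "EUR", 1.25, "JPY", 0.012);

    static double toUsd(double amount, String code) {
        return amount * TO_USD.get(code);
    }

    public static void main(String[] args) {
        // facet.range.start=0,USD & facet.range.gap=200,EUR & facet.range.end=1000,JPY
        double start = toUsd(0, "USD");   // 0.0
        double gap = toUsd(200, "EUR");   // 250.0
        double end = toUsd(1000, "JPY");
        // Mixing currencies is well-defined once everything is in one unit,
        // even if it rarely makes practical sense for a user to do.
        System.out.println(start + " + " + gap + " ... " + end);
    }
}
```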

> Money FieldType
> ---
>
> Key: SOLR-2202
> URL: https://issues.apache.org/jira/browse/SOLR-2202
> Project: Solr
>  Issue Type: New Feature
>  Components: Schema and Analysis
>Affects Versions: 1.5
>Reporter: Greg Fodor
>Assignee: Jan Høydahl
> Fix For: 3.6, 4.0
>
> Attachments: SOLR-2022-solr-3.patch, SOLR-2202-lucene-1.patch, 
> SOLR-2202-solr-1.patch, SOLR-2202-solr-10.patch, SOLR-2202-solr-2.patch, 
> SOLR-2202-solr-4.patch, SOLR-2202-solr-5.patch, SOLR-2202-solr-6.patch, 
> SOLR-2202-solr-7.patch, SOLR-2202-solr-8.patch, SOLR-2202-solr-9.patch, 
> SOLR-2202.patch, SOLR-2202.patch, SOLR-2202.patch, SOLR-2202.patch
>
>
> Provides support for monetary values to Solr/Lucene with query-time currency 
> conversion. The following features are supported:
> - Point queries
> - Range queries
> - Sorting
> - Currency parsing by either currency code or symbol.
> - Symmetric & Asymmetric exchange rates. (Asymmetric exchange rates are 
> useful if there are fees associated with exchanging the currency.)
> At indexing time, money fields can be indexed in a native currency. For 
> example, if a product on an e-commerce site is listed in Euros, indexing the 
> price field as "1000,EUR" will index it appropriately. By altering the 
> currency.xml file, the sorting and querying against Solr can take into 
> account fluctuations in currency exchange rates without having to re-index 
> the documents.
> The new "money" field type is a polyfield which indexes two fields, one which 
> contains the amount of the value and another which contains the currency code 
> or symbol. The currency metadata (names, symbols, codes, and exchange rates) 
> are expected to be in an xml file which is pointed to by the field type 
> declaration in the schema.xml.
> The current patch is factored such that Money utility functions and 
> configuration metadata lie in Lucene (see MoneyUtil and CurrencyConfig), 
> while the MoneyType and MoneyValueSource lie in Solr. This was meant to 
> mirror the work being done on the spacial field types.
> This patch will be getting used to power the international search 
> capabilities of the search engine at Etsy.
> Also see WIKI page: http://wiki.apache.org/solr/MoneyFieldType




[jira] [Commented] (SOLR-3210) 3.6 POST RELEASE TASK: update site tutorial.html to link to versioned tutorial

2012-03-06 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223882#comment-13223882
 ] 

Hoss Man commented on SOLR-3210:


Suggested contents for http://lucene.apache.org/solr/tutorial.html ...

{noformat}
A copy of the tutorial for each version of Solr is included 
in the documentation for that release.

Copies of the tutorial for the most recent release of each 
major branch can also be found online:

- 3.6
- 4.x trunk (unreleased, for developers only):
  https://builds.apache.org/job/Solr-trunk/javadoc/doc-files/tutorial.html
{noformat}

...and maybe at some point we can do a YouTube embed of a walkthrough of the 
tutorial on that page as well.

> 3.6 POST RELEASE TASK: update site tutorial.html to link to versioned tutorial
> --
>
> Key: SOLR-3210
> URL: https://issues.apache.org/jira/browse/SOLR-3210
> Project: Solr
>  Issue Type: Improvement
>Reporter: Hoss Man
>Assignee: Hoss Man
> Fix For: 3.6
>
>
> Unless we have an alternate strategy in place for dealing with versioned docs 
> by the time 3.6 is released, then as a post-release task, once the 3.6 
> javadocs are snapshoted online (ie: http://lucene.apache.org/solr/api/) the 
> current "online" copy of the tutorial 
> (http://lucene.apache.org/solr/tutorial.html) should be pruned down so that 
> it is just a link to the snapshot version released with 3.6




[jira] [Commented] (LUCENE-3854) Non-tokenized fields become tokenized when a document is deleted and added back

2012-03-06 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223402#comment-13223402
 ] 

Hoss Man commented on LUCENE-3854:
--

I tried arguing a long time ago that IndexReader.document(...) should return 
"Map" since none of the Document/Field object metadata makes 
sense at "read" time ... never got any buy-in from anybody else.

> Non-tokenized fields become tokenized when a document is deleted and added 
> back
> ---
>
> Key: LUCENE-3854
> URL: https://issues.apache.org/jira/browse/LUCENE-3854
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 4.0
>Reporter: Benson Margulies
>
> https://github.com/bimargulies/lucene-4-update-case is a JUnit test case that 
> seems to show a problem with the current trunk. It creates a document with a 
> Field typed as StringField.TYPE_STORED and a value with a "-" in it. A 
> TermQuery can find the value, initially, since the field is not tokenized.
> Then, the case reads the Document back out through a reader. In the copy of 
> the Document that gets read out, the Field now has the tokenized bit turned 
> on. 
> Next, the case deletes and adds the Document. The 'tokenized' bit is 
> respected, so now the field gets tokenized, and the result is that the query 
> on the term with the - in it no longer works.
> So I think that the defect here is in the code that reconstructs the Document 
> when read from the index, and which turns on the tokenized bit.
> I have an ICLA on file so you can take this code from github, but if you 
> prefer I can also attach it here.




[jira] [Commented] (SOLR-3140) Make omitNorms default for all numeric field types

2012-02-29 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13219389#comment-13219389
 ] 

Hoss Man commented on SOLR-3140:


bq. Is there a better place to set this default than in init() in the new base 
class?

probably not

bq. Should StrField or other fields also have omitNorms as default?

I don't think so? If you search on a multivalued string field like "keywords" 
or "tags", it's reasonable to want length normalization to be a factor to 
prevent keyword stuffing.

> Make omitNorms default for all numeric field types
> --
>
> Key: SOLR-3140
> URL: https://issues.apache.org/jira/browse/SOLR-3140
> Project: Solr
>  Issue Type: Improvement
>  Components: Schema and Analysis
>Reporter: Jan Høydahl
>  Labels: omitNorms
> Fix For: 4.0
>
> Attachments: SOLR-3140.patch
>
>
> Today norms are enabled for all Solr field types by default, while in Lucene 
> norms are omitted for the numeric types.
> Propose to make the Solr defaults the same as in Lucene, so that if someone 
> occasionally wants index-side boost for a numeric field type they must say 
> omitNorms="false". This lets us simplify the example schema too.




[jira] [Commented] (SOLR-3175) simplify & add test to ensure various query "escape" functions are in sync

2012-02-28 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13218892#comment-13218892
 ] 

Hoss Man commented on SOLR-3175:


Suggested approach:

* replace the  {{if (c == 'x' || c == 'y' || ... )}} meme with a Set 
lookup
* make the Set used in each case public static final
* add a unit test that asserts the sets are equivalent when they are supposed 
to be equivalent, or supersets when they are supposed to be supersets.
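A sketch of what such a test could look like (hypothetical character sets and helper, not the actual Solr code), asserting the "negative subset" relationship between the canonical escape set and the partial one:

```java
import java.util.Set;

public class EscapeSetsTest {
    // Hypothetical character sets; the real ones would be the public static
    // final Sets exposed by QueryParser.escape, ClientUtils.escapeQueryChars,
    // and SolrPluginUtils.partialEscape after the refactor.
    public static final Set<Character> FULL_ESCAPE =
        Set.of('\\', '+', '-', '"', '!', ':', '*', '?');
    public static final Set<Character> PARTIAL_ESCAPE =
        Set.of('\\', '!', ':', '*', '?'); // all chars except +, -, "

    // Escape every character found in the given special-character set.
    static String escape(String s, Set<Character> special) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (special.contains(c)) sb.append('\\');
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The subset relationship the unit test would assert:
        assert FULL_ESCAPE.containsAll(PARTIAL_ESCAPE);
        System.out.println(escape("a+b:c", FULL_ESCAPE)); // a\+b\:c
    }
}
```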

> simplify & add test to ensure various query "escape" functions are in sync
> --
>
> Key: SOLR-3175
> URL: https://issues.apache.org/jira/browse/SOLR-3175
> Project: Solr
>  Issue Type: Improvement
>Reporter: Hoss Man
>
> We have three query-syntax escape-related functions (that I know of) that 
> can't be refactored...
> * QueryParser.escape
> ** canonical
> * ClientUtils.escapeQueryChars
> ** part of solrj, doesn't depend directly on QueryParser so that Solr clients 
> don't need the query parser jar locally
> * SolrPluginUtils.partialEscape
> ** designed to be a negative subset of the full set (ie: all chars except 
> +/-/")
> ...we should figure out a way to assert in our tests that these are all in 
> agreement (or at least as much as they are meant to be) 




[jira] [Commented] (SOLR-2368) Improve extended dismax (edismax) parser

2012-02-28 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13218845#comment-13218845
 ] 

Hoss Man commented on SOLR-2368:


bq. Are you saying it would be possible to define something like this in 
solrconfig.xml

...something like that would certainly be possible if we changed the QParsers 
to start doing interesting things with their init params (presumably defaults 
there would be the lowest possible level defaults, overridden by things like 
request handler defaults/invariants/appends ... and it would really make sense 
to allow invariants/appends in the qparser init).

but that would only really help with the backcompat "locked down" dismax 
situation I'm concerned with if we made sure all of those init params were also 
used in the implicitly created instance of "dismax".

What I had in mind was actually far simpler...

* "dismax" is implicitly an instance of DismaxQParserPlugin (unless overridden 
in solrconfig.xml) .. just like today
* "edismax" is implicitly an instance of ExtendedDismaxQParserPlugin (unless 
overridden in solrconfig.xml) ... just like today
* ExtendedDismaxQParserPlugin works exactly as it does today but instead of all 
the hardcoded default param values sprinkled around ExtendedDismaxQParser we 
put them all in a static Map or SolrParams instance and add a constructor arg 
to override those defaults
* DismaxQParserPlugin gets changed to look something like...
{code}
final static SolrParams REALLY_LIMITED_DEFAULTS = new SolrParams("uf", "-*", 
...);
public QParser createParser(String qstr, SolrParams localParams, SolrParams 
params, SolrQueryRequest req) {
  return new ExtendedDisMaxQParser(qstr, localParams, params, req, 
REALLY_LIMITED_DEFAULTS);
}
{code}
...using that new ExtendedDisMaxQParser constructor arg to override the 
defaults
* DisMaxQParser.java gets svn removed because it's no longer needed.

...all of which could then be enhanced with init based overrides of those 
defaults like you suggested.
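In plain-Java terms the proposal above is just a layered defaults map: the parser's hardcoded values move into one shared map, and a plugin can pass overrides. The names and default values below are illustrative stand-ins, not actual Solr classes or params:

```java
import java.util.HashMap;
import java.util.Map;

public class ParserDefaults {
    // Stand-in for the static defaults the hardcoded edismax values would move into.
    static final Map<String, String> EDISMAX_DEFAULTS =
        Map.of("uf", "*", "lowercaseOperators", "true");

    // Effective params: per-plugin overrides win over the shared defaults.
    static Map<String, String> effective(Map<String, String> overrides) {
        Map<String, String> merged = new HashMap<>(EDISMAX_DEFAULTS);
        merged.putAll(overrides);
        return merged;
    }

    public static void main(String[] args) {
        // The "dismax"-compat plugin would pass its locked-down overrides, e.g. uf=-*.
        Map<String, String> dismaxView = effective(Map.of("uf", "-*"));
        System.out.println(dismaxView.get("uf"));                 // -*
        System.out.println(dismaxView.get("lowercaseOperators")); // true
    }
}
```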

> Improve extended dismax (edismax) parser
> 
>
> Key: SOLR-2368
> URL: https://issues.apache.org/jira/browse/SOLR-2368
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Reporter: Yonik Seeley
>  Labels: QueryParser
>
> This is a "mother" issue to track further improvements for eDismax parser.
> The goal is to be able to deprecate and remove the old dismax once edismax 
> satisfies all usecases of dismax.




[jira] [Commented] (SOLR-3157) custom logging

2012-02-28 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13218816#comment-13218816
 ] 

Hoss Man commented on SOLR-3157:


{quote}
IMO, it's actually higher importance that we have better logging for ourselves 
so we can more easily debug our tests.
{quote}

Agreed ... but we shouldn't break the log format of the most important line 
Solr logs if we can avoid it (particularly since the format was chosen 
specifically to be easy to parse).

I also don't understand why you think those changes to SolrCore.execute's 
"info" log make anything better? Your change _removed_ really useful info and 
made the messages harder to parse because it removed the consistent key=val 
pattern for the path & param vals.

{quote}
Maybe there's a way I can log different things when using the test formatter 
and restore the production log format to it's former glory.
{quote}

I don't understand why the structure of the _string_ used in this one info 
message has anything to do with your goal of better test logging using the 
SolrLogFormatter. Why can't that _string_ format stay exactly identical 
regardless of whether it's in a test (and the _record_ gets formatted with all 
of your new thread and metadata goodness) or not?

As things stand now (with your latest commit #1294911) these messages are still 
broken compared to 3x because they don't include the SolrCore name (the "logid" 
used to init the StringBuilder before you changed it) ... that seems pretty 
damn important (although I realize now I somehow didn't mention that in my 
earlier comment).

{noformat}
3x example...

Feb 28, 2012 5:38:33 PM org.apache.solr.core.SolrCore execute
INFO: [core0] webapp=/solr path=/select/ params={q=*:*&sort=score+desc} hits=0 
status=0 QTime=11 

trunk example...

Feb 28, 2012 5:39:02 PM org.apache.solr.core.SolrCore execute
INFO: webapp=/solr path=/select/ params={q=*:*&sort=score+desc} hits=0 status=0 
QTime=40 
{noformat}
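To illustrate why the consistent key=val convention matters, here is a trivial parser for the 3.x-style line (my own sketch, not the SOLR-267 code): dropping the "webapp=" and "path=" keys, or the "[core0]" logid, breaks exactly this kind of downstream tooling.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LogLineParser {
    // Matches each whitespace-delimited key=val token in the log line.
    static final Pattern KV = Pattern.compile("(\\w+)=(\\S+)");

    static Map<String, String> parse(String line) {
        Map<String, String> out = new HashMap<>();
        Matcher m = KV.matcher(line);
        while (m.find()) out.put(m.group(1), m.group(2));
        return out;
    }

    public static void main(String[] args) {
        String line = "INFO: [core0] webapp=/solr path=/select/ "
            + "params={q=*:*&sort=score+desc} hits=0 status=0 QTime=11";
        Map<String, String> kv = parse(line);
        System.out.println(kv.get("path"));  // /select/
        System.out.println(kv.get("QTime")); // 11
    }
}
```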

> custom logging
> --
>
> Key: SOLR-3157
> URL: https://issues.apache.org/jira/browse/SOLR-3157
> Project: Solr
>  Issue Type: Test
>Reporter: Yonik Seeley
>Priority: Blocker
> Attachments: SOLR-3157.patch, jetty_threadgroup.patch
>
>
> We need custom logging to decipher tests with multiple core containers, 
> cores, in a single JVM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3157) custom logging

2012-02-28 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13218398#comment-13218398
 ] 

Hoss Man commented on SOLR-3157:


Yonik: your changes in r1294212 break the SolrCore.execute log format 
conventions we've had in place since SOLR-267, which breaks some log-processing 
code I have (and since the whole point of SOLR-267 was to make it easy for 
people to write log parsers, I'm guessing I'm not the only one).

Notably:
 * you changed the "path" and "params" key=val pairs so they no longer include 
the key, just the val -- this doesn't really make sense to me since the whole 
point of those log lines is that they are supposed to be consistent and easy to 
parse.
 * you removed the webapp key=val pair completely (the comment says "multiple 
webaps are no longer best practise", but that doesn't mean people don't use 
them or that we should just suddenly stop logging them).

> custom logging
> --
>
> Key: SOLR-3157
> URL: https://issues.apache.org/jira/browse/SOLR-3157
> Project: Solr
>  Issue Type: Test
>Reporter: Yonik Seeley
>Priority: Blocker
> Attachments: SOLR-3157.patch, jetty_threadgroup.patch
>
>
> We need custom logging to decipher tests with multiple core containers, 
> cores, in a single JVM.




[jira] [Commented] (SOLR-3156) Check for locks on startup

2012-02-27 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217934#comment-13217934
 ] 

Hoss Man commented on SOLR-3156:


Luca: the change to SolrCore looks good to me ... the one thing I might suggest 
is adding an ERROR log just before you throw the exception (I'm on the "log 
early" team).

The test looks awesome, but *PLEASE* trim those solrconfig files down so that 
they only contain the 5-6 minimum lines they need in order for the test to be 
meaningful ... we have far too many big bloated test configs already; the goal 
is to stop adding new ones and make sure each test config has a specific and 
easy-to-see purpose.

> Check for locks on startup
> --
>
> Key: SOLR-3156
> URL: https://issues.apache.org/jira/browse/SOLR-3156
> Project: Solr
>  Issue Type: Improvement
>Reporter: Luca Cavanna
> Attachments: SOLR-3156.patch
>
>
> When using simple or native lockType and the application server is not 
> shutdown properly (kill -9), you don't notice problems until someone tries to 
> add or delete a document. In fact, you get errors every time Solr opens a new 
> IndexWriter on the "locked" index. I'm aware of the unlockOnStartup option, 
> but I'd prefer to know and act properly when there's a lock, and I think it 
> would be better to know on startup, since Solr is not going to work properly.




[jira] [Commented] (SOLR-3141) Deprecate OPTIMIZE command in Solr

2012-02-27 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217779#comment-13217779
 ] 

Hoss Man commented on SOLR-3141:


I don't have the energy to really get in depth with all of the discussion 
that's taken place so far, so I'll try to keep my comments brief:

0) I'm a fan of the patch currently attached.

1) I largely agree with most of Yonik's points -- this is a documentation 
problem first and foremost.  Saying that all people who optimize are wrong is 
ridiculous, and breaking something that has use and value for one set of people 
just because another set of people are using it foolishly seems really absurd.

2) Changing the "optimize" command to be a no-op with a warning logged, or a 
failure, where the documented "fix" to regain old behavior for people who 
genuinely need it is to search & replace the string "optimize" with some new 
string "forceMerge", seems utterly absurd to me.  This is not the first time 
we've had a param name that people later regretted giving the name that we did 
-- are we going to change _all_ of them for 4.0?  Unlike a method renamed in 
Java code, where it's easy to see how the change affects you because of 
compilation failures, this kind of HTTP param change is a serious pain in the 
ass for people with client apps written using multiple languages/libraries ... 
naming consistency for existing users seems far more important than having 
perfect names.

3) Even if the goal is to force people to evaluate whether they really want to 
merge down to one segment, we have to consider how hard we make things for 
people when the answer is "yes".  If someone is using a client library/app to 
talk to Solr it may not be easy/simple/possible for them to replace "optimize" 
with "forceMerge" or something like it w/o mucking in the internals of that 
library -- there's no reason to piss off users like that.

4) any discussion about renaming/removing "optimize" in the Solr HTTP APIs 
should really consider how that will impact a few other user visible things...

* {{}} hooks in solrconfig and the 
corresponding SolrEventListener.postOptimize method
* SolrDeletionPolicy has options related to how many optimized indexes to keep
* spellchecker has options relating to building on optimize (although if I 
remember correctly there is a bug about this being broken, so it can probably 
die, no problem)

5) Assuming that too many people optimize when they shouldn't, either out of 
ignorance or because their tools do it out of ignorance, and we want to help 
minimize that moving forward; and given my opinion that renaming "optimize" 
will only hurt people w/o actually helping the root problem -- here's my straw 
man proposal to try and improve the situation (similar to what Jan suggested, 
but taking into account that we already support a "maxSegments" option when 
doing optimize commands) ...

* commit the attached patch as is (it's just plain a good idea, regardless of 
anything else we might do)
* change CommitUpdateCommand.maxOptimizeSegments so it defaults to "-1" and 
document that when the value is less than 0 it means the UpdateHandler 
configuration determines the value.
* add a new {{}} config option to 
{{}} - make the UpdateHandler use that value anytime 
CommitUpdateCommand.maxOptimizeSegments is less than 0, and for backcompat have 
it default to "1" if not specified.
* update the example configs to include 
{{999}} with a comment 
warning against the evils of over-optimization
* change the code in Solr which deals with {{}} formatted 
instructions so that any SolrParams in the request with names the same as xml 
attributes override the attributes -- ie: {{POST /update?maxSegments=4 with 
data: }} should result in a CommitUpdateCommand 
with maxOptimizeSegments=4

The end result being:
* new users who start with new configs have an UpdateHandler that is going 
effectively ignore "optimize" commands that don't specify a "maxSegments"
* nothing breaks for existing users
* existing users who only want to allow optimize commands when "maxSegments" is 
specified can cut/paste that oneline {{}} config
* new and existing users who want Solr to ignore all optimize commands, even 
when they do have a "maxSegments", can configure an invariant 
maxSegments=999 param on the affected request handlers
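As a sketch, the precedence being proposed here (request param over xml attribute, with negative values deferring to the configured default) could look like the following; the class and method names are illustrative assumptions, not actual Solr code:

```java
// Illustrative sketch of the proposed precedence rules -- not actual Solr code.
public class MaxSegmentsResolution {
    /**
     * paramValue: "maxSegments" from the request's SolrParams (null if absent)
     * attrValue:  the maxSegments attribute on the optimize instruction (null if absent)
     * configured: the value from the (hypothetical) updateHandler config option
     */
    static int effectiveMaxSegments(Integer paramValue, Integer attrValue, int configured) {
        // SolrParams override xml attributes; absent everywhere means -1
        int requested = (paramValue != null) ? paramValue
                      : (attrValue != null) ? attrValue
                      : -1;
        // anything less than 0 defers to the UpdateHandler configuration
        return (requested < 0) ? configured : requested;
    }

    public static void main(String[] args) {
        System.out.println(effectiveMaxSegments(4, 1, 999));       // explicit param wins: 4
        System.out.println(effectiveMaxSegments(null, null, 999)); // bare optimize: 999
    }
}
```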




> Deprecate OPTIMIZE command in Solr
> --
>
> Key: SOLR-3141
> URL: https://issues.apache.org/jira/browse/SOLR-3141
> Project: Solr
>  Issue Type: Improvement
>  Components: update
>Affects Versions: 3.5
>Reporter: Jan Høydahl
>  Labels: force, optimize
> Fix For: 3.6
>
> Attachments: SOLR-3141.patch, SOLR-3141.patch
>
>
> Background: LUCENE-3454 renames optimize() as forceMerge(). Please read that 
> issue first.
> Now th

[jira] [Commented] (SOLR-3161) Use of 'qt' should be restricted to searching and should not start with a '/'

2012-02-27 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217382#comment-13217382
 ] 

Hoss Man commented on SOLR-3161:


-0

1) there are plenty of people who are happily using "qt" to dynamically pick 
their request handler who don't care about securing their solr instances -- we 
shouldn't break things for them if we can avoid it.

2) assuming qt should be allowed only if it is an instance of 
solr.SearchHandler seems narrow minded to me -- it puts a totally arbitrary 
limitation on the ability for people to have their own request handlers that 
are treated as "first class citizens" and seems just as likely to lead to 
surprise and frustration as it is to appreciation for the "safety" of the 
feature (not to mention it precludes perfectly safe "query" type handlers like 
MLTHandler and AnalysisRequestHandler)


if the root goal is "make solr safer for people who don't want/expect qt-based 
requests" then unless i'm overlooking something it seems like there is a far 
simpler and more straightforward solution...

a) change the example solrconfig to use handleSelect="false"
b) remove the (long ago deprecated) SolrServlet

if handleSelect == false, then the request dispatcher won't look at "/select" 
requests at all (unless someone has a handler named "/select") and won't do any 
dispatching based on the "qt" param.  currently if that's false the logic falls 
through to the SolrServlet, but if that's been removed then the request will 
just fail.

So new users who copy the example will have only path based request handlers by 
default, and will have to go out of their way to set handleSelect=true to get 
qt based dispatching.

Bonus points: someone can write a DispatchingRequestHandler that can optionally 
be configured with some name (such as "/select") and does nothing but look for 
a "qt" param and forward to the handler with that name -- but it can have 
configuration options indicating which names are permitted (and any other names 
would be rejected)
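A toy model of the combined proposal (handleSelect=false plus an allowlisting dispatcher) might look like this; the handler names and allowlist contents are assumptions for illustration, not actual Solr code:

```java
import java.util.Set;

// Toy model of request dispatch under the proposal -- not actual Solr code.
public class QtDispatchSketch {
    // hypothetical allowlist a DispatchingRequestHandler might be configured with
    static final Set<String> ALLOWED_QT = Set.of("standard", "mlt");

    /** returns the name of the handler that serves the request, or null if the request fails */
    static String dispatch(Set<String> registeredHandlers, boolean handleSelect,
                           String path, String qt) {
        if (registeredHandlers.contains(path)) {
            return path;                       // path-named handlers always win
        }
        if (!"/select".equals(path) || !handleSelect) {
            return null;                       // no SolrServlet to fall through to anymore
        }
        String name = (qt == null) ? "standard" : qt;
        // the dispatcher (or an allowlisting handler) only forwards to permitted names
        return (ALLOWED_QT.contains(name) && registeredHandlers.contains(name)) ? name : null;
    }

    public static void main(String[] args) {
        Set<String> handlers = Set.of("/browse", "standard", "mlt");
        System.out.println(dispatch(handlers, false, "/browse", null));       // /browse
        System.out.println(dispatch(handlers, false, "/select", "standard")); // null (fails)
        System.out.println(dispatch(handlers, true, "/select", "mlt"));       // mlt
    }
}
```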

...on the whole, compared to the original suggestion in this issue, that seems 
a lot safer for people who want safety, and a lot simpler to document.

comments? 


> Use of 'qt' should be restricted to searching and should not start with a '/'
> -
>
> Key: SOLR-3161
> URL: https://issues.apache.org/jira/browse/SOLR-3161
> Project: Solr
>  Issue Type: Improvement
>  Components: search, web gui
>Reporter: David Smiley
>Assignee: David Smiley
> Fix For: 3.6, 4.0
>
>
> I haven't yet looked at the code involved for suggestions here; I'm speaking 
> based on how I think things should work and not work, based on intuitiveness 
> and security. In general I feel it is best practice to use '/' leading 
> request handler names and not use "qt", but I don't hate it enough when used 
> in limited (search-only) circumstances to propose its demise. But if someone 
> proposes its deprecation that then I am +1 for that.
> Here is my proposal:
> Solr should error if the parameter "qt" is supplied with a leading '/'. 
> (trunk only)
> Solr should only honor "qt" if the target request handler extends 
> solr.SearchHandler.
> The new admin UI should only use 'qt' when it has to. For the query screen, 
> it could present a little pop-up menu of handlers to choose from, including 
> "/select?qt=mycustom" for handlers that aren't named with a leading '/'. This 
> choice should be positioned at the top.
> And before I forget, me or someone should investigate if there are any 
> similar security problems with the shards.qt parameter. Perhaps shards.qt can 
> abide by the same rules outlined above.
> Does anyone foresee any problems with this proposal?
> On a related subject, I think the notion of a default request handler is bad 
> - the default="true" thing. Honestly I'm not sure what it does, since I 
> noticed Solr trunk redirects '/solr/' to the new admin UI at '/solr/#/'. 
> Assuming it doesn't do anything useful anymore, I think it would be clearer 
> to use <requestHandler name="/select" class="solr.SearchHandler"> instead of 
> what's there now. The delta is to put the leading '/' on this request handler 
> name, and remove the "default" attribute.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3804) Swap Features and News on the website.

2012-02-21 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13212866#comment-13212866
 ] 

Hoss Man commented on LUCENE-3804:
--

There's a couple of problems here we need to address...

1) features are listed on /solr/index.html, but there is also a right nav link 
to /solr/features.html
2) duplicate content on both /solr/features.html and /solr/index.html that will 
only increase that confusion
3) "Title" metadata from features.mdtext appearing in the body of /solr/index.html
4) #1 & #2 both affect the /core/... urls as well (though that features.mdtext 
evidently doesn't use the "Title" attribute)

The fixes i would suggest are...

a) add a redirect for /solr/features.html -> /solr/ (and likewise for core)
b) remove "Features" from the right nav
c) either remove the "Title" metadata from pages being included, or stop doing 
this as an include and put the content directly in index.mdtext -- i would 
suggest the latter since it will make it more straightforward for editing in 
the future.

> Swap Features and News on the website.
> --
>
> Key: LUCENE-3804
> URL: https://issues.apache.org/jira/browse/LUCENE-3804
> Project: Lucene - Java
>  Issue Type: Improvement
>Reporter: Mark Miller
>Assignee: Mark Miller
>
> I think we can do even better, but that is a nice, easy incremental 
> improvement.




[jira] [Commented] (SOLR-3142) remove O(n^2) slow slow indexing defaults in DataImportHandler

2012-02-21 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13212761#comment-13212761
 ] 

Hoss Man commented on SOLR-3142:


agreed, was just noting why i _think_ the original default was true..

> remove O(n^2) slow slow indexing defaults in DataImportHandler
> --
>
> Key: SOLR-3142
> URL: https://issues.apache.org/jira/browse/SOLR-3142
> Project: Solr
>  Issue Type: Bug
>Reporter: Robert Muir
>Assignee: Robert Muir
> Fix For: 3.6, 4.0
>
> Attachments: SOLR-3142.patch
>
>
> By default, dataimporthandler optimizes the entire index when it commits.
> This is bad for performance, because it means by default its doing a very
> heavy index-wide operation even for an incremental update... essentially 
> O(n^2) indexing.
> All that is needed is to set optimize=false by default. If someone wants
> to optimize, they can either set optimize=true or explicitly optimize 
> themselves.




[jira] [Commented] (SOLR-3142) remove O(n^2) slow slow indexing defaults in DataImportHandler

2012-02-21 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13212752#comment-13212752
 ] 

Hoss Man commented on SOLR-3142:


FWIW: I'm pretty sure the original assumption here was that in the (relatively 
common) usecase of doing a full-import rebuild on a regular basis (ie: nightly) 
that it can be handy to have it auto-optimize when you are done.   I think the 
real problem is that that assumption was never challenged regarding things 
like delta import.

so an argument could be made that the default should still be optimize=true on 
full-import, and optimize=false on delta import ... but i'm not going to make 
that argument, i think it's silly to assume true in either case.  
(particularly since a parameterized full import might actually be a rapidly 
repeating incremental)

> remove O(n^2) slow slow indexing defaults in DataImportHandler
> --
>
> Key: SOLR-3142
> URL: https://issues.apache.org/jira/browse/SOLR-3142
> Project: Solr
>  Issue Type: Bug
>Reporter: Robert Muir
>Assignee: Robert Muir
> Fix For: 3.6, 4.0
>
> Attachments: SOLR-3142.patch
>
>
> By default, dataimporthandler optimizes the entire index when it commits.
> This is bad for performance, because it means by default its doing a very
> heavy index-wide operation even for an incremental update... essentially 
> O(n^2) indexing.
> All that is needed is to set optimize=false by default. If someone wants
> to optimize, they can either set optimize=true or explicitly optimize 
> themselves.




[jira] [Commented] (LUCENE-3792) Remove StringField

2012-02-16 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209630#comment-13209630
 ] 

Hoss Man commented on LUCENE-3792:
--

StrawMan suggestion off the top of my head:

* rename NOT_ANALYZED to something like KEYWORD_ANALYZED
* document KEYWORD_ANALYZED as being a convenience flag (and/or optimization?) 
for achieving equivalent behavior as using PerFieldAnalyzer with 
KeywordAnalyzer for this field, and keep using / re-word rmuir's patch warning 
to make it clear that if you use this at index time, any attempts to construct 
queries against it using the QueryParser will need KeywordAnalyzer.

...would that flag name == analyzer name equivalence help people remember not 
to get trapped by this?
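The trap being described is an index-time/query-time analysis mismatch. A dependency-free toy (standing in for KeywordAnalyzer vs a tokenizing analyzer -- these are not Lucene APIs) shows why the query side must match:

```java
import java.util.Arrays;
import java.util.List;

// Toy stand-ins for analysis, showing the NOT_ANALYZED / KeywordAnalyzer equivalence.
// These are NOT Lucene APIs -- just an illustration of the token streams involved.
public class KeywordVsTokenized {
    /** rough stand-in for a tokenizing analyzer: split on whitespace and lowercase */
    static List<String> tokenized(String text) {
        return Arrays.asList(text.toLowerCase().split("\\s+"));
    }

    /** what NOT_ANALYZED (and KeywordAnalyzer) effectively produce: one token, verbatim */
    static List<String> keyword(String text) {
        return List.of(text);
    }

    public static void main(String[] args) {
        // indexed with keyword("Foo Bar") -> single term "Foo Bar"; a query parsed
        // with the tokenizing analyzer produces [foo, bar] and matches nothing
        System.out.println(keyword("Foo Bar"));    // [Foo Bar]
        System.out.println(tokenized("Foo Bar"));  // [foo, bar]
    }
}
```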

> Remove StringField
> --
>
> Key: LUCENE-3792
> URL: https://issues.apache.org/jira/browse/LUCENE-3792
> Project: Lucene - Java
>  Issue Type: Task
>Affects Versions: 4.0
>Reporter: Robert Muir
> Fix For: 4.0
>
> Attachments: LUCENE-3792_javadocs_3x.patch, 
> LUCENE-3792_javadocs_3x.patch
>
>
> Often on the mailing list there is confusion about NOT_ANALYZED.
> Besides being useless (Just use KeywordAnalyzer instead), people trip up on 
> this
> not being consistent at query time (you really need to configure 
> KeywordAnalyzer for the field 
> on your PerFieldAnalyzerWrapper so it will do the same thing at query time... 
> oh wait
> once you've done that, you dont need NOT_ANALYZED).
> So I think StringField is a trap too for the same reasons, just under a 
> different name, lets remove it.




[jira] [Commented] (SOLR-3005) Content-Type disappear

2012-02-13 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207236#comment-13207236
 ] 

Hoss Man commented on SOLR-3005:


Chris: +1, commit.

> Content-Type disappear
> --
>
> Key: SOLR-3005
> URL: https://issues.apache.org/jira/browse/SOLR-3005
> Project: Solr
>  Issue Type: Bug
>  Components: Response Writers
>Affects Versions: 3.5
> Environment: Solr 3.5.0
>Reporter: Gasol Wu
>Assignee: Chris Male
> Attachments: SOLR-3005.patch, SOLR-3005.patch
>
>
> i expect that query always return Content-Type, but after SOLR-1123 had 
> committed, it got chance to return nothing if you leave out all of 
> queryResponseWriter in solrconfig.xml. however i attach a small patch that 
> will correct this situation.
> It look like DEFAULT_RESPONSE_WRITERS never called init method in 
> org.apache.solr.core.SolrCore




[jira] [Commented] (LUCENE-3750) Convert Versioned docs to Markdown/New CMS

2012-02-09 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205186#comment-13205186
 ] 

Hoss Man commented on LUCENE-3750:
--

bq. I'll contact you on IRC and we can work through it.

I realize now i did not finish explaining the concerns i was trying to get 
to in that first para.

my comment was not intended to mean "i can't get local doc building to work for 
the www site, therefore i'm leery of using local doc building for the versioned 
docs".  my comment was meant to be "if 1 out of N committers who have tried doing 
local www site builds can't get it to work, then that doesn't bode well for 
asking non-committer contributors to be able to use the same markdown tools to 
do local doc building of versioned docs that they might want to contribute 
patches for."

that smells like a big red flag to me.

> Convert Versioned docs to Markdown/New CMS
> --
>
> Key: LUCENE-3750
> URL: https://issues.apache.org/jira/browse/LUCENE-3750
> Project: Lucene - Java
>  Issue Type: Improvement
>Reporter: Grant Ingersoll
>Priority: Minor
>
> Since we are moving our main site to the ASF CMS (LUCENE-2748), we should 
> bring in any new versioned Lucene docs into the same format so that we don't 
> have to deal w/ Forrest anymore.




[jira] [Commented] (SOLR-3076) Solr should support block joins

2012-02-09 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205156#comment-13205156
 ] 

Hoss Man commented on SOLR-3076:


bq. Maybe there can be field aliases? Eg, book_page_count:[0 to 1000] and 
chapter_page_count[10:40], and the QP is told to map book_page_count -> 
parent:size and chapter_page_count -> child:size? Or maybe we let the user 
explicitly scope the field, eg chapter:size, book:size, book:title, etc. Not 
sure...

Hmmm... i kind of understand what you're saying; but the part i'm not 
understanding is even if you had field aliasing like that, given some query 
string like... 
{code}
  book_page_count:[0 TO 1000] and chapter_page_count:[10 TO 40]
{code}
...how would the parser know whether the user was asking for the results to be 
"book documents" matching that criteria (1-1000 pages and containing at least 
one chapter child containing 10-40 pages), or "chapter documents" matching that 
criteria (10-40 pages contained in a book of 1-1000 pages) or "page documents" 
(all pages contained in a chapter of 10-40 total pages, contained in a book 
of 1-1000 total pages) ?

I mean: it seems possible, and a QParser like that could totally support 
configuring those types of field mappings / hierarchy definitions in init 
params, but perhaps we should focus on the more explicit, direct-mapping 
QParser approach Mikhail has already started on for now, and consider 
that as an enhancement later?  (especially since it's not clear how the 
indexing side will be managed/enforced -- depending on how that shapes up, it 
might be possible for a QParser like you're describing, or perhaps _all_ 
QParsers to infer the field rules from the schema or some other configuration)

I think the syntax in Mikhail's BlockJoinParentQParserPlugin looks great as a 
straightforward baseline implementation.  The one straw man suggestion i might 
toss out there for consideration would be to invert the use of the "filter" and 
"v" local params, so instead of...

{code}
{!parent filter="parent:true"}child_name:b
{!parent filter="parent:true"}
{code}

...it might be...

{code}
{!parent of="child_names:b"}parent:true
{!parent}parent:true
{code}

...people may find that easier to read as a way to understand that the final 
query will return "parent documents" constrained such that those parent 
documents have children matching the "of" query.  The one thing i don't like 
about this "of" idea is that (compared to the "filter" param Mikhail uses) it 
might be more tempting for people to use something like...

{code}
// WRONG! (i think)
q={!parent of="child_names:b"}some_parent_field:foo
{code}

...when they mean to write something like this...

{code}
q={!parent of="child_names:b"}some_query_that_identifies_the_set_of_all_parents
fq=some_parent_field:foo
{code}

...because as i understand it, it's important for the "parentFilter" to 
identify *all* of the parent documents, even ones you may not want returned, so 
that the ToParentBlockJoinQuery knows how to identify the parent of each 
document (correct?)

This type of user confusion is still possible with the syntax Mikhail's got, 
but i suspect it will be less likely --- In any case, i wanted to put the idea 
out there.
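A toy model helps show why the parent filter must match *all* parents: with block-indexed docs (children first, their parent immediately after), the join finds each child's parent by scanning forward to the next doc the filter marks as a parent. This is an illustration of the mechanism, not the actual ToParentBlockJoinQuery code:

```java
// Toy model of parent resolution in a block join (docs indexed children-first,
// parent last within each block). Not the actual ToParentBlockJoinQuery code.
public class ParentFilterSketch {
    /** returns the docid of the first doc after childDoc that the filter marks as a parent */
    static int parentOf(boolean[] isParent, int childDoc) {
        for (int d = childDoc + 1; d < isParent.length; d++) {
            if (isParent[d]) return d;
        }
        return -1; // no parent found
    }

    public static void main(String[] args) {
        // docs 0,1 are children of doc 2; doc 3 is a child of doc 4
        boolean[] complete = {false, false, true, false, true};
        System.out.println(parentOf(complete, 0)); // 2
        // a filter that misses parent doc 2 silently hands child 0 to doc 4
        boolean[] missingOne = {false, false, false, false, true};
        System.out.println(parentOf(missingOne, 0)); // 4 -- wrong
    }
}
```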

Given McCandless' supposition that the parent/child relationships are likely to 
be very consistent, not very deep, and not vary from query to query, one thing 
we could do to help mitigate this possible confusion would be:
 * make the "filter" param name much longer and verbose, ie: 
{{setOfAllParentsQuery}}
 * make the param optional, and have it default to something specified as an 
init param, ie: {{defaultSetOfAllParentsQuery}}
 * make the init param mandatory

That way, in the common case people will configure things like...

{code}
<queryParser name="parent" class="solr.BlockJoinParentQParserPlugin">
  <str name="defaultSetOfAllParentsQuery">type:parent</str>
</queryParser>
{code}

...and their queries will be simple...

{code}
q={!parent}  (all parent docs)
q={!parent}foo:bar   (all parent docs that contain kid docs matching 
foo:bar)
{code}

...but it will still be possible for people with more complex usecases with do 
more complex things.

Mikhail: some other minor feedback on the parts of your patch that i 
understood (note: my lack of understanding is not a fault of your patch, it's 
just that most of the block join stuff is very foreign to me)...

* please prune down "solrconfig-bjqparser.xml" so it contains only the absolute 
minimum things you need for the test case, it makes it a lot easier for people 
to review the patch, and for users to understand what is necessary to utilize 
features demoed in the test (we have a lot of old bloated solrconfig files in 
the test dir, but we're trying to stop doing that)
* the test would be a bit easier to follow if you used different letters for 
the parent fields vs the child fields (abcdef, vs xyz for example)
* it would be good to have tests verifying that nested parent queries 

[jira] [Commented] (SOLR-2802) Toolkit of UpdateProcessors for modifying document values

2012-02-09 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205114#comment-13205114
 ] 

Hoss Man commented on SOLR-2802:


bq. In my opinion, I think if a user asks for min or max or some other 
computation, and this is not possible, it should return an error? otherwise why 
did they configure this in their chain? 

Agreed, i'm not sure what usecase i had in mind when i wrote min/max to "pass 
through" in this situation, but failing hard is definitely better -- at least 
by default.  If someone comes up with a reason not to fail, we can always add 
an option later.

I've committed this change in r1242625.

bq. I think min/max should not extend this type-unsafe Subset base, as they 
should not return a subset anyway, but a singleton, and the input must be 
comparable... 

If you'd like to take a stab at refactoring by all means be my guest.  It's 
true, these instances don't need to return a subset, but even if we change them 
to not subclass that particular base class, I don't see any simple way to 
rewrite them such that they only accept a Collection.  
UpdateProcessors deal with SolrInputDocuments & SolrInputFields that are just 
bags of objects; the schema hasn't been consulted yet, so we don't have any 
hard type information about the types of these Objects (and even if we could we 
wouldn't want to consult the schema yet, because some of these "fields" might 
be for input purposes only -- some UpdateProcessor down the pipe might be 
copying/moving them to different fields).

So if you want these Min/Max processors to have APIs that strictly enforce 
Collection<Comparable>, then some code somewhere needs to check that and 
cast appropriately -- at the moment, they delegate that responsibility to 
Collections.min and Collections.max, because that class does that check anyway 
as it does its computation.  

Personally i think the current impl is better anyway because in the common case 
of clients sending "clean data" we don't waste cycles checking the type of 
every Object sent before asking Collections to find the min/max and doing 
the check again anyway.  if an exceptional case happens, we catch/log/wrap the 
exception.
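The delegation being described can be sketched in plain Java (a toy, not the actual processor code): Collections.min performs the comparability check itself, so the caller only needs to translate the resulting ClassCastException into a hard failure:

```java
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Toy sketch of the "delegate the type check to Collections.min" approach.
public class MinMaxSketch {
    @SuppressWarnings({"unchecked", "rawtypes"})
    static Object minOrFail(Collection<?> values) {
        try {
            // Collections.min throws ClassCastException if the values
            // aren't mutually comparable -- no pre-checking pass needed
            return Collections.min((Collection) values);
        } catch (ClassCastException e) {
            throw new RuntimeException("field values are not mutually comparable", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(minOrFail(List.of(3, 1, 2))); // 1
    }
}
```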

> Toolkit of UpdateProcessors for modifying document values
> -
>
> Key: SOLR-2802
> URL: https://issues.apache.org/jira/browse/SOLR-2802
> Project: Solr
>  Issue Type: New Feature
>Reporter: Hoss Man
>Assignee: Hoss Man
> Fix For: 4.0
>
> Attachments: SOLR-2802_update_processor_toolkit.patch, 
> SOLR-2802_update_processor_toolkit.patch, 
> SOLR-2802_update_processor_toolkit.patch, 
> SOLR-2802_update_processor_toolkit.patch, 
> SOLR-2802_update_processor_toolkit.patch
>
>
> Frequently users ask questions about things where the answer is "you 
> could do it with an UpdateProcessor" but the number of out of the box 
> UpdateProcessors is generally lacking and there aren't even very good base 
> classes for the common case of manipulating field values when adding documents




[jira] [Commented] (SOLR-3076) Solr should support block joins

2012-02-09 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204868#comment-13204868
 ] 

Hoss Man commented on SOLR-3076:


bq. Maybe instead of opening up end-user QP syntax to control block joins, 
there should be configuration that tells the query parser how to create the 
parent bits filter, which fields are in "child scope" vs "parent scope", etc.? 

wouldn't that still need to be a query time option to be useful for the full 
capability of block join??

from what i remember about block join:
* you can have arbitrary depth of parent->child->grandchild->etc...
* there's nothing preventing parents and children having the same fields

...correct?

so wouldn't it be kind of limiting if those types of options were configuration 
that couldn't be done per query/filter?  (ie: in this fq i want only docs whose 
parents are size:[0 TO 1000] but in this other fq i want docs who are 
themselves size:[10 TO 40] ... if perhaps "parents" are books and the children 
being queried are "chapters" and "size" is number of pages)


> Solr should support block joins
> ---
>
> Key: SOLR-3076
> URL: https://issues.apache.org/jira/browse/SOLR-3076
> Project: Solr
>  Issue Type: New Feature
>Reporter: Grant Ingersoll
> Attachments: SOLR-3076.patch, bjq-vs-filters-backward-disi.patch, 
> bjq-vs-filters-illegal-state.patch, parent-bjq-qparser.patch, 
> parent-bjq-qparser.patch, solrconf-bjq-erschema-snippet.xml
>
>
> Lucene has the ability to do block joins, we should add it to Solr.




[jira] [Commented] (LUCENE-3750) Convert Versioned docs to Markdown/New CMS

2012-02-09 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204827#comment-13204827
 ] 

Hoss Man commented on LUCENE-3750:
--

FWIW: while i love the ability to do quick edits using the CMS bookmarklet, i 
have had absolutely zero success doing a full site build on my local machine -- 
i'm not even getting errors, i'm just getting no output.

this may just be something really flaky about my machine, but it begs the 
question: for our _versioned_ per release documentation, is there really any 
advantage to using markdown over simply editing HTML directly?

we used forrest for the versioned docs solely because once upon a time the 
entire site used to be versioned, and it was easy to just extract the versioned 
docs from the non-versioned docs and keep using forrest for both and have a 
similar look/feel, but is there really any reason why the versioned docs we 
ship in the release need to be in markdown and/or have the same look/feel as 
the website?  yes, we'll archive them on the website for reference, but we also 
archive the javadocs and we've never tried to make them look like the forrest 
docs (or the new site style)

why not just leave the versioned docs looking very simple, and very distinctive 
from the website, so when people browsing them online see them it's very 
obvious that they are *not* just part of the website, they are a snapshot of 
per version documentation?



> Convert Versioned docs to Markdown/New CMS
> --
>
> Key: LUCENE-3750
> URL: https://issues.apache.org/jira/browse/LUCENE-3750
> Project: Lucene - Java
>  Issue Type: Improvement
>Reporter: Grant Ingersoll
>Priority: Minor
>
> Since we are moving our main site to the ASF CMS (LUCENE-2748), we should 
> bring in any new versioned Lucene docs into the same format so that we don't 
> have to deal w/ Forrest anymore.




[jira] [Commented] (SOLR-3105) Add analysis configurations for different languages to the example

2012-02-07 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13203115#comment-13203115
 ] 

Hoss Man commented on SOLR-3105:


I'm with robert ... this issue is about coming up with good example configs for 
as many languages as we can.  at the moment we have one big fat kitchen-sink 
set of example configs, so lets use what we've got.

If people care strongly, we can track cleaning up and re-organizing the 
examples (to use xinclude, or add multiple more specifically targeted sets of 
example configs, etc...) in a different issue.




> Add analysis configurations for different languages to the example
> --
>
> Key: SOLR-3105
> URL: https://issues.apache.org/jira/browse/SOLR-3105
> Project: Solr
>  Issue Type: Improvement
>Reporter: Robert Muir
> Fix For: 3.6, 4.0
>
> Attachments: SOLR-3105.patch
>
>
> I think we should have good baseline configurations for our supported 
> analyzers
> so that its easy for people to get started.




[jira] [Commented] (SOLR-3095) update processor chain should check for "enable" attribute on all processors

2012-02-03 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200268#comment-13200268
 ] 

Hoss Man commented on SOLR-3095:


I started trying to hack together a patch for this and then realized: i'm 
pretty sure this already works because of how PluginInfo.loadSubPlugins works 
(it ignores "children" for which isEnabled() is false)

so all we need is a test case to verify and future proof.
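A toy model of the behavior being relied on (sub-plugin declarations whose enable attribute is false get skipped); the attribute name and map-based structure are assumptions for illustration, not the actual PluginInfo code:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Toy model of "loadSubPlugins ignores disabled children" -- not actual Solr code.
public class EnableFilterSketch {
    /** each child declaration is modeled as a map of its xml attributes */
    static List<Map<String, String>> enabledOnly(List<Map<String, String>> children) {
        return children.stream()
                .filter(c -> !"false".equals(c.get("enable")))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Map<String, String>> chain = List.of(
                Map.of("class", "solr.LogUpdateProcessorFactory"),
                Map.of("class", "solr.CustomFactory", "enable", "false")); // hypothetical
        System.out.println(enabledOnly(chain).size()); // 1
    }
}
```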

> update processor chain should check for "enable" attribute on all processors
> 
>
> Key: SOLR-3095
> URL: https://issues.apache.org/jira/browse/SOLR-3095
> Project: Solr
>  Issue Type: Improvement
>Reporter: Hoss Man
>
> many types of plugins in Solr allow you to specify an "enabled" boolean when 
> configuring them, so you can use system properties in the configuration file 
> to determine at run time if they are actually used -- we should add low level 
> support for this type of setting on the individual processor declarations in 
> the UpdateRequestProcessorChain as well, so individual update processor 
> factories don't have to deal with this.




[jira] [Commented] (SOLR-3017) Allow edismax stopword filter factory implementation to be specified

2012-02-03 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200264#comment-13200264
 ] 

Hoss Man commented on SOLR-3017:


bq. Nope. I've just committed this change in trunk. There wasn't a good reason 
to use a more specific type (and it was not used anywhere).

FWIW: I'm pretty sure the only reason any of these factories are declared to 
return specific types (instead of just TokenStream) was SOLR-396 -- which isn't 
really that important now that lucene & solr development is in a single repo 
and people can easily commit factories at the same time that new impls are 
added.

> Allow edismax stopword filter factory implementation to be specified
> 
>
> Key: SOLR-3017
> URL: https://issues.apache.org/jira/browse/SOLR-3017
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 4.0
>Reporter: Michael Dodsworth
>Priority: Minor
> Fix For: 4.0
>
> Attachments: SOLR-3017-without-guava-alternative.patch, 
> SOLR-3017.patch, SOLR-3017.patch, edismax_stop_filter_factory.patch
>
>
> Currently, the edismax query parser assumes that stopword filtering is being 
> done by StopFilter: the removal of the stop filter is performed by looking 
> for an instance of 'StopFilterFactory' (hard-coded) within the associated 
> field's analysis chain.
> We'd like to be able to use our own stop filters whilst keeping the edismax 
> stopword removal goodness. The supplied patch allows the stopword filter 
> factory class to be supplied as a param, "stopwordFilterClassName". If no 
> value is given, the default (StopFilterFactory) is used.
> Another option I looked into was to extend StopFilterFactory to create our 
> own filter. Unfortunately, StopFilterFactory's 'create' method returns 
> StopFilter, not TokenStream. StopFilter is also final.




[jira] [Commented] (SOLR-3047) DisMaxQParserPlugin drops my field in the phrase field list (pf) if it uses KeywordTokenizer instead of StandardTokenizer or Whitespace

2012-02-03 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200130#comment-13200130
 ] 

Hoss Man commented on SOLR-3047:


bq. Hoss, is there a way I can send you the example privately?

[I'd rather not|https://people.apache.org/~hossman/#private_q]

if you can't share the configs you are using, can't you at least add a quick 
example demonstrating your problem to the example schema.xml and post that?

I just tried this example from Solr 3.5.0 (alphaNameSort uses KeywordTokenizer) 
and got exactly what i expected...

{code}
http://localhost:8983/solr/select?debugQuery=true&defType=dismax&qf=name&pf=alphaNameSort&q=foo%20bar%20baz

+((DisjunctionMaxQuery((name:foo)) 
   DisjunctionMaxQuery((name:bar)) 
   DisjunctionMaxQuery((name:baz))
  )~3
 ) 
 DisjunctionMaxQuery((alphaNameSort:foobarbaz))
{code}

> DisMaxQParserPlugin drops my field in the phrase field list (pf) if it uses 
> KeywordTokenizer instead of StandardTokenizer or Whitespace
> ---
>
> Key: SOLR-3047
> URL: https://issues.apache.org/jira/browse/SOLR-3047
> Project: Solr
>  Issue Type: Bug
>Reporter: Antony Stubbs
>
> Has this got something to do with the minimum clause = 2 part in the code? It 
> drops it without warning - IMO it should error out if the field isn't 
> compatible.
> If it is on purpose - i don't see why. I split with the ngram token filter, 
> so there is def more than 1 clause in the indexed field.




[jira] [Commented] (SOLR-3028) Support for additional query operators (feature parity request)

2012-02-03 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200119#comment-13200119
 ] 

Hoss Man commented on SOLR-3028:


#1) maybe I'm misunderstanding SOLR-2866 ... it talks about synonyms, but the 
crux of it is really indexing multiple variants of a stemmed word with 
information about whether it is a stem or not, and then being able to query on 
both -- your request seems to heavily overlap with that -- in Victor's case he 
may be using a dictionary-based stemmer, and in your case you may want a 
heuristic stemmer, but the underlying plumbing should probably all be the same.

#2) sorry, yeah I missed your label and only looked at the example.  quorum 
search is definitely possible using the dismax parser with the mm param, but 
there is no explicit syntax for it in any parser I know of at the moment.

#3) the curly braces in that example were just me being explicit about which 
parser was in use via local params -- that's not the query syntax.  you could 
just as easily do...

{code}
defType=surround&q=(this W that) AND (other W next)
{code}

In general, my suggestion for moving forward would be to break these individual 
requests out into 3 distinct issues, since they are largely unrelated (or only 
open two issues and ask about #1 in SOLR-2866 .. make an offshoot issue as 
needed)

individual issues with more direct issue summaries are easier to track and more 
likely to encourage patches from people who see the summaries and realize it's 
something they are interested in.

> Support for additional query operators (feature parity request)
> ---
>
> Key: SOLR-3028
> URL: https://issues.apache.org/jira/browse/SOLR-3028
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Affects Versions: 4.0
>Reporter: Mike
>  Labels: operator, queryparser
>   Original Estimate: 6h
>  Remaining Estimate: 6h
>
> I'm migrating my system from Sphinx Search, and there are a couple of 
> operators that are not available to Solr, which are available in Sphinx. 
> I would love to see the following added to the Dismax parser:
> 1. Exact match. This might be tricky to get right, since it requires work on 
> the index side as well[1], but in Sphinx, you can do a query such as [ 
> =running walking ], and running will have stemming off, while walking will 
> have it on. 
> 2. Term quorum. In Sphinx and some commercial search engines (like Recommind, 
> Westlaw and Lexis), you can do a search such as [ (cat dog goat)/15 ], and 
> find the three words within 15 terms of each other. I think this is possible 
> in the backend via the span query, but there's no front end option for it, so 
> it's quite hard to reveal to users.
> 3. Word order. Being able to say, "this term before that one, and this other 
> term before the next" is something else in Sphinx that span queries support, 
> but is missing in the query parser. Would be great to get this in too.
> These seem like the three biggest missing operators in Solr to me. I would 
> love to help move these forward if there is any way I can help.
> [1] At least, *I* think it does. There's some discussion of one way of doing 
> exact match like support in SOLR-2866.




[jira] [Commented] (SOLR-2649) MM ignored in edismax queries with operators

2012-02-03 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200102#comment-13200102
 ] 

Hoss Man commented on SOLR-2649:


bq. Counting multiple terms as 1 because they are in parenthesis together 
doesn't seem like a good idea to me.

I disagree, but it definitely just seems like a matter of opinion -- I don't 
know that we could ever come up with something that makes sense in all use cases.

personally I think the sanest change would be to say that "mm" applies to all 
top-level SHOULD clauses in the query (regardless of whether they have an 
explicit OR or not) -- exactly as it always has in dismax.  If a top-level 
clause is a nested boolean query, then "mm" shouldn't apply to it, because 
it doesn't make sense to blur the "count" of how many SHOULD clauses there are 
at the various levels.

What would mm=5 mean for a query like "q=X AND Y (a b) (c d) (e f) (g h)" if 
you looked at all the nested subqueries?  that only 5 of those 8 (lowercase) 
leaf-level clauses are required?  how would that be implemented on the 
underlying BooleanQuery objects w/o completely flattening the query (which 
would break the intent of the user when they grouped them)? ... it seems like 
mm=5 (or mm=100%) should mean 5 (or 100%) of the top-level SHOULD clauses are 
required ... the default query op should determine how any top-level clauses 
that are BooleanQueries are dealt with.

...but that's just my opinion.  
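As a rough sketch of the counting rule argued for above (plain Java, not Lucene's actual BooleanQuery API; the method names are made up for illustration):

```java
// Sketch of the proposed rule: mm counts top-level SHOULD clauses only,
// with each parenthesized group collapsed to a single clause.
public class MmSketch {
    // Clauses actually required when mm is an absolute count;
    // an mm larger than the clause count can only require all of them.
    static int requiredClauses(int topLevelShouldClauses, int mm) {
        return Math.min(mm, topLevelShouldClauses);
    }

    // Clauses required when mm is a percentage, e.g. "100%" -> 1.0.
    static int requiredClausesPct(int topLevelShouldClauses, double pct) {
        return (int) Math.ceil(pct * topLevelShouldClauses);
    }

    public static void main(String[] args) {
        // q=X AND Y (a b) (c d) (e f) (g h):
        // X and Y are MUST; the four groups are 4 top-level SHOULD clauses,
        // not 8 leaf clauses.
        System.out.println(requiredClauses(4, 5));      // prints 4
        System.out.println(requiredClausesPct(4, 1.0)); // prints 4
    }
}
```

Under this reading, mm=5 and mm=100% both resolve to "all 4 groups required", and the grouping the user wrote is preserved.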



> MM ignored in edismax queries with operators
> 
>
> Key: SOLR-2649
> URL: https://issues.apache.org/jira/browse/SOLR-2649
> Project: Solr
>  Issue Type: Bug
>  Components: search
>Affects Versions: 3.3
>Reporter: Magnus Bergmark
>Priority: Minor
>
> Hypothetical scenario:
>   1. User searches for "stocks oil gold" with MM set to "50%"
>   2. User adds "-stockings" to the query: "stocks oil gold -stockings"
>   3. User gets no hits since MM was ignored and all terms were AND-ed 
> together
> The behavior seems to be intentional, although the reason why is never 
> explained:
>   // For correct lucene queries, turn off mm processing if there
>   // were explicit operators (except for AND).
>   boolean doMinMatched = (numOR + numNOT + numPluses + numMinuses) == 0; 
> (lines 232-234 taken from 
> tags/lucene_solr_3_3/solr/src/java/org/apache/solr/search/ExtendedDismaxQParserPlugin.java)
> This makes edismax unsuitable as an replacement to dismax; mm is one of the 
> primary features of dismax.




[jira] [Commented] (SOLR-2368) Improve extended dismax (edismax) parser

2012-02-03 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200079#comment-13200079
 ] 

Hoss Man commented on SOLR-2368:


bq. If/when eDismax can be configured to fill the original role of DisMax, why 
should we maintain the old one?

my chief concerns -- as i mentioned -- are that _currently_ edismax has 
behavior dismax doesn't support that people may actively *not* want, and that 
edismax may have quirks dismax doesn't that we have yet to discover and don't 
realize, because the overall test coverage is low, the EDisMaxQParser is 
significantly more complex, and there are so many weird edge cases.

But sure: if SOLR-3086 makes it possible to configure EDisMaxQParser to behave 
the same as DisMaxQParser, and if we feel confident through testing that (when 
configured as such) they behave the same, I won't have any objections 
whatsoever to retiring the DisMaxQParser class to simplify code maintenance.

bq. Personally I don't think we should worry about the added features after 
edismax becomes dismax.

this part i don't understand ... even if all of the functionality ultimately 
merges and only the EDisMaxQParser remains, why should defType=dismax and 
defType=edismax suddenly become the same thing?  why not offer two instances by 
default: "edismax", which is open and everything defaults to on, and "dismax", 
where it's more locked down like it is today? ... what is gained by changing 
the default behavior when people use "defType=dismax"?  

(as i said before, in a slightly diff way above: would you suggest that 
defType=lucene should now be an EDisMaxQParser instance as well? with a 
CHANGES.txt note telling people that if they only want the features LuceneQParser 
supported, they have to add invariant params to disable them?)


> Improve extended dismax (edismax) parser
> 
>
> Key: SOLR-2368
> URL: https://issues.apache.org/jira/browse/SOLR-2368
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Reporter: Yonik Seeley
>  Labels: QueryParser
>
> This is a "mother" issue to track further improvements for eDismax parser.
> The goal is to be able to deprecate and remove the old dismax once edismax 
> satisfies all usecases of dismax.




[jira] [Commented] (SOLR-3026) eDismax: Locking down which fields can be explicitly queried (user fields aka uf)

2012-02-03 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200067#comment-13200067
 ] 

Hoss Man commented on SOLR-3026:


bq. If we aim to let edismax replace dismax, people may want it to behave like 
dismax out of the box

I don't think that should be the goal.  plenty of people are using "edismax" 
already because they like the fact that it is a super set of the dismax & 
lucene features, and the defaults for "edismax" should embrace that.  

if/when EDisMaxQParser reaches the point that it can be configured to work 
exactly the same as DisMaxQParser, then it may be worth considering defaulting 
"dismax" => an EDisMaxQParser instance configured that way, but that doesn't 
mean "edismax" shouldn't expose all of its bells and whistles by default.

uf=* as a default should be fine -- the only reason to question it would be if 
it was hard to disable, but the "-*" syntax is so easy it's not worth worrying 
about.
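As a sketch of what that looks like in request params (the field name here is made up; the syntax follows the uf patch on this issue):

{code}
defType=edismax&uf=* -secret_field&q=title:solr secret_field:leak
{code}

With uf=* -secret_field, every field stays queryable except secret_field, so the "secret_field:leak" clause should no longer be interpreted as a fielded search.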

> eDismax: Locking down which fields can be explicitly queried (user fields aka 
> uf)
> -
>
> Key: SOLR-3026
> URL: https://issues.apache.org/jira/browse/SOLR-3026
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Affects Versions: 3.1, 3.2, 3.3, 3.4, 3.5
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
> Fix For: 3.6, 4.0
>
> Attachments: SOLR-3026.patch, SOLR-3026.patch, SOLR-3026.patch, 
> SOLR-3026.patch, SOLR-3026.patch, SOLR-3026.patch, SOLR-3026.patch
>
>
> We need a way to specify exactly what fields should be available to the end 
> user as fielded search.
> In the original SOLR-1553, there's a patch implementing "user fields", but it 
> was never committed even if that issue was closed.




[jira] [Commented] (SOLR-3028) Support for additional query operators (feature parity request)

2012-02-03 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1310#comment-1310
 ] 

Hoss Man commented on SOLR-3028:



#1) you can either index both stemmed and non-stemmed in diff fields, and then 
specify the appropriate field name at query time for each input word to control 
what gets queried, or something like SOLR-2866 would be needed, along with 
additional filters to record in the terms whether they are stemmed/unstemmed 
(possible with the payload?) so it's available at query time

#2) already possible with the standard lucene syntax: "cat dog goat"~15

#3) is already possible on trunk with the surround parser (SOLR-2703) -- 
although there isn't a lot of documentation out there about the syntax...

{code}
  {!surround}(this W that) AND (other W next) 
{code}

...it seems like the only real missing piece is some query side support for 
SOLR-2866, and it seems like that would best be tracked in SOLR-2866 right? ... 
make sure everything works all the way through the system?

> Support for additional query operators (feature parity request)
> ---
>
> Key: SOLR-3028
> URL: https://issues.apache.org/jira/browse/SOLR-3028
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Affects Versions: 4.0
>Reporter: Mike
>  Labels: operator, queryparser
>   Original Estimate: 6h
>  Remaining Estimate: 6h
>
> I'm migrating my system from Sphinx Search, and there are a couple of 
> operators that are not available to Solr, which are available in Sphinx. 
> I would love to see the following added to the Dismax parser:
> 1. Exact match. This might be tricky to get right, since it requires work on 
> the index side as well[1], but in Sphinx, you can do a query such as [ 
> =running walking ], and running will have stemming off, while walking will 
> have it on. 
> 2. Term quorum. In Sphinx and some commercial search engines (like Recommind, 
> Westlaw and Lexis), you can do a search such as [ (cat dog goat)/15 ], and 
> find the three words within 15 terms of each other. I think this is possible 
> in the backend via the span query, but there's no front end option for it, so 
> it's quite hard to reveal to users.
> 3. Word order. Being able to say, "this term before that one, and this other 
> term before the next" is something else in Sphinx that span queries support, 
> but is missing in the query parser. Would be great to get this in too.
> These seem like the three biggest missing operators in Solr to me. I would 
> love to help move these forward if there is any way I can help.
> [1] At least, *I* think it does. There's some discussion of one way of doing 
> exact match like support in SOLR-2866.




[jira] [Commented] (SOLR-3093) Remove unused features and

2012-02-03 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13199930#comment-13199930
 ] 

Hoss Man commented on SOLR-3093:


I think yonik's point is that unlike things in SOLR-1052 where existing users 
would have a reasonable expectation that the syntax would definitively do 
something (ie: use specific classes/settings), the config in this issue was 
_always_ just an optimization hint, and the system ultimately works fine even 
if/when it is ignored.

Personally i think that in these cases, it would be sufficient to WARN that 
these optimization hints are no longer used and being ignored so people can 
clean up if/when they want, but since they don't *have* to change anything to 
have a working solr instance (that still externally behaves the way it would in 
older versions of solr) there's no reason to FAIL and annoy them.



> Remove unused features  and 
> ---
>
> Key: SOLR-3093
> URL: https://issues.apache.org/jira/browse/SOLR-3093
> Project: Solr
>  Issue Type: Improvement
>Reporter: Jan Høydahl
> Fix For: 3.6, 4.0
>
>
> SolrConfig.java still tries to parse 
> But the only user of this param was SolrIndexSearcher.java line 366-381 which 
> is commented out.
> Probably the whole logic should be ripped out, and we fail hard if we find 
> this config option in solrconfig.xml
> Also, the  config option is old and no longer used or needed? 
> There is some code which tries to use it but I believe that since 1.4 there 
> are more efficient ways to do the same. Should we also fail-fast if found in 
> config or only print a warning?




[jira] [Commented] (SOLR-3047) DisMaxQParserPlugin drops my field in the phrase field list (pf) if it uses KeywordTokenizer instead of StandardTokenizer or Whitespace

2012-02-02 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13199076#comment-13199076
 ] 

Hoss Man commented on SOLR-3047:


I can't make heads or tails of this bug report ... at a minimum we need to 
see...

* what the full request params look like for an example request
* what the debugQuery output looks like for an example request (including the 
echoParams and query parsing info)
* how the requesthandler in use is configured
* the field and fieldtype information for every field used by dismax (ie: 
mentioned in the request params or request handler defaults)

> DisMaxQParserPlugin drops my field in the phrase field list (pf) if it uses 
> KeywordTokenizer instead of StandardTokenizer or Whitespace
> ---
>
> Key: SOLR-3047
> URL: https://issues.apache.org/jira/browse/SOLR-3047
> Project: Solr
>  Issue Type: Bug
>Reporter: Antony Stubbs
>
> Has this got something to do with the minimum clause = 2 part in the code? It 
> drops it without warning - IMO it should error out if the field isn't 
> compatible.
> If it is on purpose - i don't see why. I split with the ngram token filter, 
> so there is def more than 1 clause in the indexed field.




[jira] [Commented] (SOLR-3085) Fix the dismax/edismax stopwords mm issue

2012-02-01 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13198361#comment-13198361
 ] 

Hoss Man commented on SOLR-3085:


bq. Then, one way may be to subtract the MM count accordingly, so that in our 
case above, when we detect that the DisMax clause for "the" does not contain 
"title_en", we do mm=mm-1 which will give us an MM of 1 instead of 2 and we'll 
get hits. This is probably the easiest solution.

that wouldn't make any sense ... in your example that would result in the query 
matching every doc containing "alltags:the" (or "title_en:contract", or 
"alltags:contract") which hardly seems like what the user is likely to expect 
if they used mm=100% (with or w/o a "mm.sw=false" param)

bq. Another way would be to keep mm as is, and move the affected clause out of 
the BooleanQuery and add it as a BoostQuery instead?

something like that might work .. but i haven't thought it through very hard 
... i have a nagging feeling that there are non-stopword cases that would be 
indistinguishable (to the parser) from this type of stopword case, and thus 
would also trigger this logic undesirably, but i can't articulate what they 
might be off the top of my head.

> Fix the dismax/edismax stopwords mm issue
> -
>
> Key: SOLR-3085
> URL: https://issues.apache.org/jira/browse/SOLR-3085
> Project: Solr
>  Issue Type: Bug
>  Components: search
>Reporter: Jan Høydahl
>  Labels: MinimumShouldMatch, dismax, stopwords
> Fix For: 3.6, 4.0
>
>
> As discussed here http://search-lucene.com/m/Wr7iz1a95jx and here 
> http://search-lucene.com/m/Yne042qEyCq1 and here 
> http://search-lucene.com/m/RfAp82nSsla DisMax has an issue with stopwords if 
> not all fields used in QF have exactly the same stopword lists.
> Typical solutions are to not use stopwords, to harmonize stopword lists across 
> all fields in your QF, or to relax MM to a lower percentage. Sometimes these 
> are not acceptable workarounds, and we should find a better solution.




[jira] [Commented] (SOLR-2368) Improve extended dismax (edismax) parser

2012-02-01 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13198340#comment-13198340
 ] 

Hoss Man commented on SOLR-2368:


bq. are there any blockers left for retiring the old dismax parser?

As i've mentioned before, I don't think DismaxQParser should ever be retired 
... i'm still not convinced that the (default) parser you get when using 
"defType=dismax" should change to ExtendedDismaxQParser instance

My three main reasons for (still) feeling this way are:
* I see no advantage to changing what QParser you get (by default) when asking 
for "dismax" ... not when it's so easy for new users (or old users who want to 
switch) to just use "edismax" by name. (or explicitly declare their own 
instance of ExtendedDismaxQParser with the name "dismax" if that's what they 
always want)
* ExtendedDismaxQParser is a significantly more complex beast than 
DismaxQParser, and likely to have a lot of little quirks (and bugs) that no one 
has really noticed yet.  For people who are happy with DismaxQParser, we should 
leave well enough alone.
* Even with things like SOLR-3026 allowing you to disable field-specific 
queries, ExtendedDismaxQParser still supports more types of queries/syntax than 
DismaxQParser (ie: fuzzy queries, prefix queries, wildcard queries, etc...) 
which may have performance impacts on existing dismax users, many of whom 
probably don't want to start allowing them from their users -- particularly 
considering that limited syntax w/o metacharacters was a major advertised 
advantage of using dismax from day 1.

Please note: i have no tangible objection to smiley's suggestion that...

bq. defType should default to ... [edismax] in Solr 4

...if folks think that the ExtendedDismaxQParser would make a better default 
than the LuceneQParser moving forward, i've got no objection to that -- but if 
someone explicitly asks for "defType=dismax" by name, that should be the 
DismaxQParser (and its limited syntax) ... ExtendedDismaxQParser is a 
completely different animal.  

saying defType=dismax should return an ExtendedDismaxQParser makes as much 
sense to me as saying that defType=lucene should return an 
ExtendedDismaxQParser -- just because the legal syntax of edismax is a super 
set of dismax/lucene doesn't mean they are equivalent or that we should assume 
"it's better" for people who ask for a specific QParser by name.

> Improve extended dismax (edismax) parser
> 
>
> Key: SOLR-2368
> URL: https://issues.apache.org/jira/browse/SOLR-2368
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Reporter: Yonik Seeley
>  Labels: QueryParser
>
> This is a "mother" issue to track further improvements for eDismax parser.
> The goal is to be able to deprecate and remove the old dismax once edismax 
> satisfies all usecases of dismax.




[jira] [Commented] (SOLR-3085) Fix the dismax/edismax stopwords mm issue

2012-02-01 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13198104#comment-13198104
 ] 

Hoss Man commented on SOLR-3085:


bq. So we get a required DisMax Query for alltags:the which does not match any 
docs. 

I think you are misreading that output...

{code}
+( ( DisjunctionMaxQuery((alltags:the)~0.01) 
 DisjunctionMaxQuery((title_en:contract | alltags:contract)~0.01)
   )~2
 )
{code}

The "DisjunctionMaxQuery((alltags:the)~0.01)" clause is not required in that 
query.  it is one of two SHOULD clauses in a boolean query, and becomes subject 
to the same "mm" rule.  both clauses in that BooleanQuery are already SHOULD 
clauses, so i don't know what it would mean to make them more "optional".



> Fix the dismax/edismax stopwords mm issue
> -
>
> Key: SOLR-3085
> URL: https://issues.apache.org/jira/browse/SOLR-3085
> Project: Solr
>  Issue Type: Bug
>  Components: search
>Reporter: Jan Høydahl
>  Labels: MinimumShouldMatch, dismax, stopwords
> Fix For: 3.6, 4.0
>
>
> As discussed here http://search-lucene.com/m/Wr7iz1a95jx and here 
> http://search-lucene.com/m/Yne042qEyCq1 and here 
> http://search-lucene.com/m/RfAp82nSsla DisMax has an issue with stopwords if 
> not all fields used in QF have exactly the same stopword lists.
> Typical solutions are to not use stopwords, to harmonize stopword lists across 
> all fields in your QF, or to relax MM to a lower percentage. Sometimes these 
> are not acceptable workarounds, and we should find a better solution.




[jira] [Commented] (SOLR-3033) "numberToKeep" on replication handler does not work with "backupAfter"

2012-02-01 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13198076#comment-13198076
 ] 

Hoss Man commented on SOLR-3033:


bq. Without a way to declare a default in solrconfig.xml, the user has no way 
to use this parameter should a backup be triggered by "backupAfter". 

Right -- my point is we already have a convention for specifying "default 
params for a request handler", but your patch doesn't use that convention.

bq. We don't have a <defaults> section for request parameters, do we?

Any request handler that subclasses RequestHandlerBase automatically gets 
defaults applied when handleRequest is called if they are specified in the 
configs (the syntax isn't "<defaults>", it's "<lst name="defaults">".)
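
For reference, the conventional shape of such a section looks like this
(illustrative values only):

```xml
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="defaults">
    <str name="numberToKeep">2</str>
  </lst>
</requestHandler>
```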

bq. If we kept it as a request-param only, but then let the user specify 
defaults, would that create a legal <defaults> and <invariants> section 
nested within <master> and <slave>, so users can specify defaults for each?

I'm not sure that would really make sense... what if an instance was acting as 
a repeater, so it's both a master and a slave?  If you told it to create a 
backup, how many would it keep if there were different values specified in the 
master/slave sections?

I think maybe you've hit the nail on the head here...

{quote}
And looking at the available request parameters, we probably wouldn't want 
defaults for any of them 
...
This makes me wonder if my first try was a mistake. Possibly this should only 
be an init-param.
{quote}

So perhaps the way forward is...

* keep the "numberToKeep" request param around for backcompat with Solr 3.5 for 
people who want to manually specify it when triggering command=backup
* add a new init param for ReplicationHandler to specify how many backups to 
keep when backups are made -- the name for this new param should probably _not_ 
be numberToKeep (suggestion: "maxNumberOfBackups") because:
** we need a name that clarifies it's specific to backups
** we want a name that is distinct from the request param so in docs it's clear 
which one is being referred to
* document clearly the interaction between the maxNumberOfBackups init param 
and the numberToKeep request param (suggestion: "the numberToKeep request param 
can be used with the backup command unless the maxNumberOfBackups init param 
has been specified on the handler -- in which case maxNumberOfBackups is always 
used and attempts to use the numberToKeep request param will cause an error")
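
In solrconfig.xml terms the proposal might look something like this (a sketch
only -- "maxNumberOfBackups" is just the suggested name above, not an existing
option):

```xml
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <!-- proposed init param (name not final): cap on the number of backups kept -->
  <str name="maxNumberOfBackups">3</str>
  <lst name="master">
    <str name="backupAfter">optimize</str>
  </lst>
</requestHandler>
```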

what do you think?

> "numberToKeep" on replication handler does not work with "backupAfter"
> --
>
> Key: SOLR-3033
> URL: https://issues.apache.org/jira/browse/SOLR-3033
> Project: Solr
>  Issue Type: Bug
>  Components: replication (java)
>Affects Versions: 3.5
> Environment: openjdk 1.6, linux 3.x
>Reporter: Torsten Krah
> Attachments: SOLR-3033.patch
>
>
> Configured my replication handler like this:
> <requestHandler name="/replication" class="solr.ReplicationHandler">
>   <lst name="master">
>     <str name="replicateAfter">startup</str>
>     <str name="replicateAfter">commit</str>
>     <str name="replicateAfter">optimize</str>
>     <str 
> name="confFiles">elevate.xml,schema.xml,spellings.txt,stopwords.txt,stopwords_de.txt,stopwords_en.txt,synonyms_de.txt,synonyms.txt</str>
>     <str name="backupAfter">optimize</str>
>     <str name="numberToKeep">1</str>
>   </lst>
> </requestHandler>
>
> So after optimize a snapshot should be taken, and this works. But numberToKeep is 
> ignored: snapshots increase with each call to optimize and are kept 
> forever. It seems this setting has no effect.




[jira] [Commented] (SOLR-1052) Deprecate/Remove <indexDefaults> in favor of <mainIndex> in solrconfig.xml

2012-01-31 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197487#comment-13197487
 ] 

Hoss Man commented on SOLR-1052:


bq. This is how config deprecations should work in my opinion. No need to 
advertise to new users the use of a syntax that we want to go away. It would be 
more confusing for the C) people to see deprecation warnings being printed OOTB 
from their brand new search engine without knowing how to fix it

+1

Patch looks good to me.  My one suggestion, since there seems to be consensus 
that Solr should complain louder when there are config errors: instead 
of removing the existing "warn" calls on the already deprecated Solr 1.x 
legacy config syntax, why not leave in those checks but replace the "warn(...)" 
calls with "throw new SolrException(...)"?


> Deprecate/Remove <indexDefaults> in favor of <mainIndex> in solrconfig.xml
> --
>
> Key: SOLR-1052
> URL: https://issues.apache.org/jira/browse/SOLR-1052
> Project: Solr
>  Issue Type: Improvement
>Reporter: Grant Ingersoll
>Assignee: Jan Høydahl
>Priority: Minor
>  Labels: solrconfig.xml
> Fix For: 3.6, 4.0
>
> Attachments: SOLR-1052-3x.patch
>
>
> Given that we now handle multiple cores via the solr.xml and the discussion 
> around <indexDefaults> and <mainIndex> at 
> http://www.lucidimagination.com/search/p:solr?q=mainIndex+vs.+indexDefaults
> We should deprecate/remove the use of indexDefaults and just rely on 
> mainIndex, as it doesn't seem to serve any purpose and is confusing to 
> explain.




[jira] [Commented] (SOLR-3033) numberToKeep on replication handler does not work - snapshots are increasing beyond configured maximum

2012-01-31 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197387#comment-13197387
 ] 

Hoss Man commented on SOLR-3033:


James: two things to think about.

1) when adding new test configs, try to keep them as minimal as possible, so 
the only things in them are things that *have* to be there for the purposes of 
the test. 

2) there are really two types of "params" when dealing with request handlers -- 
init params (ie: things in the body of the requestHandler tag in 
solrconfig.xml) and request params (things passed to the handler when it is 
executed).  Via RequestHandlerBase, many request handlers support the idea of 
_init_ params named "defaults", "invariants" and "appends" which can contain 
sub-params that are consulted when parsing/processing _request_ params in 
handleRequest.

In the case of "numberToKeep", this is already a _request_ param, and 
ReplicationHandler already subclasses RequestHandlerBase, which means people can 
define a "defaults" section in their ReplicationHandler config so any requests 
to "http://master_host:port/solr/replication?command=backup" get that value 
automatically.  But your patch seems to add support for an _init_ param with the 
same name, which raises questions like "what happens if I specify different 
values for numberToKeep in init params and in invariant params?"
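
To illustrate the ambiguity in question (a hypothetical config, not something
anyone should deploy): if both sections were allowed to define it, which value
would win?

```xml
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="defaults">
    <str name="numberToKeep">5</str>
  </lst>
  <lst name="invariants">
    <str name="numberToKeep">1</str>
  </lst>
</requestHandler>
```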

It seems like the crux of the problem is that if you use the "backupAfter" 
option, the code path to create the backup bypasses a lot of the logic that is 
normally used when a backup command is processed via handleRequest.  So 
instead of adding an init param version of numberToKeep, perhaps it would be 
better if the "backupAfter" codepath followed the same code path as 
handleRequest as much as possible?  Perhaps it could be something as 
straightforward as changing the meat of getEventListener to look like...

{code}
SolrQueryRequest req = new LocalSolrQueryRequest(core ...);
try {
  ReplicationHandler.this.handleRequest(req, new SolrQueryResponse());
} finally {
  req.close();
}
{code}

what do you think?

> numberToKeep on replication handler does not work - snapshots are increasing 
> beyond configured maximum
> --
>
> Key: SOLR-3033
> URL: https://issues.apache.org/jira/browse/SOLR-3033
> Project: Solr
>  Issue Type: Bug
>  Components: replication (java)
>Affects Versions: 3.5
> Environment: openjdk 1.6, linux 3.x
>Reporter: Torsten Krah
> Attachments: SOLR-3033.patch
>
>
> Configured my replication handler like this:
> <requestHandler name="/replication" class="solr.ReplicationHandler">
>   <lst name="master">
>     <str name="replicateAfter">startup</str>
>     <str name="replicateAfter">commit</str>
>     <str name="replicateAfter">optimize</str>
>     <str 
> name="confFiles">elevate.xml,schema.xml,spellings.txt,stopwords.txt,stopwords_de.txt,stopwords_en.txt,synonyms_de.txt,synonyms.txt</str>
>     <str name="backupAfter">optimize</str>
>     <str name="numberToKeep">1</str>
>   </lst>
> </requestHandler>
>
> So after optimize a snapshot should be taken, and this works. But numberToKeep is 
> ignored: snapshots increase with each call to optimize and are kept 
> forever. It seems this setting has no effect.




[jira] [Commented] (SOLR-3060) add highlighter support to SurroundQParserPlugin

2012-01-31 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197168#comment-13197168
 ] 

Hoss Man commented on SOLR-3060:


This patch is straight forward and includes tests (thank you so much for the 
tests).  

The meat of the change is that getHighlightQuery is overridden to attempt query 
rewriting, which gives me two concerns...

1) At a minimum I'm pretty sure super.getHighlightQuery still needs to be 
called.

2) Is this rewriting of the query done in the SurroundQParser going to cause 
any problems or unexpected behavior in conjunction with the highlighter 
component logic that already decides if/when to rewrite the query?

If the crux of the problem is that HighlightComponent rewrites the query 
automatically _except_ when using the phrase highlighter with the multi-term 
option (assuming I'm reading the code correctly), then shouldn't that code path 
of the highlighter be modified to do something sane with any type of Query 
object? ... why isn't it responsible for calling rewrite on any sub-query of a 
type it doesn't understand?

(Highlighting is one of the areas of Lucene/Solr that frequently makes my head 
hurt, so forgive me if these are silly questions)

> add highlighter support to  SurroundQParserPlugin
> -
>
> Key: SOLR-3060
> URL: https://issues.apache.org/jira/browse/SOLR-3060
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Affects Versions: 4.0
>Reporter: Ahmet Arslan
>Priority: Minor
> Fix For: 4.0
>
> Attachments: SOLR-3060.patch, SOLR-3060.patch
>
>
> Highlighter does not recognize SrndQuery family.
> http://search-lucene.com/m/FuDsU1sTjgM
> http://search-lucene.com/m/wD8c11gNTb61




[jira] [Commented] (SOLR-3034) replicateAfter optimize not working

2012-01-31 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197119#comment-13197119
 ] 

Hoss Man commented on SOLR-3034:


FWIW: replicateAfter=commit is a superset of replicateAfter=optimize, because 
every optimize command is also a commit command ... if you are replicating 
after every commit then you are automatically replicating after every optimize. 
 That's the reason the UI only shows "commit, startup"; it's "deduping" the 
list.

However: that wouldn't explain why you aren't seeing replication happen after 
an optimize.  Are you seeing replication happen at all?

I just tried a quick sanity check using trunk r1237878 with the example 
modified to act as a master with replicateAfter commit & optimize, and I was 
definitely seeing http://localhost:8983/solr/replication?command=indexversion 
return a new indexversion after every commit or optimize.

When I changed the config to *only* use replicateAfter=optimize, then 
indexversion would return a new version after every optimize command, but not 
after every commit.

...so things are working exactly as expected on the master side from what I can 
see.




> replicateAfter optimize not working
> ---
>
> Key: SOLR-3034
> URL: https://issues.apache.org/jira/browse/SOLR-3034
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.0
>Reporter: Antony Stubbs
>
> I have:
> {noformat}
>   optimize
>   commit
>   startup
> {noformat}
> But the UI only shows:
> {noformat}
> replicateAfter:commit, startup
> {noformat}
> And sure enough, optimizing does not cause a replication to happen. 
> Also, replicating an optimized index does not seem to keep it "optimized" on 
> the slave. Is that really the case, or is it a bug? I would expect that if an 
> index is optimized on the master, when it is then replicated to slaves, the 
> slaves would receive the optimized index.



