[jira] [Commented] (LUCENE-3982) regex support in queryparser needs documented, and called out in CHANGES.txt
[ https://issues.apache.org/jira/browse/LUCENE-3982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253908#comment-13253908 ] Hoss Man commented on LUCENE-3982: -- Note: set to blocker so we don't release 4.0 with this change in syntax w/o documenting it > regex support in queryparser needs documented, and called out in CHANGES.txt > > > Key: LUCENE-3982 > URL: https://issues.apache.org/jira/browse/LUCENE-3982 > Project: Lucene - Java > Issue Type: Sub-task > Components: core/queryparser >Reporter: Hoss Man >Priority: Blocker > Fix For: 4.0 > > > Spun off of LUCENE-2604 where everyone agreed this needed to be done, but no one > has done it yet, and rmuir didn't want to leave the issue open... > {quote} > some issues were pointed out in a recent mailing list thread that definitely > seem like they should be addressed before this is officially released... > * queryparsersyntax.xml doesn't mention this feature at all -- as major new > syntax it should really get its own section with an example showing the > syntax > * queryparsersyntax.xml's section on "Escaping Special Characters" needs to > mention that '/' is a special character > Also: Given that Yury encountered some real world situations in which the new > syntax caused problems with existing queries, it seems like we should > definitely make a note about this possibility more prominent ... i'm not > sure if it makes sense in MIGRATE.txt but at a minimum it seems like the > existing CHANGES.txt entry should mention it, maybe something like... > {noformat} > * LUCENE-2604: Added RegexpQuery support to QueryParser. Regular expressions > are now directly supported by the standard queryparser using the syntax... > fieldName:/expression/ OR /expression against default field/ > Users who wish to search for literal "/" characters are advised to > backslash-escape or quote those characters as needed. 
> (Simon Willnauer, Robert Muir) > {noformat} > {quote} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
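The escaping advice in the proposed CHANGES.txt entry can be sketched in a few lines. This is a minimal illustration (the class and method names here are made up, not part of Lucene): with the new syntax, fieldName:/expr/ is a regexp query, so a literal '/' in a term has to be backslash-escaped or the term quoted.

```java
// Illustrative helper showing the '/' escaping the new queryparser syntax
// requires. Lucene's query parser also has a general escape() method that
// covers all special characters; this sketch handles only '/' for focus.
public class QueryEscapeDemo {
    // Backslash-escape every '/' so it is read as a literal, not as the
    // start/end of a regexp term.
    static String escapeSlashes(String term) {
        return term.replace("/", "\\/");
    }

    public static void main(String[] args) {
        System.out.println(escapeSlashes("path/to/file")); // path\/to\/file
        // An intentional regexp is left untouched by the user:
        System.out.println("body:/http.*/");
    }
}
```

A user searching for the literal term `path/to/file` would thus send `path\/to\/file`, while `body:/http.*/` stays a regexp query.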
[jira] [Commented] (SOLR-3330) Show changes in plugin statistics across multiple requests
[ https://issues.apache.org/jira/browse/SOLR-3330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253754#comment-13253754 ] Hoss Man commented on SOLR-3330: Sorry .. i was looking at the solr logging (SolrCore.execute) ... because it's using "Content-Type: application/x-www-form-urlencoded" and the stream.body param, it's all being included in the list of SolrParams that get logged. So my concern about extra long URLs breaking is a non-issue, but it's still kind of noisy as far as solr logging goes. If it was changed to use "Content-Type: application/xml" and send the xml directly then it wouldn't be counted as a solr param, but the handler would still get it as a ContentStream. --- as for how it looks: in my initial impression i didn't realize that it was recording values for all the categories of plugins (ie: i was looking at "Query Handlers" and didn't notice the little grey numbers indicating that "Caches" also had some changes) ... the #BBA500 color used to make the plugin names with changes stand out is great (even if you remove the "new" icon completely) so maybe just using that same color on the category names (or at least the little numbers indicating that items in that category have changed) would be helpful to draw attention to them? > Show changes in plugin statistics across multiple requests > -- > > Key: SOLR-3330 > URL: https://issues.apache.org/jira/browse/SOLR-3330 > Project: Solr > Issue Type: New Feature > Components: web gui >Reporter: Ryan McKinley > Fix For: 4.0 > > Attachments: SOLR-3330-pluggins-diff.patch, > SOLR-3330-pluggins-diff.patch, SOLR-3330-plugins.png, > SOLR-3330-record-changes-ui.patch, SOLR-3330-record-changes-ui.patch > > > When debugging configuration and performance, I often: > 1. Look at stats values > 2. run some queries > 3. See how the stats values changed > This is fine, but is often a bit clunky and you have to really know what you > are looking for to see any changes. 
> It would be great if the 'plugins' page had a button that would make this > easier
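A hedged sketch of what the suggested change would look like on the wire: posting the payload as a raw application/xml body (rather than a stream.body parameter) keeps it out of the logged SolrParams while still reaching the handler as a ContentStream. The endpoint path and XML payload below are placeholders, not the actual patch.

```java
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class RawXmlPost {
    // Encode the payload once so Content-Length can be set exactly.
    static byte[] encode(String xml) {
        return xml.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<stats><entry name=\"queryHandlers\"/></stats>"; // placeholder payload
        byte[] body = encode(xml);
        HttpURLConnection con = (HttpURLConnection)
                new URL("http://localhost:8983/solr/admin/plugins").openConnection();
        con.setRequestMethod("POST");
        // The key difference: application/xml instead of x-www-form-urlencoded,
        // so the body is not parsed into (and logged as) request parameters.
        con.setRequestProperty("Content-Type", "application/xml");
        con.setFixedLengthStreamingMode(body.length);
        con.setDoOutput(true);
        try (OutputStream out = con.getOutputStream()) {
            out.write(body);
        }
        System.out.println("HTTP " + con.getResponseCode());
    }
}
```

Running main obviously assumes a Solr instance listening on port 8983; the point is only the Content-Type switch.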
[jira] [Commented] (SOLR-2605) CoreAdminHandler, different Output while 'defaultCoreName' is specified
[ https://issues.apache.org/jira/browse/SOLR-2605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253736#comment-13253736 ] Hoss Man commented on SOLR-2605: Stefan: are you sure you had a clean build with my patch applied? when i run... {noformat} java -DzkRun -Dcollection.configName=myconf -Dbootstrap_confdir=./solr/conf -Dsolr.environment=dev -Duser.timezone=UTC -DhostPort=8983 -Djetty.port=8983 -jar start.jar {noformat} I get... {noformat} hossman@bester:~/lucene/dev/solr$ curl "http://localhost:8983/solr/zookeeper?detail=true&path=%2Fclusterstate.json"; {"znode":{ "path":"/clusterstate.json","prop":{ "version":5, "aversion":0, "children_count":0, "ctime":"Fri Apr 13 20:27:46 UTC 2012 (1334348866331)", "cversion":0, "czxid":12, "dataLength":290, "ephemeralOwner":0, "mtime":"Fri Apr 13 20:45:41 UTC 2012 (1334349941866)", "mzxid":207, "pzxid":12}, "data":"{\"collection1\":{\"shard1\":{\"bester:8983_solr_collection1\":{\n \"shard\":\"shard1\",\n\"leader\":\"true\",\n \"state\":\"active\",\n\"core\":\"collection1\",\n \"collection\":\"collection1\",\n\"node_name\":\"bester:8983_solr\",\n \"base_url\":\"http://bester:8983/solr\""},"tree":[{"data":{ "title":"/clusterstate.json","attr":{ "href":"zookeeper?detail=true&path=%2Fclusterstate.json"}}}]} {noformat} > CoreAdminHandler, different Output while 'defaultCoreName' is specified > --- > > Key: SOLR-2605 > URL: https://issues.apache.org/jira/browse/SOLR-2605 > Project: Solr > Issue Type: Improvement > Components: web gui >Reporter: Stefan Matheis (steffkes) >Priority: Minor > Fix For: 4.0 > > Attachments: SOLR-2399-admin-cores-default.xml, > SOLR-2399-admin-cores.xml, SOLR-2605.patch, SOLR-2605.patch > > > The attached XML-Files show the little difference between a defined > {{defaultCoreName}}-Attribute and a non-existing one. > Actually the new admin ui checks for a core with empty name to set single- / > multicore-settings .. 
it's a quick change to count the number of defined > cores instead. > But, will it be possible to get the core-name (again)? Either of the two > attributes would be enough, if that makes a difference :)
[jira] [Commented] (SOLR-3330) Show changes in plugin statistics across multiple requests
[ https://issues.apache.org/jira/browse/SOLR-3330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253703#comment-13253703 ] Hoss Man commented on SOLR-3330: can we change this to use an HTTP POST body instead of the stream.body request param? ... it's sending some really long request URLs that might not work if a servlet container is configured to limit the URL length. > Show changes in plugin statistics across multiple requests > -- > > Key: SOLR-3330 > URL: https://issues.apache.org/jira/browse/SOLR-3330 > Project: Solr > Issue Type: New Feature > Components: web gui >Reporter: Ryan McKinley > Fix For: 4.0 > > Attachments: SOLR-3330-pluggins-diff.patch, > SOLR-3330-pluggins-diff.patch, SOLR-3330-plugins.png, > SOLR-3330-record-changes-ui.patch, SOLR-3330-record-changes-ui.patch > > > When debugging configuration and performance, I often: > 1. Look at stats values > 2. run some queries > 3. See how the stats values changed > This is fine, but is often a bit clunky and you have to really know what you > are looking for to see any changes. > It would be great if the 'plugins' page had a button that would make this > easier
[jira] [Commented] (LUCENE-3978) redo how our download redirect pages work
[ https://issues.apache.org/jira/browse/LUCENE-3978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253126#comment-13253126 ] Hoss Man commented on LUCENE-3978: -- Uwe: if i'm understanding that page correctly, this would only be possible for links where: a) the link html is on our site b) we can control the html used to generate them ...which is fine for the big buttons on lucene.apache.org, and any other download links we might want to include on those CMS pages, but not for things like links from wiki.apache.org, or the URLs we include in our plain text release announcement emails (that users just cut/paste) or that we submit to any other site to promote the release. > redo how our download redirect pages work > - > > Key: LUCENE-3978 > URL: https://issues.apache.org/jira/browse/LUCENE-3978 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Hoss Man > Fix For: 4.0 > > > the download "latest" redirect pages are kind of a pain to change when we > release a new version... > http://lucene.apache.org/core/mirrors-core-latest-redir.html > http://lucene.apache.org/solr/mirrors-solr-latest-redir.html
[jira] [Commented] (SOLR-3327) Logging UI should indicate which loggers are set vs implicit
[ https://issues.apache.org/jira/browse/SOLR-3327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252954#comment-13252954 ] Hoss Man commented on SOLR-3327: bq. It should state on top that these are the JDK logging levels. If people switch logging through SLF4J it won't work i wonder if there is a way for the LoggingServlet (request handler?) to detect which SLF4J binding is in use, and spit out a warning if it's not JDK, so the UI can conditionally display that warning if it exists. > Logging UI should indicate which loggers are set vs implicit > > > Key: SOLR-3327 > URL: https://issues.apache.org/jira/browse/SOLR-3327 > Project: Solr > Issue Type: Improvement > Components: web gui >Reporter: Ryan McKinley >Priority: Trivial > Fix For: 4.0 > > Attachments: SOLR-3327.patch, logging.png > > > The new logging UI looks great! > http://localhost:8983/solr/#/~logging > It would be nice to indicate which ones are set explicitly vs implicit -- > perhaps making the line bold when set=true
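One possible shape of that detection, purely as a sketch: probe by class name for a class that the slf4j-jdk14 binding happens to ship. The class name used here is an assumption about that jar's internals, and probing by name is brittle, but it avoids a compile-time slf4j dependency in this example.

```java
public class BindingProbe {
    // Returns true if slf4j-jdk14's JDK adapter class is on the classpath;
    // other bindings (logback, log4j, ...) ship different classes, so a
    // false result suggests JDK-level changes made here would have no effect.
    static boolean looksLikeJdkBinding() {
        try {
            Class.forName("org.slf4j.impl.JDK14LoggerAdapter");
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(looksLikeJdkBinding()
                ? "JDK logging binding detected"
                : "warning: non-JDK SLF4J binding; level changes here may not apply");
    }
}
```

The UI could then show the warning string conditionally, as suggested above.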
[jira] [Commented] (LUCENE-3977) generated/duplicated javadocs are wasteful and bloat the release
[ https://issues.apache.org/jira/browse/LUCENE-3977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252944#comment-13252944 ] Hoss Man commented on LUCENE-3977: -- bq. Really if we have different modules like contrib-analyzers, why can't they link to the things they depend on (e.g. lucene-core) just like the solr javadocs do? i think the original argument in favor of having both styles was: * the all version makes it easy to see (in the left pane) all the classes that are available when people are working with the entire code base * the individual module versions, even when cross-linked with each other, make it easy to see exactly what is included in a single module (via the left pane) at this point in my life, i don't really have an opinion, as long as we include at least one copy in the bin release. bq. We can save 10MB with this patch, which nukes the 'index oh god yes, i didn't even realize we were building that useless pile of crap > generated/duplicated javadocs are wasteful and bloat the release > > > Key: LUCENE-3977 > URL: https://issues.apache.org/jira/browse/LUCENE-3977 > Project: Lucene - Java > Issue Type: Bug > Components: general/javadocs >Reporter: Robert Muir >Priority: Blocker > Fix For: 4.0 > > > Some stats for the generated javadocs of 3.6: > * 9,146 files > * 161,872 KB uncompressed > * 25MB compressed (this is responsible for nearly half of our binary release) > The fact we intentionally double our javadocs size with the 'javadocs-all' > thing > is truly wasteful and compression doesn't help at all. Just testing, i nuked > 'all' > and found: > * 4,944 files > * 81,084 KB uncompressed > * 12.8MB compressed > We need to clean this up for 4.0. We only need to ship javadocs 'one way'.
[jira] [Commented] (LUCENE-3978) redo how our download redirect pages work
[ https://issues.apache.org/jira/browse/LUCENE-3978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252940#comment-13252940 ] Hoss Man commented on LUCENE-3978: -- when we released 3.6, we ran into a few annoyances... * these pages require that you edit the template (not available in the bookmarklet) to change the 3.5.0 to 3.6.0 in the final URL * these pages were in browser caches, so they weren't seeing the changes in the javascript redirect (rmuir added some no-cache metadata headers, so hopefully this won't be a problem again) My suggestion for the future... * eliminate these templates and their mdtext pages entirely * replace them with a .htaccess redirect rule that looks like: {{/([^/]*)/(.*)-latest-redir.html /$1/$2-redir.html?3.6.0}} * update the templates for mirrors-solr-redir.mdtext and mirrors-core-redir.mdtext so that the javascript will use the query string when building the final URL ...that way whenever we release a new version, we can just tweak the .htaccess rule, and the only "html pages" that might ever show up in http or browser caches will have unique URLs per version. > redo how our download redirect pages work > - > > Key: LUCENE-3978 > URL: https://issues.apache.org/jira/browse/LUCENE-3978 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Hoss Man > Fix For: 4.0 > > > the download "latest" redirect pages are kind of a pain to change when we > release a new version... > http://lucene.apache.org/core/mirrors-core-latest-redir.html > http://lucene.apache.org/solr/mirrors-solr-latest-redir.html
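The query-string trick in that last bullet can be shown in miniature. This is only an illustration of the logic (the real page would do it in javascript, and the path template here is invented): the version arrives as the bare query string, e.g. "?3.6.0", so the cached redirect page itself never has to change between releases.

```java
public class RedirUrlSketch {
    // Splice the version taken from the query string into a mirror-path
    // template. Template placeholder syntax is made up for this sketch.
    static String buildMirrorUrl(String queryString, String template) {
        String version = queryString.startsWith("?")
                ? queryString.substring(1)
                : queryString;
        return template.replace("{version}", version);
    }

    public static void main(String[] args) {
        System.out.println(buildMirrorUrl("?3.6.0", "[preferred]/lucene/solr/{version}"));
        // -> [preferred]/lucene/solr/3.6.0
    }
}
```

Only the .htaccess rule carries the version, so per-release churn is confined to one line of server config.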
[jira] [Commented] (LUCENE-3973) Incorporate PMD / FindBugs
[ https://issues.apache.org/jira/browse/LUCENE-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252851#comment-13252851 ] Hoss Man commented on LUCENE-3973: -- bq. I believe both pmd and findbugs are on maven repos so one could use ivy to fetch them automatically. One thing less to think about. Unless you run into the same taskdef/classloader/sub-build/permgen-OOM problems we had with clover, the maven-ant-tasks, and ivy, which have prevented us from doing the same thing with them. > Incorporate PMD / FindBugs > -- > > Key: LUCENE-3973 > URL: https://issues.apache.org/jira/browse/LUCENE-3973 > Project: Lucene - Java > Issue Type: Improvement > Components: general/build >Reporter: Chris Male > > This has been touched on a few times over the years. Having static analysis > as part of our build seems like a big win. For example, we could use PMD to > look at {{System.out.println}} statements like discussed in LUCENE-3877 and > we could possibly incorporate the nocommit / @author checks as well. > There are a few things to work out as part of this: > - Should we use both PMD and FindBugs or just one of them? They look at code > from different perspectives (bytecode vs source code) and target different > issues. At the moment I'm in favour of trying both but that might be too > heavy handed for our needs. > - What checks should we use? There's no point having the analysis if it's > going to raise too many false-positives or problems we don't deem > problematic. > - How should the analysis be integrated in our build? Need to work out when > the analysis should run, how it should be incorporated in Ant and/or Maven, > what impact errors should have.
[jira] [Commented] (LUCENE-3973) Incorporate PMD / FindBugs
[ https://issues.apache.org/jira/browse/LUCENE-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252735#comment-13252735 ] Hoss Man commented on LUCENE-3973: -- bq. How should the analysis be integrated in our build? Need to work out when the analysis should run, how it should be incorporated in Ant and/or Maven, what impact errors should have. i would suggest going about it incrementally... * hook into build.xml as optional targets that can be run if you have the necessary libs installed; don't fail the build, just generate the XML report files * put the needed libs on builds.apache.org, hook it into the jenkins nightly target, and configure jenkins to display its pretty version of the xml reports so people can at least see what's going on * start adding/tweaking custom rule sets in dev-tools to eliminate rules we don't care about, add rules we want that don't exist, or change the severity of rules we think are more/less important * tweak the build.xml to fail if anything above some arbitrary severity is tripped * worry about maven > Incorporate PMD / FindBugs > -- > > Key: LUCENE-3973 > URL: https://issues.apache.org/jira/browse/LUCENE-3973 > Project: Lucene - Java > Issue Type: Improvement > Components: general/build >Reporter: Chris Male > > This has been touched on a few times over the years. Having static analysis > as part of our build seems like a big win. For example, we could use PMD to > look at {{System.out.println}} statements like discussed in LUCENE-3877 and > we could possibly incorporate the nocommit / @author checks as well. > There are a few things to work out as part of this: > - Should we use both PMD and FindBugs or just one of them? They look at code > from different perspectives (bytecode vs source code) and target different > issues. At the moment I'm in favour of trying both but that might be too > heavy handed for our needs. > - What checks should we use? 
There's no point having the analysis if it's > going to raise too many false-positives or problems we don't deem > problematic. > - How should the analysis be integrated in our build? Need to work out when > the analysis should run, how it should be incorporated in Ant and/or Maven, > what impact errors should have.
[jira] [Commented] (SOLR-3076) Solr should support block joins
[ https://issues.apache.org/jira/browse/SOLR-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250910#comment-13250910 ] Hoss Man commented on SOLR-3076: As i said before... bq. ...perhaps we should focus on the more user explicit, direct mapping type QParser type approach Mikhail has already started on for now, and consider that (_schema driven implicit block joining_) as an enhancement later? (especially since it's not clear how the indexing side will be managed/enforced...) what Mikhail's fleshed out here seems like a good starting point for users who are willing to deal with this at the "low" level (similar in expertness to the "raw" QParser), and would be usable *today* for people who take responsibility for indexing the blocks themselves. if/when we decide to drive the indexing side, we can think about if/where/how to automagically hook blockjoin queries into "higher" level parsers like LuceneQParser, DismaxQueryParser > Solr should support block joins > --- > > Key: SOLR-3076 > URL: https://issues.apache.org/jira/browse/SOLR-3076 > Project: Solr > Issue Type: New Feature >Reporter: Grant Ingersoll > Attachments: SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, > SOLR-3076.patch, SOLR-3076.patch, bjq-vs-filters-backward-disi.patch, > bjq-vs-filters-illegal-state.patch, child-bjqparser.patch, > parent-bjq-qparser.patch, parent-bjq-qparser.patch, > solrconf-bjq-erschema-snippet.xml, tochild-bjq-filtered-search-fix.patch > > > Lucene has the ability to do block joins, we should add it to Solr.
[jira] [Commented] (SOLR-3335) testDistribSearch failure
[ https://issues.apache.org/jira/browse/SOLR-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250196#comment-13250196 ] Hoss Man commented on SOLR-3335: ignoring the seed, and just trying the test with "-Dtests.nightly=true" i've only seen this test pass once (and i might have had a typo in that nightly param -- it was the first time i tried it and i didn't have a shell log). Unless i'm missing something... * BaseDistributedSearchTestCase.createServers initializes the following pairwise... ** protected List jettys ** protected List clients * TestDistributedSearch.doTest then... ** copies those lists into local upJettys and upClients instances and maintains a list of "upShards" ** iteratively shuts down some number of jetty instances, removing from upJettys, upShards, and upClients ** passes upShards and upClients to queryPartialResults * TestDistributedSearch.queryPartialResults ... ** does some random querying of upShards and upClients ** if stress is non-zero (which it is if it's nightly) then it also spins up a bunch of threads using a client from the original "clients" list ...which seems fundamentally flawed to me ... because each "client" knows about a specific jetty instance, and the test has explicitly shut down some jetty instances. Is this just a typo? are the refs to "clients" in queryPartialResults all just supposed to be "upClients"? > testDistribSearch failure > - > > Key: SOLR-3335 > URL: https://issues.apache.org/jira/browse/SOLR-3335 > Project: Solr > Issue Type: Bug >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Trivial > Fix For: 4.0 > > > Happened on my test machine. Is there a way to disable these tests if we > cannot fix them? There are two or three tests that fail most of the time and > that apparently nobody knows how to fix (including me). > There is also a typo in the error message (I'm away from home for Easter, > can't do it now). 
> {noformat} > build 06-Apr-2012 16:11:54[junit] Testsuite: > org.apache.solr.cloud.RecoveryZkTest > build 06-Apr-2012 16:11:54[junit] Testcase: > testDistribSearch(org.apache.solr.cloud.RecoveryZkTest): FAILED > build 06-Apr-2012 16:11:54[junit] There are still nodes recoverying > build 06-Apr-2012 16:11:54[junit] > junit.framework.AssertionFailedError: There are still nodes recoverying > build 06-Apr-2012 16:11:54[junit] at > org.junit.Assert.fail(Assert.java:93) > build 06-Apr-2012 16:11:54[junit] at > org.apache.solr.cloud.AbstractDistributedZkTestCase.waitForRecoveriesToFinish(AbstractDistributedZkTestCase.java:132) > build 06-Apr-2012 16:11:54[junit] at > org.apache.solr.cloud.RecoveryZkTest.doTest(RecoveryZkTest.java:84) > build 06-Apr-2012 16:11:54[junit] at > org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:670) > build 06-Apr-2012 16:11:54[junit] at > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > build 06-Apr-2012 16:11:54[junit] at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > build 06-Apr-2012 16:11:54[junit] at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > build 06-Apr-2012 16:11:54[junit] at > java.lang.reflect.Method.invoke(Method.java:597) > build 06-Apr-2012 16:11:54[junit] at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45) > build 06-Apr-2012 16:11:54[junit] at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) > build 06-Apr-2012 16:11:54[junit] at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42) > build 06-Apr-2012 16:11:54[junit] at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20) > build 06-Apr-2012 16:11:54[junit] at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) > build 06-Apr-2012 16:11:54[junit] at > 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30) > build 06-Apr-2012 16:11:54[junit] at > org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:63) > build 06-Apr-2012 16:11:54[junit] at > org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:754) > build 06-Apr-2012 16:11:54[junit] at > org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:670) > build 06-Apr-2012 16:11:54[junit] at > org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(Syst
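The suspected flaw above distills to a list-aliasing problem. A toy sketch (all names invented; this is not the test's actual code) of why stress threads drawing from the original list can hit a dead node:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StaleClientsSketch {
    // Mimics the shutdown step: the dead jetty's client is removed from the
    // "up" copy, but the original clients list (which the stress threads
    // allegedly draw from) is left untouched.
    static boolean stressCanPickDeadNode(List<String> clients, List<String> up, String dead) {
        up.remove(dead);
        return clients.contains(dead);
    }

    public static void main(String[] args) {
        List<String> clients = new ArrayList<>(Arrays.asList("jetty0", "jetty1", "jetty2"));
        List<String> up = new ArrayList<>(clients); // copy, as in doTest
        System.out.println(stressCanPickDeadNode(clients, up, "jetty1")); // true
    }
}
```

If the refs really are a typo, switching the stress path to the "up" copy would make the dead node unreachable by construction.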
[jira] [Commented] (SOLR-3335) testDistribSearch failure
[ https://issues.apache.org/jira/browse/SOLR-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250177#comment-13250177 ] Hoss Man commented on SOLR-3335: * nightly builds seem to fail almost every time * test only builds seem to pass almost every time ...are folks remembering to use "-Dtests.nightly=true" when trying to reproduce this? I tried the reproduce line from nightly build #1819 and got the same ConnectException as jenkins three times in a row... {noformat} hossman@bester:~/lucene/dev/solr$ ant test -Dtestcase=TestDistributedSearch -Dtestmethod=testDistribSearch -Dtests.seed=-64cffe89df6d3a71:-2543436b41d480f3:21aa64ce023d4a8a -Dtests.nightly=true -Dargs="-Dfile.encoding=ISO8859-1" {noformat} > testDistribSearch failure > - > > Key: SOLR-3335 > URL: https://issues.apache.org/jira/browse/SOLR-3335 > Project: Solr > Issue Type: Bug >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Trivial > Fix For: 4.0 > > > Happened on my test machine. Is there a way to disable these tests if we > cannot fix them? There are two or three tests that fail most of the time and > that apparently nobody knows how to fix (including me). > There is also a typo in the error message (I'm away from home for Easter, > can't do it now). 
> {noformat} > build 06-Apr-2012 16:11:54[junit] Testsuite: > org.apache.solr.cloud.RecoveryZkTest > build 06-Apr-2012 16:11:54[junit] Testcase: > testDistribSearch(org.apache.solr.cloud.RecoveryZkTest): FAILED > build 06-Apr-2012 16:11:54[junit] There are still nodes recoverying > build 06-Apr-2012 16:11:54[junit] > junit.framework.AssertionFailedError: There are still nodes recoverying > build 06-Apr-2012 16:11:54[junit] at > org.junit.Assert.fail(Assert.java:93) > build 06-Apr-2012 16:11:54[junit] at > org.apache.solr.cloud.AbstractDistributedZkTestCase.waitForRecoveriesToFinish(AbstractDistributedZkTestCase.java:132) > build 06-Apr-2012 16:11:54[junit] at > org.apache.solr.cloud.RecoveryZkTest.doTest(RecoveryZkTest.java:84) > build 06-Apr-2012 16:11:54[junit] at > org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:670) > build 06-Apr-2012 16:11:54[junit] at > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > build 06-Apr-2012 16:11:54[junit] at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > build 06-Apr-2012 16:11:54[junit] at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > build 06-Apr-2012 16:11:54[junit] at > java.lang.reflect.Method.invoke(Method.java:597) > build 06-Apr-2012 16:11:54[junit] at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45) > build 06-Apr-2012 16:11:54[junit] at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) > build 06-Apr-2012 16:11:54[junit] at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42) > build 06-Apr-2012 16:11:54[junit] at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20) > build 06-Apr-2012 16:11:54[junit] at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) > build 06-Apr-2012 16:11:54[junit] at > 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30) > build 06-Apr-2012 16:11:54[junit] at > org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:63) > build 06-Apr-2012 16:11:54[junit] at > org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:754) > build 06-Apr-2012 16:11:54[junit] at > org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:670) > build 06-Apr-2012 16:11:54[junit] at > org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:69) > build 06-Apr-2012 16:11:54[junit] at > org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:591) > build 06-Apr-2012 16:11:54[junit] at > org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:75) > build 06-Apr-2012 16:11:54[junit] at > org.apache.lucene.util.LuceneTestCase$SaveThreadAndTestNameRule$1.evaluate(LuceneTestCase.java:642) > build 06-Apr-2012 16:11:54[junit] at > org.junit.rules.RunRules.evaluate(RunRules.java:18) > build 06-Apr-2012 16:11:54[junit] at > org.junit.runners.P
[jira] [Commented] (SOLR-3329) Use consistent svn:keywords
[ https://issues.apache.org/jira/browse/SOLR-3329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13247959#comment-13247959 ] Hoss Man commented on SOLR-3329: These were all really designed originally for people writing plugins to be able to expose more information for their consumers about them that might not be obvious based on the global info about the Solr install. As for stuff in the solr source tree, i would suggest.. * getSource() - keep using $URL$, it doesn't really hurt anything. * getVersion() - we should just start returning the implementation version from the package metadata * getSourceId() - $Id$ is the most problematic svn keyword i've ever seen, let's just drop it and leave this blank in all the core mbeans ... plugin writers can use it however they want > Use consistent svn:keywords > --- > > Key: SOLR-3329 > URL: https://issues.apache.org/jira/browse/SOLR-3329 > Project: Solr > Issue Type: Improvement >Reporter: Ryan McKinley > Fix For: 4.0 > > > In solr, we use svn:keywords haphazardly > We have lots of places with: > {code} > svn propset svn:keywords "Date Author Id Revision HeadURL" *.java > {code} > In LUCENE-3923, there is a suggestion to get rid of many of these. > The MBeans interface often exposes HeadURL, but we likely want to get rid of > the rest -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
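The "implementation version from the package metadata" suggestion above amounts to reading the Implementation-Version header from a jar's META-INF/MANIFEST.MF (which is what Package.getImplementationVersion() returns in Java). A minimal shell sketch -- the manifest content here is made up for illustration; against a real jar you would feed `unzip -p some.jar META-INF/MANIFEST.MF` into the same sed:

```shell
# Fake manifest standing in for META-INF/MANIFEST.MF inside a jar
cat > MANIFEST.MF <<'EOF'
Manifest-Version: 1.0
Implementation-Title: example
Implementation-Version: 4.0.0
EOF
# Pull out the Implementation-Version header value
sed -n 's/^Implementation-Version: //p' MANIFEST.MF
```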
[jira] [Commented] (LUCENE-3961) don't build and rebuild jar files for dependencies in tests
[ https://issues.apache.org/jira/browse/LUCENE-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13247801#comment-13247801 ] Hoss Man commented on LUCENE-3961: -- bq. We currently don't generally use jars as the actual classpath for testing though understood, #1 is just an argument i've seen as to why it would be better to do so -- otherwise we never actually know when testing that our jars are useful -- someone could accidentally put excludes="*.class" on a jar task and you'd never notice because all the tests would still pass. bq. by never creating a jar in the first place your #2 doesn't happen at all really. note step #a ... the point is if someone does whatever officially blessed step there is to build the jars ("ant", "ant jar", "ant whatever") and then decides they want to change the behavior of those jars -- they may never run "ant clean" and it may not occur to them to re-run whatever that official way to build jars is and they may not notice that the jars aren't rebuilt when they do "ant test" -- because they can already see the new code was "compiled" and running based on the test output. bq. Also, if we were to go with your logic, really we should be rebuilding the solr.war everytime correct, a war is just a jar with a special structure bq. (I'm just pointing out why i think its infeasible). ... I think we need to keep this stuff fast so that compile-test-debug lifecycle is as fast as possible agreed ... like i said, i don't have a strong opinion about it, but since we're discussing it i just wanted to point out the arguments i've heard over and over when having this discussion in the past on other projects. I think in an ideal world, devs could run fast tests against ../*/classes/ directories, but jenkins would run all those same tests against fully built jars to ensure they aren't missing anything ...
but that would probably be an annoying build.xml to maintain > don't build and rebuild jar files for dependencies in tests > --- > > Key: LUCENE-3961 > URL: https://issues.apache.org/jira/browse/LUCENE-3961 > Project: Lucene - Java > Issue Type: Improvement > Components: general/build >Reporter: Robert Muir > Fix For: 4.0 > > > Hossman's comments about when jars are built had me thinking, > its not really great how dependencies are managed currently. > say i have contrib/hamburger that depends on contrib/cheese > if I do 'ant test' in contrib/hamburger, you end out with a situation > where you have no hamburger.jar but you have a cheese.jar. > The reason for this: i think is how we implement the contrib-uptodate, > via .jar files. I think instead contrib-uptodate shouldnt use actual > jar files (cheese.jar) but a simple file we 'touch' like cheese.compiled. > This will make the build faster, especially I think the solr tests > which uses these dependencies across a lot of lucene modules. we won't > constantly jar their stuff.
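The cheese.compiled idea in the issue description -- using a touched marker file instead of a real jar for up-to-date checks -- boils down to a timestamp comparison, which Ant's uptodate task does declaratively. A rough shell sketch with made-up paths:

```shell
mkdir -p src build
echo 'class Cheese {}' > src/Cheese.java
# Rebuild only when a source file is newer than the marker file
if [ ! -f build/cheese.compiled ] || [ src/Cheese.java -nt build/cheese.compiled ]; then
  echo recompiling
  touch build/cheese.compiled   # marker stands in for cheese.jar in dependency checks
else
  echo up-to-date
fi
```

On the first run this prints "recompiling"; run it again without touching the source and it prints "up-to-date", never paying the cost of jarring.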
[jira] [Commented] (SOLR-3328) executable bits of shellscripts in solr source release
[ https://issues.apache.org/jira/browse/SOLR-3328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13247783#comment-13247783 ] Hoss Man commented on SOLR-3328: bq. I don't believe that. and to be clear: * these *.sh files are executable if you "unzip" the solr.zip on a unix box * these *.sh files are executable if you "tar -xzf" the solr.tgz on a unix box * it is only if you "tar -xzf" the solr-src.tgz that these files are not executable bq. I don't know if we can improve this? Maybe its an svn prop? these files already have the svn:executable property set... {noformat} hossman@bester:~/lucene/3x_dev/solr/example/exampledocs$ svn propget svn:executable post.sh test_utf8.sh post.sh - * test_utf8.sh - * {noformat} ...so it must either be something about how we do the export, or we are not telling the tar task to track the perms properly (i'm guessing the latter) > executable bits of shellscripts in solr source release > -- > > Key: SOLR-3328 > URL: https://issues.apache.org/jira/browse/SOLR-3328 > Project: Solr > Issue Type: Improvement > Components: Build >Reporter: Robert Muir > Fix For: 4.0 > > > HossmanSays: in the solr src releases, some shell scripts are not executable > by default. > I don't know if we can improve this? Maybe its an svn prop? > Maybe something needs to be specified to the tar/zip process? > Currently the 'source release' is really an svn export... > Personally i always do 'sh foo.sh' rather than './foo.sh', > but if it makes it more user-friendly we should figure it out > Just opening the issue so we don't forget about it, I think solr cloud > adds some more shell scripts so we should at least figure out what we want to > do.
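The "tar task" guess is checkable in isolation: command-line tar records and restores mode bits, so if svn:executable is set and the export preserves the bit, the loss has to happen where the archive is assembled (for Ant's tar task that would mean an explicit tarfileset filemode, e.g. filemode="755", for the scripts -- an assumption about the fix, not something verified in this thread). A quick demonstration with throwaway files:

```shell
mkdir -p demo
printf '#!/bin/sh\necho ok\n' > demo/post.sh
chmod 755 demo/post.sh
tar -czf demo.tgz demo    # tar stores the mode bits in the archive
rm -rf demo
tar -xzf demo.tgz         # ...and restores them on extraction
[ -x demo/post.sh ] && echo executable
```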
[jira] [Commented] (LUCENE-3961) don't build and rebuild jar files for dependencies in tests
[ https://issues.apache.org/jira/browse/LUCENE-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13247778#comment-13247778 ] Hoss Man commented on LUCENE-3961: -- I don't have a strong opinion about this, but there are two counter arguments i've heard over the years made in favor of *always* building the jar(s) even though it's a bit slower for the tests... 1) it means you always test against the same jars that you ship -- so there is no risk that the classpath you build for testing is subtly different than the files that make it into the jar (ie: maybe contrib/cheeseburger/build.xml copies a cheese_types.xml file into its classes dir, but it accidentally gets excluded from contrib-cheeses.jar) 2) it means less risk that someone accidentally uses an older jar than they think... a) "ant something" ... builds contrib-hamburger.jar and contrib-cheese.jar b) you realize it doesn't work the way you want, so you apply a patch (with tests!) c) "ant test" rebuilds contrib/*/classes and you see your new hamburger test passes d) you copy contrib-hamburger.jar and contrib-cheese.jar not realizing they are still left over from #a above, and don't have your patch. > don't build and rebuild jar files for dependencies in tests > --- > > Key: LUCENE-3961 > URL: https://issues.apache.org/jira/browse/LUCENE-3961 > Project: Lucene - Java > Issue Type: Improvement > Components: general/build >Reporter: Robert Muir > Fix For: 4.0 > > > Hossman's comments about when jars are built had me thinking, > its not really great how dependencies are managed currently. > say i have contrib/hamburger that depends on contrib/cheese > if I do 'ant test' in contrib/hamburger, you end out with a situation > where you have no hamburger.jar but you have a cheese.jar. > The reason for this: i think is how we implement the contrib-uptodate, > via .jar files.
I think instead contrib-uptodate shouldnt use actual > jar files (cheese.jar) but a simple file we 'touch' like cheese.compiled. > This will make the build faster, especially I think the solr tests > which uses these dependencies across a lot of lucene modules. we won't > constantly jar their stuff.
[jira] [Commented] (LUCENE-3946) improve docs & ivy verification output to explain classpath problems and mention "--noconfig"
[ https://issues.apache.org/jira/browse/LUCENE-3946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13247348#comment-13247348 ] Hoss Man commented on LUCENE-3946: -- thanks shawn, i added that as a suggestion on the wiki... http://wiki.apache.org/lucene-java/HowToContribute#antivy moving forward if we get questions from users about ivy problems we just need to iterate and update the wiki with what works best > improve docs & ivy verification output to explain classpath problems and > mention "--noconfig" > - > > Key: LUCENE-3946 > URL: https://issues.apache.org/jira/browse/LUCENE-3946 > Project: Lucene - Java > Issue Type: Task >Reporter: Hoss Man >Assignee: Hoss Man > Fix For: 3.6, 4.0 > > Attachments: LUCENE-3946.patch > > > offshoot of LUCENE-3930, where shawn reported... > {quote} > I can't get either branch_3x or trunk to build now, on a system that used to > build branch_3x without complaint. It > says that ivy is not available, even after doing "ant ivy-bootstrap" to > download ivy into the home directory. > Specifically I am trying to build solrj from trunk, but I can't even get > "ant" in the root directory of the checkout > to work. I'm on CentOS 6 with oracle jdk7 built using the city-fan.org > SRPMs. Ant (1.7.1) and junit are installed > from package repositories. Building a checkout of lucene_solr_3_5 on the > same machine works fine. > {quote} > The root cause is that ant's global configs can be setup to ignore the users > personal lib dir. suggested work arround is to run "ant --noconfig" but we > should also try to give the user feedback in our failure about exactly what > classpath ant is currently using (because apparently ${java.class.path} is > not actually it)
[jira] [Commented] (LUCENE-3952) validate depends on compile-tools, which does too much
[ https://issues.apache.org/jira/browse/LUCENE-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246685#comment-13246685 ] Hoss Man commented on LUCENE-3952: -- with commit r1309556 you can no longer "ant clean compile" from the top level of a checkout... {noformat} hossman@bester:~/lucene/dev$ ant clean compile ... validate: [echo] Building spatial... validate: [echo] Building suggest... validate: [taskdef] Could not load definitions from resource lucene-solr.antlib.xml. It could not be found. [echo] License check under: /home/hossman/lucene/dev/modules BUILD FAILED /home/hossman/lucene/dev/build.xml:68: The following error occurred while executing this line: /home/hossman/lucene/dev/modules/build.xml:68: The following error occurred while executing this line: /home/hossman/lucene/dev/lucene/tools/custom-tasks.xml:22: Problem: failed to create task or type licenses Cause: The name is undefined. Action: Check the spelling. Action: Check that any custom tasks/types have been declared. Action: Check that any <presetdef>/<macrodef> declarations have taken place. Total time: 14 seconds {noformat} > validate depends on compile-tools, which does too much > -- > > Key: LUCENE-3952 > URL: https://issues.apache.org/jira/browse/LUCENE-3952 > Project: Lucene - Java > Issue Type: Bug > Components: general/build >Reporter: Robert Muir > Fix For: 3.6, 4.0 > > Attachments: LUCENE-3952.patch > > > lucene's common-build.xml 'validate' depends on compile-tools, but some > modules like icu, kuromoji, etc have a compile-tools target (for other > reasons). > I think it should explicitly depend on common.compile-tools instead.
[jira] [Commented] (LUCENE-3946) improve docs & ivy verification output to explain classpath problems and mention "--noconfig"
[ https://issues.apache.org/jira/browse/LUCENE-3946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246534#comment-13246534 ] Hoss Man commented on LUCENE-3946: -- I added some starter text to http://wiki.apache.org/lucene-java/HowToContribute#antivy I also went ahead and committed the existing patch as is, minus the classpath stuff, to the trunk: r1309511. Anyone object to merging this back to 3.6? -- bq. Commenting out "rpm_mode=true" in ant.conf made it work with just "ant test" as the command. Shawn: can you try reverting your change to /etc/ant.conf and instead add "rpm_mode=false" to a new $HOME/.ant/ant.conf file and see if that works just as well? ... if so we should add it to the wiki as a suggestion > improve docs & ivy verification output to explain classpath problems and > mention "--noconfig" > - > > Key: LUCENE-3946 > URL: https://issues.apache.org/jira/browse/LUCENE-3946 > Project: Lucene - Java > Issue Type: Task >Reporter: Hoss Man >Assignee: Hoss Man > Fix For: 3.6, 4.0 > > Attachments: LUCENE-3946.patch > > > offshoot of LUCENE-3930, where shawn reported... > {quote} > I can't get either branch_3x or trunk to build now, on a system that used to > build branch_3x without complaint. It > says that ivy is not available, even after doing "ant ivy-bootstrap" to > download ivy into the home directory. > Specifically I am trying to build solrj from trunk, but I can't even get > "ant" in the root directory of the checkout > to work. I'm on CentOS 6 with oracle jdk7 built using the city-fan.org > SRPMs. Ant (1.7.1) and junit are installed > from package repositories. Building a checkout of lucene_solr_3_5 on the > same machine works fine. > {quote} > The root cause is that ant's global configs can be setup to ignore the users > personal lib dir.
suggested work arround is to run "ant --noconfig" but we > should also try to give the user feedback in our failure about exactly what > classpath ant is currently using (because apparently ${java.class.path} is > not actually it)
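The per-user override being suggested to Shawn relies on the ant wrapper script sourcing $HOME/.ant/ant.conf after the system-wide /etc/ant.conf, so a later rpm_mode=false wins. A sketch using a stand-in directory instead of the real $HOME -- the filename and setting come from the thread, but whether this actually fixes the ivy lookup is exactly what Shawn is being asked to test:

```shell
conf_dir=./fake-home/.ant        # stand-in for "$HOME/.ant" in this demo
mkdir -p "$conf_dir"
# rpm_mode=true in /etc/ant.conf is what makes ant ignore ~/.ant/lib;
# this per-user file overrides it without touching the system config
echo 'rpm_mode=false' > "$conf_dir/ant.conf"
cat "$conf_dir/ant.conf"
```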
[jira] [Commented] (LUCENE-3945) we should include checksums for every jar ivy fetches in svn & src releases to verify the jars are the ones we expect
[ https://issues.apache.org/jira/browse/LUCENE-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246516#comment-13246516 ] Hoss Man commented on LUCENE-3945: -- Committed revision 1309503. - trunk rmuir said on irc that he'd work on backporting to 3x for me (going to grab some lunch soon and then get on a plane) > we should include checksums for every jar ivy fetches in svn & src releases > to verify the jars are the ones we expect > - > > Key: LUCENE-3945 > URL: https://issues.apache.org/jira/browse/LUCENE-3945 > Project: Lucene - Java > Issue Type: Task >Reporter: Hoss Man > Fix For: 3.6, 4.0 > > Attachments: LUCENE-3945.patch, LUCENE-3945.patch, LUCENE-3945.patch, > LUCENE-3945_trunk_jar_sha1.patch, LUCENE-3945_trunk_jar_sha1.patch, > LUCENE-3945_trunk_jar_sha1.patch > > > Conversation with rmuir last night got me thinking about the fact that one > thing we lose by using ivy is confidence that every user of a release is > compiling against (and likely using at run time) the same dependencies as > every other user. > Up to 3.5, users of src and binary releases could be confident that the jars > included in the release were the same jars the lucene devs vetted and tested > against when voting on the release candidate, but with ivy there is now the > possibility that after the source release is published, the owner of a domain > where these dependencies are hosted might change the jars in some way w/o > anyone knowing. Likewise: we as developers could commit an ivy.xml file > pointing to a specific URL which we then use for and test for months, and > just prior to a release, the contents of the remote URL could change such > that a JAR included in the binary artifacts might not match the ones we've > vetted and tested leading up to that RC. 
> So i propose that we include checksum files in svn and in our source releases > that can be used by users to verify that the jars they get from ivy match the > jars we tested against.
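The proposal amounts to pinning each fetched jar to a known digest. A minimal shell sketch of the record/verify cycle -- file names are illustrative; the actual patch wires the equivalent check into the ivy build targets:

```shell
printf 'fake jar bytes' > dep.jar      # stand-in for a jar ivy fetched
sha1sum dep.jar > dep.jar.sha1         # checksum file committed to svn at vetting time
# later, after a fresh ivy fetch, fail the build if the bytes changed upstream
sha1sum -c dep.jar.sha1 && echo verified
```

Because the .sha1 file lives in svn rather than on the server hosting the jar, this guards against the hosted artifact being swapped out, not just corruption in transit.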
[jira] [Commented] (LUCENE-3946) improve docs & ivy verification output to explain classpath problems and mention "--noconfig"
[ https://issues.apache.org/jira/browse/LUCENE-3946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246494#comment-13246494 ] Hoss Man commented on LUCENE-3946: -- bq. I think rather than suggesting the --noconfig option in the patch, we should just reword the text to suggest instead installing your own ant (which worked for both you and Mike) rather than using any system-installed one on Linux systems. Given that "--noconfig" or editing ant.conf to remove rpm_mode _may_ solve the problem, and that many people are likely to consider either of those things simpler to do than installing a clean version of ant (even though you and i would probably disagree) i think we should still suggest them as possible fixes. > improve docs & ivy verification output to explain classpath problems and > mention "--noconfig" > - > > Key: LUCENE-3946 > URL: https://issues.apache.org/jira/browse/LUCENE-3946 > Project: Lucene - Java > Issue Type: Task >Affects Versions: 3.6 >Reporter: Hoss Man >Assignee: Hoss Man > Fix For: 4.0 > > Attachments: LUCENE-3946.patch > > > offshoot of LUCENE-3930, where shawn reported... > {quote} > I can't get either branch_3x or trunk to build now, on a system that used to > build branch_3x without complaint. It > says that ivy is not available, even after doing "ant ivy-bootstrap" to > download ivy into the home directory. > Specifically I am trying to build solrj from trunk, but I can't even get > "ant" in the root directory of the checkout > to work. I'm on CentOS 6 with oracle jdk7 built using the city-fan.org > SRPMs. Ant (1.7.1) and junit are installed > from package repositories. Building a checkout of lucene_solr_3_5 on the > same machine works fine. > {quote} > The root cause is that ant's global configs can be setup to ignore the users > personal lib dir.
suggested work arround is to run "ant --noconfig" but we > should also try to give the user feedback in our failure about exactly what > classpath ant is currently using (because apparently ${java.class.path} is > not actually it)
[jira] [Commented] (LUCENE-3946) improve docs & ivy verification output to explain classpath problems and mention "--noconfig"
[ https://issues.apache.org/jira/browse/LUCENE-3946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245911#comment-13245911 ] Hoss Man commented on LUCENE-3946: -- bq. I passed --execdebug to ant, and when it fails (w/ the builtin Fedora ant) I get this: the interesting thing being that neither of those lines actually seem to contain your ivy.jar -- but when it fails for you, the java.class.path echoing that my patch adds to the ivy-check target does show ivy in _that_ classpath (even though it's clearly not the one being used to load the taskdef) ... so something in the actual Launcher class is deciding when/if to add that ivy jar to that java.class.path (which again: is clearly not the classpath that actually seems to matter) bq. So forget about loading ivy, I think these ants shipped with linux distributions are hopelessly broken and I don't think there is a lot we can do. that's not really fair ... many distros split things up into multiple packages, you probably have the core one but not some optional ones. as mike has shown it's clearly possible to get a functional ant with a fedora install, but you do have to override/edit a config setting bq. Maybe this 'compiled-on-date' is available via an ant property we can early detect? that *REALLY* smells bad ... and would go out of its way to break things for people who might have already fixed their ant install (using "--noconfig" or edited /etc/ant.conf) I think it's enough to make the failure message say "we did our best, try --noconfig and see the URL below for more info about how your ant install may be fucked up" ...
if we can show them the _correct_ classpath ant is trying to use to load ivy, to make the point clear, then great -- if not, then we rip it out of the error message > improve docs & ivy verification output to explain classpath problems and > mention "--noconfig" > - > > Key: LUCENE-3946 > URL: https://issues.apache.org/jira/browse/LUCENE-3946 > Project: Lucene - Java > Issue Type: Task >Affects Versions: 3.6 >Reporter: Hoss Man >Assignee: Hoss Man > Fix For: 4.0 > > Attachments: LUCENE-3946.patch > > > offshoot of LUCENE-3930, where shawn reported... > {quote} > I can't get either branch_3x or trunk to build now, on a system that used to > build branch_3x without complaint. It > says that ivy is not available, even after doing "ant ivy-bootstrap" to > download ivy into the home directory. > Specifically I am trying to build solrj from trunk, but I can't even get > "ant" in the root directory of the checkout > to work. I'm on CentOS 6 with oracle jdk7 built using the city-fan.org > SRPMs. Ant (1.7.1) and junit are installed > from package repositories. Building a checkout of lucene_solr_3_5 on the > same machine works fine. > {quote} > The root cause is that ant's global configs can be setup to ignore the users > personal lib dir. suggested work arround is to run "ant --noconfig" but we > should also try to give the user feedback in our failure about exactly what > classpath ant is currently using (because apparently ${java.class.path} is > not actually it)
[jira] [Commented] (LUCENE-3946) improve docs & ivy verification output to explain classpath problems and mention "--noconfig"
[ https://issues.apache.org/jira/browse/LUCENE-3946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245836#comment-13245836 ] Hoss Man commented on LUCENE-3946: -- Related... http://ant.1045680.n5.nabble.com/Ant-and-rpm-mode-td1353437.html http://stackoverflow.com/questions/1909634/why-does-ant-ignore-task-jars-in-home-ant-lib > improve docs & ivy verification output to explain classpath problems and > mention "--noconfig" > - > > Key: LUCENE-3946 > URL: https://issues.apache.org/jira/browse/LUCENE-3946 > Project: Lucene - Java > Issue Type: Task >Affects Versions: 3.6 >Reporter: Hoss Man >Assignee: Hoss Man > Fix For: 4.0 > > > offshoot of LUCENE-3930, where shawn reported... > {quote} > I can't get either branch_3x or trunk to build now, on a system that used to > build branch_3x without complaint. It > says that ivy is not available, even after doing "ant ivy-bootstrap" to > download ivy into the home directory. > Specifically I am trying to build solrj from trunk, but I can't even get > "ant" in the root directory of the checkout > to work. I'm on CentOS 6 with oracle jdk7 built using the city-fan.org > SRPMs. Ant (1.7.1) and junit are installed > from package repositories. Building a checkout of lucene_solr_3_5 on the > same machine works fine. > {quote} > The root cause is that ant's global configs can be setup to ignore the users > personal lib dir. suggested work arround is to run "ant --noconfig" but we > should also try to give the user feedback in our failure about exactly what > classpath ant is currently using (because apparently ${java.class.path} is > not actually it)
[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy
[ https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245828#comment-13245828 ] Hoss Man commented on LUCENE-3930: -- Shawn: please see LUCENE-3946 and comment there about the suggested work around > nuke jars from source tree and use ivy > -- > > Key: LUCENE-3930 > URL: https://issues.apache.org/jira/browse/LUCENE-3930 > Project: Lucene - Java > Issue Type: Task > Components: general/build >Reporter: Robert Muir >Assignee: Robert Muir >Priority: Blocker > Fix For: 3.6, 4.0 > > Attachments: LUCENE-3930-skip-sources-javadoc.patch, > LUCENE-3930-solr-example.patch, LUCENE-3930-solr-example.patch, > LUCENE-3930.patch, LUCENE-3930.patch, LUCENE-3930.patch, > LUCENE-3930__ivy_bootstrap_target.patch, > LUCENE-3930_includetestlibs_excludeexamplexml.patch, > ant_-verbose_clean_test.out.txt, langdetect-1.1.jar, > noggit-commons-csv.patch, patch-jetty-build.patch, pom.xml > > > As mentioned on the ML thread: "switch jars to ivy mechanism?".
[jira] [Commented] (LUCENE-3943) Use ivy cachepath and cachefileset instead of ivy retrieve
[ https://issues.apache.org/jira/browse/LUCENE-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245788#comment-13245788 ] Hoss Man commented on LUCENE-3943: -- "ant example" doesn't run the example, so it can't use any ephemeral classpaths that ant creates on the fly. "ant example" currently sets up the example files (ie: copying the war to where jetty will look for it) hence my point that it could copy the jars as needed in order for jetty & solr to find them (the example has to work in the binary build via "java -jar start.jar" even if users don't have any of the original build.xml files) > Use ivy cachepath and cachefileset instead of ivy retrieve > -- > > Key: LUCENE-3943 > URL: https://issues.apache.org/jira/browse/LUCENE-3943 > Project: Lucene - Java > Issue Type: Improvement > Components: general/build >Reporter: Chris Male > > In LUCENE-3930 we moved to resolving all external dependencies using > ivy:retrieve. This process places the dependencies into the lib/ folder of > the respective modules which was ideal since it replicated the existing build > process and limited the number of changes to be made to the build. > However it can lead to multiple jars for the same dependency in the lib > folder when the dependency is upgraded, and just isn't the most efficient way > to use Ivy. > Uwe pointed out that _when working from svn or in using src releases_ we can > remove the ivy:retrieve calls and make use of ivy:cachepath and > ivy:cachefileset to build our classpaths and packages respectively, which > will go some way to addressing these limitations -- however we still need the > build system capable of putting the actual jars into specific lib folders > when assembling the binary artifacts
[jira] [Commented] (LUCENE-3943) Use ivy cachepath and cachefileset instead of ivy retrieve
[ https://issues.apache.org/jira/browse/LUCENE-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245613#comment-13245613 ] Hoss Man commented on LUCENE-3943: -- bq. In my opinion, the ideal situation would be that we pass these filesets directly to the zip/tar/gz whatever in the binary release targets the one catch that occurs to me is the solr example: start.jar, the libraries jetty looks for, and the optional jars solr loads by path based on its configuration ... we just have to make sure "ant example" takes care of putting all those jars where they need to be > Use ivy cachepath and cachefileset instead of ivy retrieve > -- > > Key: LUCENE-3943 > URL: https://issues.apache.org/jira/browse/LUCENE-3943 > Project: Lucene - Java > Issue Type: Improvement > Components: general/build >Reporter: Chris Male > > In LUCENE-3930 we moved to resolving all external dependencies using > ivy:retrieve. This process places the dependencies into the lib/ folder of > the respective modules which was ideal since it replicated the existing build > process and limited the number of changes to be made to the build. > However it can lead to multiple jars for the same dependency in the lib > folder when the dependency is upgraded, and just isn't the most efficient way > to use Ivy. > Uwe pointed out that _when working from svn or in using src releases_ we can > remove the ivy:retrieve calls and make use of ivy:cachepath and > ivy:cachefileset to build our classpaths and packages respectively, which > will go some way to addressing these limitations -- however we still need the > build system capable of putting the actual jars into specific lib folders > when assembling the binary artifacts
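The switch discussed in LUCENE-3943 (ivy:cachepath/ivy:cachefileset instead of ivy:retrieve) could look roughly like the following Ant fragment. This is a sketch only: the target names, conf name, and paths are illustrative, not taken from the actual Lucene build, and the ivy antlib namespace is assumed to be declared on the project element.

```xml
<!-- Sketch: build classpaths straight from the Ivy cache (no jars copied
     into lib/), and copy cached jars out only when packaging binaries.
     Target/conf names here are hypothetical. -->
<target name="resolve" depends="ivy-availability-check">
  <!-- Classpath referencing jars in the Ivy cache directly -->
  <ivy:cachepath pathid="compile.classpath" conf="default"/>
  <!-- Fileset over the same cached jars, for use at packaging time -->
  <ivy:cachefileset setid="dependency.jars" conf="default"/>
</target>

<target name="package-binary" depends="resolve">
  <!-- Only the binary release target materializes jars into lib/ -->
  <copy todir="build/release/lib" flatten="true">
    <fileset refid="dependency.jars"/>
  </copy>
</target>
```

This keeps the per-module lib/ folders empty during development (avoiding the stale-jar-on-upgrade problem described above) while still letting the release targets place real jars where the binary artifacts need them.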
[jira] [Commented] (LUCENE-3945) we should include checksums for every jar ivy fetches in svn & src releases to verify the jars are the ones we expect
[ https://issues.apache.org/jira/browse/LUCENE-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245603#comment-13245603 ] Hoss Man commented on LUCENE-3945: -- #1: I know that Ivy attempts MD5 & SHA1 verification by default -- but it does that verification against checksum files located on the server, so it only offers protection against corruption in transit, not against files deliberately modified on the server. #2 i realize that the maintainers of maven repos say "all files are immutable" and that this potential risk of malicious or accidental file changes exists for all maven users -- but that's the choice of all maven users to accept that as a way of life. I'm raising this issue only to point out a discrepancy between the "confidence" we used to be able to give people who download src releases, vs what we have currently with ivy. > we should include checksums for every jar ivy fetches in svn & src releases > to verify the jars are the ones we expect > - > > Key: LUCENE-3945 > URL: https://issues.apache.org/jira/browse/LUCENE-3945 > Project: Lucene - Java > Issue Type: Task >Reporter: Hoss Man > Fix For: 3.6, 4.0 > > > Conversation with rmuir last night got me thinking about the fact that one > thing we lose by using ivy is confidence that every user of a release is > compiling against (and likely using at run time) the same dependencies as > every other user. > Up to 3.5, users of src and binary releases could be confident that the jars > included in the release were the same jars the lucene devs vetted and tested > against when voting on the release candidate, but with ivy there is now the > possibility that after the source release is published, the owner of a domain > where these dependencies are hosted might change the jars in some way w/o > anyone knowing.
Likewise: we as developers could commit an ivy.xml file > pointing to a specific URL which we then use and test for months, and > just prior to a release, the contents of the remote URL could change such > that a JAR included in the binary artifacts might not match the ones we've > vetted and tested leading up to that RC. > So i propose that we include checksum files in svn and in our source releases > that can be used by users to verify that the jars they get from ivy match the > jars we tested against.
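The proposal above (committed checksum files, verified locally against the jars Ivy fetches) could be sketched with Ant's built-in checksum task. The target name, jar path, and property name below are hypothetical; the idea is only that the .sha1 files would live in svn alongside the build, not on the remote server.

```xml
<!-- Sketch: fail the build if a fetched jar does not match the checksum
     file committed to svn. File names and the property are illustrative. -->
<target name="validate-jar-checksums">
  <!-- Compares lib/foo-1.0.jar against the committed lib/foo-1.0.jar.sha1 -->
  <checksum file="lib/foo-1.0.jar" algorithm="SHA-1"
            fileext=".sha1" verifyProperty="foo.jar.checksum.ok"/>
  <fail message="lib/foo-1.0.jar does not match the expected checksum">
    <condition>
      <isfalse value="${foo.jar.checksum.ok}"/>
    </condition>
  </fail>
</target>
```

Unlike Ivy's default verification, this compares against a checksum the developers vetted at commit time, so a jar silently replaced on the remote host would be detected.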
[jira] [Commented] (LUCENE-3943) Use ivy cachepath and cachefileset instead of ivy retrieve
[ https://issues.apache.org/jira/browse/LUCENE-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245447#comment-13245447 ] Hoss Man commented on LUCENE-3943: -- bq. In my opinion, the ideal situation would be that we pass these filesets directly to the zip/tar/gz whatever in the binary release targets Ah... ok didn't think of that... +1 > Use ivy cachepath and cachefileset instead of ivy retrieve
[jira] [Commented] (SOLR-3200) When using SignatureUpdateProcessor with "all fields" configuration, it will assume only the fields present on the very first document only, ignoring any optional fields
[ https://issues.apache.org/jira/browse/SOLR-3200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244800#comment-13244800 ] Hoss Man commented on SOLR-3200: Committed revision 1308604. - trunk still testing backport to 3x > When using SignatureUpdateProcessor with "all fields" configuration, it will > assume only the fields present on the very first document only, ignoring any > optional fields in subsequent documents in the signature generation. > -- > > Key: SOLR-3200 > URL: https://issues.apache.org/jira/browse/SOLR-3200 > Project: Solr > Issue Type: Bug > Components: update >Affects Versions: 1.4, 3.1, 3.2, 3.3, 3.4, 3.5, 4.0 >Reporter: Spyros Kapnissis >Assignee: Hoss Man > Fix For: 3.6 > > Attachments: SOLR-3200.patch > > > This can result in non-duplicate documents being left out of the index. A > solution would be that the fields to be used in the signature generation are > recalculated with every document inserted.
[jira] [Commented] (SOLR-3226) SignatureUpdateProcessor ignores non-string field values from the signature generation
[ https://issues.apache.org/jira/browse/SOLR-3226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244799#comment-13244799 ] Hoss Man commented on SOLR-3226: Committed revision 1308604. - trunk ...had to make a tweak to schema-luceneMatchVersion.xml to get all tests working however (TestMatchVersions uses same solrconfig.xml but diff schema.xml, so it freaked about "id" not existing) still testing the backport to 3x ... there were some other subtle tweaks needed there to the test because of branch drift > SignatureUpdateProcessor ignores non-string field values from the signature > generation > -- > > Key: SOLR-3226 > URL: https://issues.apache.org/jira/browse/SOLR-3226 > Project: Solr > Issue Type: Bug > Components: update >Affects Versions: 1.4, 3.1, 3.2, 3.3, 3.4, 3.5, 4.0 >Reporter: Spyros Kapnissis >Assignee: Hoss Man > Fix For: 3.6 > > Attachments: SOLR-3226.patch, SOLR-3226.patch > > > When using for example XMLUpdateRequestProcessor, the signature is calculated > correctly since all field values are strings. But when one uses > DataImportHandler or BinaryUpdateRequestHandler, the signature generation > will ignore any field values that are ints, longs, dates etc. > This might result in overwriting non-similar documents, as it happened in my > case while importing some db data through DIH.
[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy
[ https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243266#comment-13243266 ] Hoss Man commented on LUCENE-3930: -- I did some testing of the "packages" built using trunk (circa r1307608)... * we don't ship solr's build.xml (or any of the sub-build.xml files) in the "binary" artifacts, and with these changes most of the new ivy.xml files are also excluded -- but for some reason these newly added files are showing up, we should probably figure out why and exclude them as well since they aren't usable and could easily confuse people... ** ./example/example-DIH/ivy.xml ** ./example/example-DIH/build.xml ** ./example/ivy.xml ** ./example/build.xml * the libs for test-framework (ant, ant-junit, and junit) aren't being included in the lucene "binary" artifacts ... for the ant jars this might be fine (test-framework doesn't actually have any run-time deps on anything in ant, does it?) but it seems like the junit jar should be included since including lucene-test-framework.jar in your classpath is useless w/o also including junit * "ant ivy-bootstrap" followed by "ant test" using the lucene "source" package (lucene-4.0-SNAPSHOT-src.tgz) produces a build failure -- but this may have been a problem even before ivy (note the working dir and the final error)... {noformat} hossman@bester:~/tmp/ivy-pck-testing/lu/src/lucene-4.0-SNAPSHOT$ ant test ... [junit] Testsuite: org.apache.lucene.util.junitcompat.TestReproduceMessage [junit] Tests run: 12, Failures: 0, Errors: 0, Time elapsed: 0.114 sec [junit] test: compile-lucene-core: jflex-uptodate-check: jflex-notice: javacc-uptodate-check: javacc-notice: ivy-availability-check: ivy-fail: resolve: [ivy:retrieve] :: loading settings :: url = jar:file:/home/hossman/.ant/lib/ivy-2.2.0.jar!/org/apache/ivy/core/settings/ivysettings.xml init: clover.setup: clover.info: [echo] [echo] Clover not found. Code coverage reports disabled. 
[echo] clover: common.compile-core: [javac] Compiling 1 source file to /home/hossman/tmp/ivy-pck-testing/lu/src/lucene-4.0-SNAPSHOT/build/core/classes/java compile-core: compile-test-framework: ivy-availability-check: ivy-fail: resolve: [ivy:retrieve] :: loading settings :: url = jar:file:/home/hossman/.ant/lib/ivy-2.2.0.jar!/org/apache/ivy/core/settings/ivysettings.xml init: compile-lucene-core: compile-core: compile-test: [echo] Building demo... ivy-availability-check: ivy-fail: resolve: [ivy:retrieve] :: loading settings :: url = jar:file:/home/hossman/.ant/lib/ivy-2.2.0.jar!/org/apache/ivy/core/settings/ivysettings.xml common.init: compile-lucene-core: contrib-build.init: check-lucene-core-uptodate: jar-lucene-core: jflex-uptodate-check: jflex-notice: javacc-uptodate-check: javacc-notice: ivy-availability-check: ivy-fail: resolve: [ivy:retrieve] :: loading settings :: url = jar:file:/home/hossman/.ant/lib/ivy-2.2.0.jar!/org/apache/ivy/core/settings/ivysettings.xml init: clover.setup: clover.info: [echo] [echo] Clover not found. Code coverage reports disabled. [echo] clover: common.compile-core: [javac] Compiling 1 source file to /home/hossman/tmp/ivy-pck-testing/lu/src/lucene-4.0-SNAPSHOT/build/core/classes/java compile-core: jar-core: [jar] Building jar: /home/hossman/tmp/ivy-pck-testing/lu/src/lucene-4.0-SNAPSHOT/build/core/lucene-core-4.0-SNAPSHOT.jar init: compile-test: [echo] Building demo... 
check-analyzers-common-uptodate: jar-analyzers-common: BUILD FAILED /home/hossman/tmp/ivy-pck-testing/lu/src/lucene-4.0-SNAPSHOT/build.xml:487: The following error occurred while executing this line: /home/hossman/tmp/ivy-pck-testing/lu/src/lucene-4.0-SNAPSHOT/common-build.xml:1026: The following error occurred while executing this line: /home/hossman/tmp/ivy-pck-testing/lu/src/lucene-4.0-SNAPSHOT/contrib/contrib-build.xml:58: The following error occurred while executing this line: /home/hossman/tmp/ivy-pck-testing/lu/src/lucene-4.0-SNAPSHOT/common-build.xml:551: Basedir /home/hossman/tmp/ivy-pck-testing/lu/src/modules/analysis/common does not exist Total time: 5 minutes 10 seconds {noformat} ...it's trying to reach back up out of the working directory into "../modules" > nuke jars from source tree and use ivy > -- > > Key: LUCENE-3930 > URL: https://issues.apache.org/jira/browse/LUCENE-3930 > Project: Lucene - Java > Issue Type: Task > Components: general/build >Reporter: Robert Muir >Assignee: Robert Muir >Priority: Blocker > Fix For: 3.6 > > Attachments: LUCENE-3930-skip-sources-javadoc.patch, > LUCENE-3930-solr-example.patch, LUCENE-3930-solr-example.patch, > LUCENE-3930.patch, LUCENE-3930.patch, LUCENE-39
[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy
[ https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240943#comment-13240943 ] Hoss Man commented on LUCENE-3930: -- as far as the PermGen OOM goes, the -verbose logs show repeated instances of "Trying to override old definition of task antlib:org.apache.ivy.ant:..." suggesting that rmuir's {{unless="ivy.uptodate"}} isn't working the way we think it should (possibly because of the way the various build files are included in one another?) If we can't keep the typedefs from being re-run, the Ant typedef docs describe a "loaderref" alternative: https://ant.apache.org/manual/Tasks/typedef.html {quote} If you are defining tasks or types that share the same classpath with multiple taskdef or typedef tasks, the corresponding classes will be loaded by different Java ClassLoaders. Two classes with the same name loaded via different ClassLoaders are not the same class from the point of view of the Java VM, they don't share static variables and instances of these classes can't access private methods or attributes of instances defined by "the other class" of the same name. They don't even belong to the same Java package and can't access package private code, either. The best way to load several tasks/types that are supposed to cooperate with each other via shared Java code is to use the resource attribute and an antlib descriptor. If this is not possible, the second best option is to use the loaderref attribute and specify the same name for each and every typedef/taskdef - this way the classes will share the same ClassLoader. Note that the typedef/taskdef tasks must use identical classpath definitions (this includes the order of path components) for the loaderref attribute to work. {quote} ...it appears it's just some unique string key to name the classloader? (no idea if whatever is causing the current problem will still plague us by creating multiple loaders with the same name)... 
https://svn.apache.org/viewvc/ant/core/tags/ANT_171/src/etc/testcases/core/loaderref/ > nuke jars from source tree and use ivy
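The loaderref approach quoted above would amount to loading the Ivy antlib through one named classloader everywhere it is declared, along these lines. The loaderref value is just an arbitrary shared key, and the property name is illustrative:

```xml
<!-- Sketch: give every Ivy taskdef the same loaderref so that repeated
     definitions across included build files reuse one ClassLoader instead
     of each creating their own (a possible cause of the PermGen growth).
     The "ivy.loader" key and ${ivy.jar.file} property are assumptions. -->
<taskdef uri="antlib:org.apache.ivy.ant"
         resource="org/apache/ivy/ant/antlib.xml"
         loaderref="ivy.loader"
         classpath="${ivy.jar.file}"/>
```

As the quoted docs note, every such taskdef would need an identical classpath definition (including path-component order) for the shared loader to work.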
[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy
[ https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240918#comment-13240918 ] Hoss Man commented on LUCENE-3930: -- here's the tail end of "ant -verbose clean test"... {noformat} hossman@bester:~/lucene/branch_lucene3930$ ant -verbose clean test ... compile-test-framework: Skipped because property 'lucene.test.framework.compiled' set. common.compile-test: [mkdir] Skipping /home/hossman/lucene/branch_lucene3930/lucene/build/contrib/demo/classes/test because it already exists. Property "run.clover" has not been set [javac] org/apache/lucene/demo/TestDemo.java omitted as /home/hossman/lucene/branch_lucene3930/lucene/build/contrib/demo/classes/test/org/apache/lucene/demo/TestDemo.class is up to date. [javac] /home/hossman/lucene/branch_lucene3930/lucene/contrib/demo/src/test/org/apache/lucene/demo/test-files/docs/apache1.0.txt skipped - don't know how to handle it [javac] /home/hossman/lucene/branch_lucene3930/lucene/contrib/demo/src/test/org/apache/lucene/demo/test-files/docs/apache1.1.txt skipped - don't know how to handle it [javac] /home/hossman/lucene/branch_lucene3930/lucene/contrib/demo/src/test/org/apache/lucene/demo/test-files/docs/apache2.0.txt skipped - don't know how to handle it [javac] /home/hossman/lucene/branch_lucene3930/lucene/contrib/demo/src/test/org/apache/lucene/demo/test-files/docs/cpl1.0.txt skipped - don't know how to handle it [javac] /home/hossman/lucene/branch_lucene3930/lucene/contrib/demo/src/test/org/apache/lucene/demo/test-files/docs/epl1.0.txt skipped - don't know how to handle it [javac] /home/hossman/lucene/branch_lucene3930/lucene/contrib/demo/src/test/org/apache/lucene/demo/test-files/docs/freebsd.txt skipped - don't know how to handle it [javac] /home/hossman/lucene/branch_lucene3930/lucene/contrib/demo/src/test/org/apache/lucene/demo/test-files/docs/gpl1.0.txt skipped - don't know how to handle it [javac] 
/home/hossman/lucene/branch_lucene3930/lucene/contrib/demo/src/test/org/apache/lucene/demo/test-files/docs/gpl2.0.txt skipped - don't know how to handle it [javac] /home/hossman/lucene/branch_lucene3930/lucene/contrib/demo/src/test/org/apache/lucene/demo/test-files/docs/gpl3.0.txt skipped - don't know how to handle it [javac] /home/hossman/lucene/branch_lucene3930/lucene/contrib/demo/src/test/org/apache/lucene/demo/test-files/docs/lgpl2.1.txt skipped - don't know how to handle it [javac] /home/hossman/lucene/branch_lucene3930/lucene/contrib/demo/src/test/org/apache/lucene/demo/test-files/docs/lgpl3.txt skipped - don't know how to handle it [javac] /home/hossman/lucene/branch_lucene3930/lucene/contrib/demo/src/test/org/apache/lucene/demo/test-files/docs/lpgl2.0.txt skipped - don't know how to handle it [javac] /home/hossman/lucene/branch_lucene3930/lucene/contrib/demo/src/test/org/apache/lucene/demo/test-files/docs/mit.txt skipped - don't know how to handle it [javac] /home/hossman/lucene/branch_lucene3930/lucene/contrib/demo/src/test/org/apache/lucene/demo/test-files/docs/mozilla1.1.txt skipped - don't know how to handle it [javac] /home/hossman/lucene/branch_lucene3930/lucene/contrib/demo/src/test/org/apache/lucene/demo/test-files/docs/mozilla_eula_firefox3.txt skipped - don't know how to handle it [javac] /home/hossman/lucene/branch_lucene3930/lucene/contrib/demo/src/test/org/apache/lucene/demo/test-files/docs/mozilla_eula_thunderbird2.txt skipped - don't know how to handle it [copy] org/apache/lucene/demo/test-files/docs/apache1.0.txt omitted as /home/hossman/lucene/branch_lucene3930/lucene/build/contrib/demo/classes/test/org/apache/lucene/demo/test-files/docs/apache1.0.txt is up to date. [copy] org/apache/lucene/demo/test-files/docs/apache1.1.txt omitted as /home/hossman/lucene/branch_lucene3930/lucene/build/contrib/demo/classes/test/org/apache/lucene/demo/test-files/docs/apache1.1.txt is up to date. 
[copy] org/apache/lucene/demo/test-files/docs/apache2.0.txt omitted as /home/hossman/lucene/branch_lucene3930/lucene/build/contrib/demo/classes/test/org/apache/lucene/demo/test-files/docs/apache2.0.txt is up to date. [copy] org/apache/lucene/demo/test-files/docs/cpl1.0.txt omitted as /home/hossman/lucene/branch_lucene3930/lucene/build/contrib/demo/classes/test/org/apache/lucene/demo/test-files/docs/cpl1.0.txt is up to date. [copy] org/apache/lucene/demo/test-files/docs/epl1.0.txt omitted as /home/hossman/lucene/branch_lucene3930/lucene/build/contrib/demo/classes/test/org/apache/lucene/demo/test-files/docs/epl1.0.txt is up to date. [copy] org/apache/lucene/demo/test-files/docs/freebsd.txt omitted as /home/hossman/lucene/branch_lucene3930/lucene/build/contrib/demo/classes/test/org/apache/lucene/demo/test-files/docs/freebsd.txt is up to date. [copy] org/apache/lucene/demo/test-files/docs/g
[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy
[ https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240912#comment-13240912 ] Hoss Man commented on LUCENE-3930: -- I did a {{touch /home/hossman/lucene/branch_lucene3930/lucene/ivy/ivy-LICENSE-FAKE.txt}} to work around the license checker and this time i wound up with an OOM... {noformat} hossman@bester:~/lucene/branch_lucene3930$ ant clean test common.compile-test: [mkdir] Created dir: /home/hossman/lucene/branch_lucene3930/lucene/build/contrib/sandbox/classes/test [javac] Compiling 6 source files to /home/hossman/lucene/branch_lucene3930/lucene/build/contrib/sandbox/classes/test [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [copy] Copied 1 empty directory to 1 empty directory under /home/hossman/lucene/branch_lucene3930/lucene/build/contrib/sandbox/classes/test test-contrib: [echo] Building demo... download-ivy: install-ivy: resolve: [ivy:retrieve] :: Ivy 2.2.0 - 20100923230623 :: http://ant.apache.org/ivy/ :: [ivy:retrieve] :: loading settings :: url = jar:file:/home/hossman/lucene/branch_lucene3930/lucene/ivy/ivy-2.2.0.jar!/org/apache/ivy/core/settings/ivysettings.xml java.lang.OutOfMemoryError: PermGen space java.lang.OutOfMemoryError: PermGen space at java.lang.Throwable.getStackTraceElement(Native Method) at java.lang.Throwable.getOurStackTrace(Throwable.java:591) at java.lang.Throwable.printStackTrace(Throwable.java:462) at java.lang.Throwable.printStackTrace(Throwable.java:451) at org.apache.tools.ant.Main.startAnt(Main.java:230) at org.apache.tools.ant.launch.Launcher.run(Launcher.java:257) at org.apache.tools.ant.launch.Launcher.main(Launcher.java:104) {noformat} running with "-verbose" to see if i can get more details on exactly where/why the OOM is happening > nuke jars from source tree and use ivy
[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy
[ https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240864#comment-13240864 ] Hoss Man commented on LUCENE-3930: -- FWIW i did a completely clean checkout of the lucene3930 (r1306662) and got the following build failure trying to run "ant clean test" from the top level. the ivy bootstrapping doesn't seem to play nicely with the license checker... {noformat} hossman@bester:~/lucene/branch_lucene3930$ ant clean test Buildfile: build.xml clean: clean: clean: clean: [echo] Building analyzers-common... clean: [echo] Building analyzers-icu... clean: [echo] Building analyzers-kuromoji... clean: [echo] Building analyzers-morfologik... clean: [echo] Building analyzers-phonetic... clean: [echo] Building analyzers-smartcn... clean: [echo] Building analyzers-stempel... clean: [echo] Building analyzers-uima... clean: [echo] Building benchmark... clean: [echo] Building facet... clean: [echo] Building grouping... clean: [echo] Building join... clean: [echo] Building queries... clean: [echo] Building queryparser... clean: [echo] Building spatial... clean: [echo] Building suggest... clean: [echo] Building solr... clean: validate: compile-tools: download-ivy: [mkdir] Created dir: /home/hossman/lucene/branch_lucene3930/lucene/ivy [echo] [echo] NOTE: You do not have ivy installed, so downloading it... 
[echo] If you make lots of checkouts, download ivy-2.2.0.jar yourself [echo] and set ivy.jar.file to this in your ~/build.properties [echo] [get] Getting: http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.2.0/ivy-2.2.0.jar [get] To: /home/hossman/lucene/branch_lucene3930/lucene/ivy/ivy-2.2.0.jar install-ivy: resolve: [ivy:retrieve] :: Ivy 2.2.0 - 20100923230623 :: http://ant.apache.org/ivy/ :: [ivy:retrieve] :: loading settings :: url = jar:file:/home/hossman/lucene/branch_lucene3930/lucene/ivy/ivy-2.2.0.jar!/org/apache/ivy/core/settings/ivysettings.xml init: compile-core: [mkdir] Created dir: /home/hossman/lucene/branch_lucene3930/lucene/build/tools/classes/java [javac] Compiling 2 source files to /home/hossman/lucene/branch_lucene3930/lucene/build/tools/classes/java [copy] Copying 1 file to /home/hossman/lucene/branch_lucene3930/lucene/build/tools/classes/java validate: [echo] License check under: /home/hossman/lucene/branch_lucene3930/lucene [licenses] MISSING LICENSE for the following file: [licenses] /home/hossman/lucene/branch_lucene3930/lucene/ivy/ivy-2.2.0.jar [licenses] Expected locations below: [licenses] => /home/hossman/lucene/branch_lucene3930/lucene/ivy/ivy-LICENSE-ASL.txt [licenses] => /home/hossman/lucene/branch_lucene3930/lucene/ivy/ivy-LICENSE-BSD.txt [licenses] => /home/hossman/lucene/branch_lucene3930/lucene/ivy/ivy-LICENSE-BSD_LIKE.txt [licenses] => /home/hossman/lucene/branch_lucene3930/lucene/ivy/ivy-LICENSE-CDDL.txt [licenses] => /home/hossman/lucene/branch_lucene3930/lucene/ivy/ivy-LICENSE-CPL.txt [licenses] => /home/hossman/lucene/branch_lucene3930/lucene/ivy/ivy-LICENSE-EPL.txt [licenses] => /home/hossman/lucene/branch_lucene3930/lucene/ivy/ivy-LICENSE-MIT.txt [licenses] => /home/hossman/lucene/branch_lucene3930/lucene/ivy/ivy-LICENSE-MPL.txt [licenses] => /home/hossman/lucene/branch_lucene3930/lucene/ivy/ivy-LICENSE-PD.txt [licenses] => /home/hossman/lucene/branch_lucene3930/lucene/ivy/ivy-LICENSE-SUN.txt [licenses] => 
/home/hossman/lucene/branch_lucene3930/lucene/ivy/ivy-LICENSE-COMPOUND.txt [licenses] => /home/hossman/lucene/branch_lucene3930/lucene/ivy/ivy-LICENSE-FAKE.txt [licenses] Scanned 1 JAR file(s) for licenses (in 0.31s.), 1 error(s). BUILD FAILED /home/hossman/lucene/branch_lucene3930/build.xml:42: The following error occurred while executing this line: /home/hossman/lucene/branch_lucene3930/lucene/build.xml:178: The following error occurred while executing this line: /home/hossman/lucene/branch_lucene3930/lucene/tools/custom-tasks.xml:22: License check failed. Check the logs. Total time: 5 seconds {noformat} > nuke jars from source tree and use ivy
[jira] [Commented] (SOLR-2724) Deprecate defaultSearchField and defaultOperator defined in schema.xml
[ https://issues.apache.org/jira/browse/SOLR-2724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240847#comment-13240847 ] Hoss Man commented on SOLR-2724: Whatever happens with this issue, please note SOLR-3292 and commit r1306642 that was needed on the 3x branch to keep all aspects of the example working. (i didn't make changes on trunk for SOLR-3292 since miller had already reverted SOLR-2724 there, SOLR-3292's changes should be included if/when SOLR-2724 is re-applied) > Deprecate defaultSearchField and defaultOperator defined in schema.xml > -- > > Key: SOLR-2724 > URL: https://issues.apache.org/jira/browse/SOLR-2724 > Project: Solr > Issue Type: Improvement > Components: Schema and Analysis, search >Reporter: David Smiley >Assignee: David Smiley >Priority: Minor > Fix For: 3.6, 4.0 > > Attachments: > SOLR-2724_deprecateDefaultSearchField_and_defaultOperator.patch > > Original Estimate: 2h > Remaining Estimate: 2h > > I've always been surprised to see the defaultSearchField element and > defaultOperator defined in the schema.xml file since > the first time I saw them. They just seem out of place to me since they are > more query parser related than schema related. But not only are they > misplaced, I feel they shouldn't exist. For query parsers, we already have a > "df" parameter that works just fine, and explicit field references. And the > default lucene query operator should stay at OR -- if a particular query > wants different behavior then use q.op or simply use "OR". > These seem like something better placed in solrconfig.xml than in the > schema. > In my opinion, defaultSearchField and defaultOperator configuration elements > should be deprecated in Solr 3.x and removed in Solr 4, and any such defaults > should move to solrconfig.xml. I am willing to do it, provided there is > consensus on it of course.
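The "df" parameter mentioned in SOLR-2724 above can be set per request handler in solrconfig.xml, which is the replacement the issue argues for. A minimal sketch (handler name, field name, and operator value are examples only, not the stock config):

```xml
<!-- Sketch: per-handler defaults in solrconfig.xml replacing the schema.xml
     <defaultSearchField> element and defaultOperator attribute. -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- default field for queries that don't name one explicitly -->
    <str name="df">text</str>
    <!-- default boolean operator for the lucene query parser -->
    <str name="q.op">OR</str>
  </lst>
</requestHandler>
```

Clients can still override either value per request (e.g. df=title or q.op=AND as query parameters), which is the flexibility the schema-level defaults lacked.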
[jira] [Commented] (SOLR-3287) 3x tutorial tries to demo schema features that don't work with 3x schema
[ https://issues.apache.org/jira/browse/SOLR-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240092#comment-13240092 ] Hoss Man commented on SOLR-3287: I don't have a "great" suggestion for dealing with this. Fundamentally it comes down to a conflict between trying to make the field types used by the example fields general and generic enough to be useful for any language so people can re-use them, vs having fields in the example that let us show off some features that aren't necessarily things all users will want in all of their text fields if they copy the schema. we could use copyField to create "_en" versions of all these fields, but this type of solution has also led to confusion/problems in the past, with people leaving those copyFields in the schema.xml when they copy it, and winding up with indexes that are twice as big as they need to be. My best suggestions are: * For the search links in #1: ** leave the verbiage as is, but maybe put this line in bold: *Go ahead and edit the schema.xml under the solr/example/solr/conf directory, and change the type for fields text and features from text_general to text_en_splitting* ... i would also suggest changing it to: *Go ahead and edit the schema.xml under the solr/example/solr/conf directory to use type="text_en_splitting" for the fields "text" and "features"* ** include a box showing an example of what the declarations will look like in XML if the user makes these changes ** i think we should also change the example queries so they aren't actually links -- just show the query syntax. 
my thinking being that this will act as a mental cue that these are examples of valid queries, but they don't work "out of the box" * For the analysis.jsp link in #2: i think we should switch from using the "name=name" and "name=text" params to using "type=text_en" (with a tweak in verbiage to make it clear what the URLs are showing) so these work even if the user doesn't edit the schema. Anyone have any better ideas? > 3x tutorial tries to demo schema features that don't work with 3x schema > > > Key: SOLR-3287 > URL: https://issues.apache.org/jira/browse/SOLR-3287 > Project: Solr > Issue Type: Bug >Reporter: Hoss Man >Priority: Blocker > Fix For: 3.6 > > > I just audited the tutorial on the 3x branch to ensure everything would work > for the 3.6 release, and ran into two sections where things were very > confusing and seemed broken to me (even as a solr expert) > https://svn.apache.org/repos/asf/lucene/dev/branches/branch_3x/solr/core/src/java/doc-files/tutorial.html > 1) "Text Analysis": of the 5 queries in this section, only the "pixima" > example works (power-shot matches documents but not the ones the tutorial > suggests it should, and for different reasons). The lead in para does > explain that you have to edit your schema.xml in order for these links to > work -- but it's confusing, and i honestly read it 3 times before i realized > what it was saying (the first two times i thought it was saying that > _because_ the content is in english, english specific field types are used, > and you can change those to text_general if you don't use english) > Bottom line: the links are confusing since they don't work "out of the box" > with the simple commands shown so far > {panel} > If you know your textual content is English, as is the case for the example > documents in this tutorial, and you'd like to apply English-specific stemming > and stop word removal, as well as split compound words, you can use the > text_en_splitting fieldType instead.
Go ahead and edit the schema.xml under > the solr/example/solr/conf directory, and change the type for fields text and > features from text_general to text_en_splitting. Restart the server and then > re-post all of the documents, and then these queries will show the > English-specific transformations: > * A search for power-shot matches PowerShot, and adata matches A-DATA due to > the use of WordDelimiterFilter and LowerCaseFilter. > * A search for features:recharging matches Rechargeable due to stemming with > the EnglishPorterFilter. > * A search for "1 gigabyte" matches things with GB, and the misspelled pixima > matches Pixma due to use of a SynonymFilter. > {panel} > * http://localhost:8983/solr/select/?indent=on&q=power-shot&fl=name > * http://localhost:8983/solr/select/?indent=on&q=adata&fl=name > * > http://localhost:8983/solr/select/?indent=on&q=features:recharging&fl=name,features > * http://localhost:8983/solr/select/?indent=on&q=%221%20gigabyte%22&fl=name > * http://localhost:8983/solr/select/?indent=on&q=pixima&fl=name > 2) "Analysis Debugging" > Likewise, all of the analysis.jsp example URLs attempt to sh
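The "box showing an example of what the declarations will look like in XML" that Hoss suggests might look something like the following. The field names and types come from the tutorial text; the other attributes are assumptions about the 3.6 example schema, not copied from it:

```xml
<!-- Before: the generic, language-neutral types shipped in the example schema -->
<field name="text"     type="text_general"      indexed="true" stored="false" multiValued="true"/>
<field name="features" type="text_general"      indexed="true" stored="true"  multiValued="true"/>

<!-- After: the English-specific analysis the tutorial asks the reader to switch to -->
<field name="text"     type="text_en_splitting" indexed="true" stored="false" multiValued="true"/>
<field name="features" type="text_en_splitting" indexed="true" stored="true"  multiValued="true"/>
```

After making this change the tutorial says to restart the server and re-post the example documents so the new analysis takes effect.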
[jira] [Commented] (SOLR-435) QParser must validate existence/absence of "q" parameter
[ https://issues.apache.org/jira/browse/SOLR-435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239605#comment-13239605 ] Hoss Man commented on SOLR-435: --- bq. Why not simply apply the SOLR-2001 patch for consistent behavior? good question ... if you're cool with that then it seems okay to me (although off the top of my head i think when i was looking at trunk one of those 4 "main" parsers still needed "fixed" to return null). in general my suggestion for 3.6 was largely based on the fact that there was still active discussion about what the best long term behavior was, which might contradict what was discussed in SOLR-2001, so better to play it safe and just clean up the error reporting: "I'd rather leave things the way they are than make a bad decision in a hurry" if you want to backport SOLR-2001 and sanity check that lucene/dismax/edismax/lucenePlusSort all return null on null/blank query strings i'm +1 to that (that seems consistent with what ryan/male/me were advocating here, so as long as you're on board i think we're good) > QParser must validate existence/absence of "q" parameter > > > Key: SOLR-435 > URL: https://issues.apache.org/jira/browse/SOLR-435 > Project: Solr > Issue Type: Bug > Components: search >Affects Versions: 1.3 >Reporter: Ryan McKinley >Assignee: David Smiley > Fix For: 3.6, 4.0 > > Attachments: SOLR-435.patch, SOLR-435_3x_consistent_errors.patch, > SOLR-435_q_defaults_to_all-docs.patch > > > Each QParser should check if "q" exists or not. For some it will be required > others not.
> currently it throws a null pointer: > {code} > java.lang.NullPointerException > at org.apache.solr.common.util.StrUtils.splitSmart(StrUtils.java:36) > at > org.apache.solr.search.OldLuceneQParser.parse(LuceneQParserPlugin.java:104) > at org.apache.solr.search.QParser.getQuery(QParser.java:80) > at > org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:67) > at > org.apache.solr.handler.SearchHandler.handleRequestBody(SearchHandler.java:150) > ... > {code} > see: > http://www.nabble.com/query-parsing-error-to14124285.html#a14140108
[jira] [Commented] (SOLR-435) QParser must validate existence/absence of "q" parameter
[ https://issues.apache.org/jira/browse/SOLR-435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239087#comment-13239087 ] Hoss Man commented on SOLR-435: --- bq. I agree but I also think we should commit the improved error message suggested by David so that we avoid the unhelpful NPE. Any broader changes will be in 4.0 so we don't have a backwards compat problem. Grrr... yes, i see ... SOLR-2001 is only on trunk, somehow i overlooked that and it contributed to my confusion as to some of the comments in this issue. So instead of NPEs or what not that you get in 3.5 from various parsers, we switch to a consistent 'new ParseException("missing query string");' in 3.6, and address if there can be better default handling in 4.0 (continuing what SOLR-2001 started) +1 > QParser must validate existence/absence of "q" parameter > > > Key: SOLR-435 > URL: https://issues.apache.org/jira/browse/SOLR-435 > Project: Solr > Issue Type: Bug > Components: search >Affects Versions: 1.3 >Reporter: Ryan McKinley >Assignee: David Smiley > Fix For: 3.6, 4.0 > > Attachments: SOLR-435.patch, SOLR-435_q_defaults_to_all-docs.patch > > > Each QParser should check if "q" exists or not. For some it will be required > others not. > currently it throws a null pointer: > {code} > java.lang.NullPointerException > at org.apache.solr.common.util.StrUtils.splitSmart(StrUtils.java:36) > at > org.apache.solr.search.OldLuceneQParser.parse(LuceneQParserPlugin.java:104) > at org.apache.solr.search.QParser.getQuery(QParser.java:80) > at > org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:67) > at > org.apache.solr.handler.SearchHandler.handleRequestBody(SearchHandler.java:150) > ... > {code} > see: > http://www.nabble.com/query-parsing-error-to14124285.html#a14140108
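The consistent check being agreed on here can be sketched as a plain-Java stand-in. The class name and the use of IllegalArgumentException are illustrative only (Solr's actual parsers throw their own ParseException); the point is the null/blank guard at the top of parse():

```java
// Minimal stand-in for the 3.6 behavior under discussion: a parser's
// parse() method rejects a null or all-whitespace query string with a
// clear message, instead of letting a NullPointerException escape later.
class SketchQParser {
    private final String qstr;

    SketchQParser(String qstr) {
        this.qstr = qstr;
    }

    String parse() {
        if (qstr == null || qstr.trim().isEmpty()) {
            // a clear, consistent error in place of the old NPE
            throw new IllegalArgumentException("missing query string");
        }
        return qstr; // a real parser would build a Query here
    }
}
```

The same guard would go into each of the "main" parsers (lucene/dismax/edismax/lucenePlusSort) so they fail identically.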
[jira] [Commented] (SOLR-435) QParser must validate existence/absence of "q" parameter
[ https://issues.apache.org/jira/browse/SOLR-435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239023#comment-13239023 ] Hoss Man commented on SOLR-435: --- bq. Again I agree. But I'm just not sure if that validation / error checking should involve checking alternative parameters. That feels like it's defeating the goal of QParsers working in all situations. Not sure i see the problem, ... part of the advantage in how q.alt is implemented now is that you can put things like... {noformat} {!dismax q.alt=*:* v=$keywords} {noformat} ...into "appends" params in your solrconfig. by default nothing is filtered, but if the client provides a "keywords" param then it's used. bq. I just also wonder whether down the line we want better error messages here too. David's suggestion for "missing query string" aligns with other such messages It wouldn't have to ... parse() can throw ParseExceptions and QueryComponent (or whatever delegated to the QParser) can wrap that in a user error (QueryComponent already does that) > QParser must validate existence/absence of "q" parameter > > > Key: SOLR-435 > URL: https://issues.apache.org/jira/browse/SOLR-435 > Project: Solr > Issue Type: Bug > Components: search >Affects Versions: 1.3 >Reporter: Ryan McKinley >Assignee: David Smiley > Fix For: 3.6, 4.0 > > Attachments: SOLR-435_q_defaults_to_all-docs.patch > > > Each QParser should check if "q" exists or not. For some it will be required > others not. > currently it throws a null pointer: > {code} > java.lang.NullPointerException > at org.apache.solr.common.util.StrUtils.splitSmart(StrUtils.java:36) > at > org.apache.solr.search.OldLuceneQParser.parse(LuceneQParserPlugin.java:104) > at org.apache.solr.search.QParser.getQuery(QParser.java:80) > at > org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:67) > at > org.apache.solr.handler.SearchHandler.handleRequestBody(SearchHandler.java:150) > ...
> {code} > see: > http://www.nabble.com/query-parsing-error-to14124285.html#a14140108
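The "appends" trick Hoss describes might look like this in solrconfig.xml (the handler name and the choice of the fq param are assumptions for illustration; the local-params string itself is the one from his comment):

```xml
<requestHandler name="/search" class="solr.SearchHandler">
  <lst name="appends">
    <!-- with q.alt=*:* nothing is filtered by default; if the client
         sends a "keywords" param, it is applied as a dismax filter query -->
    <str name="fq">{!dismax q.alt=*:* v=$keywords}</str>
  </lst>
</requestHandler>
```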
[jira] [Commented] (SOLR-3268) remove write access to source tree (chmod 555) when running tests in jenkins
[ https://issues.apache.org/jira/browse/SOLR-3268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238710#comment-13238710 ] Hoss Man commented on SOLR-3268: i think you're talking about SOLR-3112 which was dealt with... but even if there are others, we can start by adding this check now, and then file issues to fix & remove whatever is left. this isn't a silver bullet, it's certainly not as good as actually locking down src to fail on writes, but it will at least force people to be aware if/when they add a test that pollutes src/ > remove write access to source tree (chmod 555) when running tests in jenkins > --- > > Key: SOLR-3268 > URL: https://issues.apache.org/jira/browse/SOLR-3268 > Project: Solr > Issue Type: Bug >Reporter: Robert Muir >Assignee: Robert Muir > Fix For: 4.0 > > Attachments: SOLR-3268_sync.patch > > > Some tests are currently creating files under the source tree. > This causes a lot of problems, it makes my checkout look dirty after running > 'ant test' and i have to cleanup. > I opened an issue for this a month and a half ago for > solrj/src/test-files/solrj/solr/shared/test-solr.xml (SOLR-3112), > but now we have a second file > (core/src/test-files/solr/conf/elevate-data-distrib.xml). > So I think hudson needs to chmod these src directories to 555, so that solr > tests that do this will fail.
[jira] [Commented] (SOLR-3268) remove write access to source tree (chmod 555) when running tests in jenkins
[ https://issues.apache.org/jira/browse/SOLR-3268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238685#comment-13238685 ] Hoss Man commented on SOLR-3268: if locking down src/ so tests can't make changes is tricky to do safely, perhaps we could do something simpler to get us part way towards the ultimate goal? ... add a final step to the jenkins build script that fails if "svn status | wc -l" returns non-zero? it won't ensure no changes are made to src/, but it should ensure no changes are made to src/ unless explicitly allowed by an svn:ignore ... then we just have to (remove any existing svn:ignore under /src and) make sure we publicly shame anyone who adds svn:ignores to src/ because they wrote a sloppy test. > remove write access to source tree (chmod 555) when running tests in jenkins > --- > > Key: SOLR-3268 > URL: https://issues.apache.org/jira/browse/SOLR-3268 > Project: Solr > Issue Type: Bug >Reporter: Robert Muir >Assignee: Robert Muir > Fix For: 4.0 > > Attachments: SOLR-3268_sync.patch > > > Some tests are currently creating files under the source tree. > This causes a lot of problems, it makes my checkout look dirty after running > 'ant test' and i have to cleanup. > I opened an issue for this a month and a half ago for > solrj/src/test-files/solrj/solr/shared/test-solr.xml (SOLR-3112), > but now we have a second file > (core/src/test-files/solr/conf/elevate-data-distrib.xml). > So I think hudson needs to chmod these src directories to 555, so that solr > tests that do this will fail.
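The simpler Jenkins check proposed above could be sketched as a small shell function. This is a stand-in, not the actual build script: the real final build step would feed it the output of `svn status` and fail the build on a non-zero return:

```shell
# Fails (returns 1) when the supplied `svn status` output is non-empty,
# i.e. when tests left modified or unversioned files under the checkout.
check_clean() {
  status_output="$1"
  if [ -n "$status_output" ]; then
    echo "build failed: working copy is not clean:"
    printf '%s\n' "$status_output"
    return 1
  fi
  return 0
}

# In a Jenkins build script this would be invoked roughly as:
#   check_clean "$(svn status)" || exit 1
```

As Hoss notes, this only catches changes not masked by an svn:ignore, so it is weaker than actually making src/ read-only, but it needs no chmod gymnastics.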
[jira] [Commented] (SOLR-435) QParser must validate existence/absence of "q" parameter
[ https://issues.apache.org/jira/browse/SOLR-435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238538#comment-13238538 ] Hoss Man commented on SOLR-435: --- bq. If the purpose of the QueryComponent is to be QParser agnostic and consequently unable to know if the 'q' parameter is even relevant, shouldn't it be up to the QParser to retrieve what it believes the query string to be from the request parameters? Sorry ... i chose my words carelessly and wound up saying almost the exact opposite of what i meant. What i should have said... * QueryComponent is responsible for determining the QParser to use for the main query and passing the value of the q query-string param to the QParser.getParser(...) method * QParser.getParser passes that query-string on to whatever QParserPlugin was selected as the "qstr" param to the createParser * The QParser that gets created by the createParser call should do whatever validation it needs to do (including a null check) in its parse() method In answer to your questions... * QueryComponent can not do any validation of the q param, because it can't make any assumptions about what values are legal for the defType QParser -- not even a null check, because in the case of things like dismax null is perfectly fine * QParsers (and QParserPlugins) can't be made responsible for fetching the q param because they don't know if/when they are being used to parse the main query param, vs fq params, vs some other nested subquery * by putting this kind of validation/error checking in the QParser.parse method, we ensure that it is used properly even when the QParser(s) are used for things like 'fq' params or in nested subqueries bq. Hoss: I don't agree with your reasoning on the developer-user typo-ing the 'q' parameter.
If you mistype basically any parameter then clearly it is as if you didn't even specify that parameter and you get the default behavior of the parameter you were trying to type correctly but didn't. understood ... but in most other situations the "default" behavior is either "do nothing" or "error" ... we don't have a lot of default behaviors which are "give me tons of stuff" ... if you use {{facet=true&faceet.field=foo}} (note the extra character) you don't silently get faceting on every field as a default -- you get no field faceting at all. if you mistype the q param name and get an error on your first attempt you immediately understand you did something wrong. likewise if we made the default a "matches nothing" query, then you'd get no results and (hopefully) be suspicious enough to realize you made a mistake -- but if we give you a bunch of results by default you may not realize at all that you're looking at all results not just the results of what you thought the query was. the only situations i can think of where forgetting or mistyping a param name doesn't default to error or nothing are things with fixed expectations: start, rows, fl, etc... Those have defaults that (if they don't match what you tried to specify) are immediately obvious ... the 'start' attribute on the docList returned is wrong, you get more results than you expected, you get field names you know you didn't specify, etc... it's less obvious when you are looking at the results of a query that it's a match-all query instead of the query you thought you were specifying. like i said ... i'm -0 to having a hardcoded default query for lucene/dismax/edismax ... if you feel strongly about it that's fine, although i would try to convince you "match none" is a better hardcoded default than 'match all' (so that it's easier to recognize mistakes quickly) and really don't think we should do it w/o also adding q.alt support to the LuceneQParser so people can override it.
> QParser must validate existence/absence of "q" parameter > > > Key: SOLR-435 > URL: https://issues.apache.org/jira/browse/SOLR-435 > Project: Solr > Issue Type: Bug > Components: search >Affects Versions: 1.3 >Reporter: Ryan McKinley >Assignee: Ryan McKinley > Fix For: 3.6, 4.0 > > Attachments: SOLR-435_q_defaults_to_all-docs.patch > > > Each QParser should check if "q" exists or not. For some it will be required > others not. > currently it throws a null pointer: > {code} > java.lang.NullPointerException > at org.apache.solr.common.util.StrUtils.splitSmart(StrUtils.java:36) > at > org.apache.solr.search.OldLuceneQParser.parse(LuceneQParserPlugin.java:104) > at org.apache.solr.search.QParser.getQuery(QParser.java:80) > at > org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:67) > at > org.apache.solr.handler.SearchHandler.handleRequestBody(SearchHandler.java:150) >
[jira] [Commented] (SOLR-435) QParser must validate existence/absence of "q" parameter
[ https://issues.apache.org/jira/browse/SOLR-435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13237366#comment-13237366 ] Hoss Man commented on SOLR-435: --- bq. if no query string is supplied, or if it's blank or just whitespace, then the default is to match all documents. -0 ... the risk with this approach is that (new) users who make typos in queries or are misinformed about the name of the "q" param (ie: {{/search?Q=foo}} or {{/search?query=foo}}) will be really confused when the query they specify is completely ignored w/o explanation and all docs are returned in its place. I think it's much better to throw a clear error "q param is not specified" but i certainly see the value in adding q.alt support to the LuceneQParser with the same semantics as dismax (used if q is missing or all whitespace) ... not sure why we've never considered that before. (obviously it wouldn't make sense for all QParsers, like "field" or "term" since all-whitespace and/or empty strings are totally valid input for them) bq. I could have modified QueryComponent, or just QParser, or just the actual QParser subclasses. A universal choice couldn't be made for all qparsers... we definitely shouldn't modify QueryComponent ... the entire point of the issue is that QueryComponent can't attempt to validate the q param, because it doesn't know if/when the defType QParser requires it to exist -- the individual QParsers all need to throw clear errors if they require it and it's not specified, that's really the whole reason this issue was opened in the first place > QParser must validate existence/absence of "q" parameter > > > Key: SOLR-435 > URL: https://issues.apache.org/jira/browse/SOLR-435 > Project: Solr > Issue Type: Bug > Components: search >Affects Versions: 1.3 >Reporter: Ryan McKinley >Assignee: Ryan McKinley > Fix For: 3.6, 4.0 > > Attachments: SOLR-435_q_defaults_to_all-docs.patch > > > Each QParser should check if "q" exists or not.
For some it will be required > others not. > currently it throws a null pointer: > {code} > java.lang.NullPointerException > at org.apache.solr.common.util.StrUtils.splitSmart(StrUtils.java:36) > at > org.apache.solr.search.OldLuceneQParser.parse(LuceneQParserPlugin.java:104) > at org.apache.solr.search.QParser.getQuery(QParser.java:80) > at > org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:67) > at > org.apache.solr.handler.SearchHandler.handleRequestBody(SearchHandler.java:150) > ... > {code} > see: > http://www.nabble.com/query-parsing-error-to14124285.html#a14140108
[jira] [Commented] (SOLR-3255) OpenExchangeRates.Org Exchange Rate Provider
[ https://issues.apache.org/jira/browse/SOLR-3255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13236260#comment-13236260 ] Hoss Man commented on SOLR-3255: looks pretty cool, but that listAvailableCurrencies smells kind of fishy in general, and with this patch smells even fishier (depending on the arg, it either returns a list of string codes, or a list of string code permutations with a comma separator?) If we're seeing now, with multiple Provider impls, that the API doesn't make sense -- we should bite the bullet and change it before it's public. perhaps two methods: listAvailableCurrencies() that returns a Set and listAvailableConversions() that returns a Map? > OpenExchangeRates.Org Exchange Rate Provider > > > Key: SOLR-3255 > URL: https://issues.apache.org/jira/browse/SOLR-3255 > Project: Solr > Issue Type: New Feature > Components: Schema and Analysis >Affects Versions: 3.6, 4.0 >Reporter: Jan Høydahl >Assignee: Jan Høydahl > Labels: CurrencyField > Fix For: 3.6, 4.0 > > Attachments: SOLR-3255.patch, SOLR-3255.patch, SOLR-3255.patch > > > An exchange rate provider for CurrencyField using the freely available feed > from http://openexchangerates.org/
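The two-method split Hoss proposes could look like the following interface. The method names are his; the generic type parameters are guesses (any type arguments in the original comment were lost in the mail formatting), and the interface name is a stand-in, not Solr's actual provider API:

```java
import java.util.Map;
import java.util.Set;

// Hypothetical split of the provider API discussed above: one method for
// the plain currency codes, one for the supported conversion pairs.
interface RateProviderSketch {
    /** Currency codes this provider knows about, e.g. "USD", "EUR". */
    Set<String> listAvailableCurrencies();

    /** For each source currency, the target currencies it can convert to. */
    Map<String, Set<String>> listAvailableConversions();
}
```

Separating the two avoids overloading one method to return either codes or comma-joined code pairs depending on its argument.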
[jira] [Commented] (SOLR-3127) Dismax to honor the KeywordTokenizerFactory when querying with multi word strings
[ https://issues.apache.org/jira/browse/SOLR-3127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235940#comment-13235940 ] Hoss Man commented on SOLR-3127: whitespace is a significant meta character to dismax (and for that matter, the main lucene QueryParser as well) ... it indicates the separation between optional clauses. the query parsing structure is independent of the analyzer used, so the fact that a KeywordTokenizerFactory is used on the field in question is irrelevant, you might have another qf that doesn't have KeywordTokenizerFactory so even if dismax tried to guess that it should treat the entire input as all one string, it couldn't do that for other fields. if you want your entire input to be treated as a literal, without treating whitespace as a meta-character, it needs to be quoted, or consider using an alternative parser (ie: the "field" QParser is designed for this type of "i want to query a single field for a specific value" situation). > Dismax to honor the KeywordTokenizerFactory when querying with multi word > strings > - > > Key: SOLR-3127 > URL: https://issues.apache.org/jira/browse/SOLR-3127 > Project: Solr > Issue Type: Improvement > Components: query parsers >Affects Versions: 3.5 >Reporter: Zac Smith >Priority: Minor > Labels: dismax > > When using the KeywordTokenizerFactory with a multi word search string, the > dismax query created is not very useful. Although the query analyzer doesn't > tokenize the search input, each word of the input is included in the search. > e.g.
if searching for 'chicken stock' the dismax query created would be: > +(DisjunctionMaxQuery((ingredient_synonyms:chicken^0.6)~0.01) > DisjunctionMaxQuery((ingredient_synonyms:stock^0.6)~0.01)) > DisjunctionMaxQuery((ingredient_synonyms:chicken stock^0.6)~0.01) > Note that although the query analyzer does not tokenize the term 'chicken > stock' into 'chicken' and 'stock', they are still included and required in > the search term. > I think the query created should be just: > DisjunctionMaxQuery((ingredient_synonyms:chicken stock)~0.01) > (or at least not have the individual terms as should match, not must match, so > you could configure with MM.) > Example field type (element markup lost in the mail formatting): > positionIncrementGap="100" autoGeneratePhraseQueries="false"
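The "field" QParser alternative Hoss mentions would look roughly like this, using the field name from the report (URL pattern follows the example queries elsewhere in this digest):

```
http://localhost:8983/solr/select?q={!field f=ingredient_synonyms}chicken stock
```

Everything after the closing `}` is handed to the field's analyzer as a single value, so whitespace is no longer treated as a clause separator.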
[jira] [Commented] (SOLR-3259) Solr 4 aesthetics
[ https://issues.apache.org/jira/browse/SOLR-3259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13234012#comment-13234012 ] Hoss Man commented on SOLR-3259: bq. the concept of an "example" server that you must configure yourself has become less than ideal... perhaps we should just create a "server" directory (but leave things like exampledocs under example) bq. Would also be nice to remove the "multicore" directory in there (since the normal server is already multi-core enabled). Of course if we moved just the essential parts to "server", then "multicore", "example-DIH" and "exampledocs" would all be left behind in "example", as they should be. once upon a time, i argued heavily that we should rename "example" to "examples" and have many more of them ("minimal", "tech-products-from-the-90s", "books", "kitchen-sink", "multicore", "dih", etc...) .. and the only reason i never pushed harder for this was because that kind of directory structure would have made running the "main" example (whatever we might have called it) much harder than "java -jar start.jar" because it would have required specifying solr.solr.home. Thinking about it some more now that you've brought this issue up, it occurs to me that in the intervening time, multicore solr setups have become the norm, not the exception ... so i'm now much more on board with the idea of having a single example setup -- even calling it "server" if people think that's a good idea -- provided we move all of the various examples we have into that example setup as multiple cores. they don't have to all be enabled, they don't even have to be commented out in solr.xml, it would just be nice if they lived in the same directory and could easily be added as new cores with simple CREATE commands and relative paths.
So by default if you did "cd server && java -jar start.jar" you might only get one collection (or maybe two if we also want to show off a "minimal" collection) but the docs for features like DIH might say "to see an example of this, hit the following URL to create a new collection using the DIH configs: http://localhost...CREATE"; bq. If anything I'd vote for making the distro closer to what people would want in production. You could then have a pure "solr/jetty" folder with ONLY jetty, a "solr/example-home" folder which holds todays "example/solr" ... start-solr.[cmd|sh], which copies the war from dist to jetty/webapps, sets -Dsolr.solr.home and starts Jetty while i like the idea of creating clearer/cleaner separation between what files are "jetty" and what files are "solr" and what files are "config" i'm not a fan of your specific suggestions here because it moves away from the really clean simplicity of "cd something && java -jar start.jar" -- which make it *really* trivial for anyone anywhere to try solr out regardless of their os, or what weird eccentricities are in their shell, or whether the file perms on some scripts are correct, etc... if we want to start shipping init.d scripts and what not for "production usage" (something we typically avoided in the past because there are so many different ways people like to do these things, not to mention that many people like to use tomcat or some other servlet container) then that seems like it could/should really be distinct from how people run the example for the tutorial ... it shouldn't be completely orthogonal, we should be able to say something like: "if you copy this ./server/ directory to some place on your production server, this directory of ./service-tools/ can be used to automatically start/stop when your machine comes up or down, just configure the path where you copied ./server/" ... 
but people shouldn't have to use some start.sh just to try out the tutorial, given how easy "java -jar start.jar" is today. > Solr 4 aesthetics > - > > Key: SOLR-3259 > URL: https://issues.apache.org/jira/browse/SOLR-3259 > Project: Solr > Issue Type: New Feature >Reporter: Yonik Seeley > Fix For: 4.0 > > > Solr 4 will be a huge new release... we should take this opportunity to > improve the out-of-the-box experience.
[jira] [Commented] (SOLR-3207) Add field name validation
[ https://issues.apache.org/jira/browse/SOLR-3207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230689#comment-13230689 ] Hoss Man commented on SOLR-3207: the giant elephant in the room that doesn't seem to have been discussed is that trying to validate that field names meet some strict criteria when loading schema.xml doesn't really address dynamic fields -- the patch ensures that configurations have names which are validated, but i don't see anything that would consider the actual field names people use with those dynamic fields -- ie: "*_i" might be a perfectly valid dynamicField at startup, but that startup validation isn't going to help me if i index a document containing the field name "{{$ - foo_i}}" In general, i'm opposed to the idea of "locking down" what field names can be used across the board. My preference would be to let people use any field name their heart desires, but provide better documentation on what field name restrictions exist on which features (ie: "using a field name in a function requires that the field name match ValidatorX; a field name in fl can only be used if it conforms to ValidatorX and ValidatorY; etc..."). If we want to provide automated "validation" of these things for people, then let's make it part of the LukeRequestHandler: for any field name returned by the LukeRequestHandler, let's include a warnings section advising them which validation rules that field name doesn't match, and what features depend on that validation rule -- this info could then easily be exposed in the admin UI. We could also provide an optional UpdateProcessor people could configure with a list of individual Validators which could reject any document containing a field that didn't match the validator (or optionally: translate the field name to something that did conform) to help people enforce these even on dynamic fields.
So by default: any field is allowed, but if I create one with a funky name (either explicitly or as a result of loading data using a dynamicField) the admin UI starts warning me that feature XYZ won't work with fields A, B, C; and if I want to ensure feature D works with all of my fields I add an update processor to ensure it. > Add field name validation > - > > Key: SOLR-3207 > URL: https://issues.apache.org/jira/browse/SOLR-3207 > Project: Solr > Issue Type: Improvement >Affects Versions: 4.0 >Reporter: Luca Cavanna > Fix For: 4.0 > > Attachments: SOLR-3207.patch > > > Given the SOLR-2444 updated fl syntax and the SOLR-2719 regression, it would > be useful to add some kind of validation regarding the field names you can > use on Solr. > The objective would be adding consistency, allowing only field names that you > can then use within fl, sorting etc. > The rules, taken from the actual StrParser behaviour, seem to be the > following: > - same used for java identifiers (Character#isJavaIdentifierPart), plus the > use of trailing '.' and '-' > - for the first character the rule is Character#isJavaIdentifierStart minus > '$' (The dash can't be used as first character (SOLR-3191) for example) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
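The rules quoted in the issue description can be expressed as a small standalone check. This is only a sketch of the described StrParser behaviour (not Solr's actual validation code, and the class name is made up): the first character must pass Character.isJavaIdentifierStart and must not be '$'; subsequent characters must pass Character.isJavaIdentifierPart, with '.' and '-' also allowed.

```java
// Hypothetical sketch of the field-name rules described in SOLR-3207;
// not Solr's real implementation.
public class FieldNameValidator {
    public static boolean isValid(String name) {
        if (name == null || name.isEmpty()) return false;
        // First char: isJavaIdentifierStart minus '$' (so '-' is rejected, SOLR-3191)
        char first = name.charAt(0);
        if (first == '$' || !Character.isJavaIdentifierStart(first)) return false;
        // Remaining chars: isJavaIdentifierPart, plus '.' and '-'
        for (int i = 1; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c != '.' && c != '-' && !Character.isJavaIdentifierPart(c)) return false;
        }
        return true;
    }
}
```

Under these rules "foo_i" passes, while "$ - foo_i" (the dynamic-field example above) fails on its first character, illustrating why startup-time validation alone can't cover dynamic fields.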
[jira] [Commented] (SOLR-3251) dynamically add field to schema
[ https://issues.apache.org/jira/browse/SOLR-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230659#comment-13230659 ] Hoss Man commented on SOLR-3251: a low level implementation detail I would worry about is "snapshotting" the schema for the duration of a single request .. I suspect there are more than a few places in solr that would generate weird exceptions if multiple calls to "req.getSchema().getFields()" returned different things in the middle of processing a single request. > dynamically add field to schema > --- > > Key: SOLR-3251 > URL: https://issues.apache.org/jira/browse/SOLR-3251 > Project: Solr > Issue Type: New Feature >Reporter: Yonik Seeley > Attachments: SOLR-3251.patch > > > One related piece of functionality needed for SOLR-3250 is the ability to > dynamically add a field to the schema.
[jira] [Commented] (SOLR-3232) Admin UI: query form should have a menu to pick a request handler
[ https://issues.apache.org/jira/browse/SOLR-3232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230656#comment-13230656 ] Hoss Man commented on SOLR-3232: bq. Anyway, I'll let it go, and just roll my eyes at all the hacks and duplication that building an entirely Ajax UI using pure JSON responses entails. I've said it before and I'll say it again: the single most important reason why I think the javascript based Admin UI is a great idea is because it *forces* us to make sure all of the info needed to build the admin UI is available via HTTP using request handlers and what not -- ensuring that we think about how other clients can programmatically access the same information. The old JSPs and the velocity engine generated pages had too much direct access to internals, making it too easy to overlook when external clients didn't have access to useful data. bq. How about SolrInfoMBeanHandler.java adds a simple "searchHandler" attribute with true/false? I think it would be just as easy and far more generally useful to add a request param to SolrInfoMBeanHandler that would let you filter the objects by what classes they are an instance of, just like it can filter by "cat" and "key" right now (ie: "/admin/mbeans?class=solr.SearchHandler"). -- As far as this issue in general: I think it's a good idea to add a pulldown to make it more friendly to folks and easier to use in the common case, and populating the pulldown with all the instances of SearchHandler makes a lot of sense, but we should try to use some UI element that will allow people to type in their own handler name if they want (ie: http://jsfiddle.net/6QeXU/3/ but I'm sure there's a cleaner, more efficient way to do it) so we don't annoy people who have their own custom RequestHandlers that don't subclass SearchHandler, or want to use things like MoreLikeThisHandler, etc...)
(Longer term, it would be nice to make querying the AdminHandler return all sorts of useful introspection info about what is currently running to drive the UI screen generation, with optional config on the handler to override things (maybe I don't want to advertise some search handler instance?) along the lines of this brainstorming doc I wrote a long, long time ago: http://wiki.apache.org/solr/MakeSolrMoreSelfService#Request_Handler_Param_Docs) > Admin UI: query form should have a menu to pick a request handler > - > > Key: SOLR-3232 > URL: https://issues.apache.org/jira/browse/SOLR-3232 > Project: Solr > Issue Type: Improvement > Components: web gui >Reporter: David Smiley >Assignee: Stefan Matheis (steffkes) > Fix For: 4.0 > > Attachments: SOLR-3232.patch > > > The query form in the admin UI could use an improvement regarding how the > request handler is chosen; presently all there is is a text input for 'qt'. > The first choice to make in the form above the query should really be the > request handler since it actually handles the request before any other > parameters do anything. It'd be great if it was a dynamically driven menu > defaulting to "/select". Similar to how the DIH page finds DIH request > handlers, this page could find the request handlers with a class of > "SearchHandler". Their names would be added to a list, and if the name > didn't start with a '/' then it would be prefixed with '/select?qt='. > I did something similar (without the menu) to the old 3x UI in a patch to > SOLR-3161 which will hopefully get committed.
[jira] [Commented] (SOLR-3251) dynamically add field to schema
[ https://issues.apache.org/jira/browse/SOLR-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230627#comment-13230627 ] Hoss Man commented on SOLR-3251: bq. Any ideas for an external API? I think the best way to support this externally is using the existing mechanism for plugins... * a RequestHandler people can register (if they want to support external clients programmatically modifying the schema) that accepts ContentStreams containing whatever payload structure makes sense given the functionality. * an UpdateProcessor people can register (if they want to support stuff like SOLR-3250 where clients adding documents can submit any field name and a type is added based on the type of the value) which could be configured with mappings of java types to fieldTypes and rules about other field attributes -- ie "if a client submits a new field=value with a java.lang.Integer value, create a new "tint" field with that name and set stored=true." > dynamically add field to schema > --- > > Key: SOLR-3251 > URL: https://issues.apache.org/jira/browse/SOLR-3251 > Project: Solr > Issue Type: New Feature >Reporter: Yonik Seeley > Attachments: SOLR-3251.patch > > > One related piece of functionality needed for SOLR-3250 is the ability to > dynamically add a field to the schema.
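The UpdateProcessor idea above amounts to a configurable lookup from the Java class of an incoming value to a field type name. A minimal sketch follows; the class name, method names, and the particular mappings (Integer to "tint", etc.) are illustrative assumptions, not an actual Solr API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: pick a Solr fieldType for a new field based on the
// Java type of the submitted value, as a configurable UpdateProcessor might.
public class FieldTypeMapperSketch {
    private final Map<Class<?>, String> mappings = new HashMap<>();
    private final String fallback;

    public FieldTypeMapperSketch(String fallback) {
        this.fallback = fallback;  // fieldType used when no mapping matches
    }

    public void map(Class<?> javaType, String fieldType) {
        mappings.put(javaType, fieldType);
    }

    public String fieldTypeFor(Object value) {
        return mappings.getOrDefault(value.getClass(), fallback);
    }
}
```

With a mapping like Integer to "tint" configured, a document field carrying a java.lang.Integer would get a new "tint" field created under this scheme, with other attributes (stored=true, etc.) driven by similar configured rules.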
[jira] [Commented] (SOLR-3241) Document boost fail if a field copy omit the norms
[ https://issues.apache.org/jira/browse/SOLR-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229298#comment-13229298 ] Hoss Man commented on SOLR-3241: patch looks fine ... I wish there was a way to make it easier for poly fields so they wouldn't have to do the check themselves, but when I tried the idea I had it didn't work, so better to go with this for now and maybe refactor a helper method later. The few changes I would make: 1) make the new tests grab the IndexSchema object and assert that every field (that the test cares about) has the expected omitNorms value -- future proof ourselves against someone neutering the test w/o realizing it by tweaking the test schema because they don't know that there is a specific reason for those omitNorm settings 2) add a test that explicitly verifies the failure case of someone setting field boost on a field with omitNorms==true, assert that we get the expected error message (doesn't look like this was added when LUCENE-3796 was committed, and we want to make sure we don't inadvertently break that error check) > Document boost fail if a field copy omit the norms > -- > > Key: SOLR-3241 > URL: https://issues.apache.org/jira/browse/SOLR-3241 > Project: Solr > Issue Type: Bug >Reporter: Tomás Fernández Löbbe > Fix For: 3.6, 4.0 > > Attachments: SOLR-3241.patch, SOLR-3241.patch, SOLR-3241.patch, > SOLR-3241.patch > > > After https://issues.apache.org/jira/browse/LUCENE-3796, it is not possible > to set a boost to a field that has the "omitNorms" set to true. This is > making Solr's document index-time boost to fail when a field that doesn't > omit norms is copied (with copyField) to a field that does omit them and > document boost is used. For example: > omitNorms="false"/> > omitNorms="true"/> > > I'm attaching a possible fix.
[jira] [Commented] (SOLR-3241) Document boost fail if a field copy omit the norms
[ https://issues.apache.org/jira/browse/SOLR-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228924#comment-13228924 ] Hoss Man commented on SOLR-3241: bq. The reason the logic was somewhat complicated in DocumentBuilder is because, from the lucene indexer its easy to detect this case, but: sure ... but I think it's not actually just "Document Boost", is it? If field "foo" is declared with omitNorms==false, and a client sends a doc with a field "foo" using a fieldBoost then that should be totally fine -- but if the schema says to copyField from foo->bar where bar has omitNorms==true then I think that will currently cause an error from the lucene low level check, correct? (I haven't tried it, I'm going based on Tomas's patch) Likewise if "foo" is a LatLonField (or any other polyfield) and the underlying dynamic field used has omitNorms==true then won't that same low level lucene code throw an error there? So multiple error paths from totally sane usage, none of which has anything to do with doc boost, right? (Truth be told, I didn't even notice the "Document boost" part of the summary, I was just looking at Tomas's patch and skimming the summary) > Document boost fail if a field copy omit the norms > -- > > Key: SOLR-3241 > URL: https://issues.apache.org/jira/browse/SOLR-3241 > Project: Solr > Issue Type: Bug >Reporter: Tomás Fernández Löbbe > Fix For: 3.6, 4.0 > > Attachments: SOLR-3241.patch > > > After https://issues.apache.org/jira/browse/LUCENE-3796, it is not possible > to set a boost to a field that has the "omitNorms" set to true. This is > making Solr's document index-time boost to fail when a field that doesn't > omit norms is copied (with copyField) to a field that does omit them and > document boost is used. For example: > omitNorms="false"/> > omitNorms="true"/> > > I'm attaching a possible fix.
[jira] [Commented] (SOLR-3241) Document boost fail if a field copy omit the norms
[ https://issues.apache.org/jira/browse/SOLR-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228914#comment-13228914 ] Hoss Man commented on SOLR-3241: part of me thinks we should just remove the error checking for {{omitNorms && boost != 1.0F}} from DocumentBuilder.toDocument (added in LUCENE-3796) and just silently ignore any boost on a SolrInputField where omitNorms==true (ie: maybe log a warning, but don't throw an Exception). This would be consistent with the behavior in past releases (except for the warning log if we add that), and wouldn't cause any confusing errors for things like LatLonType (even if they come from third-party plugins we can't control/test). On the other hand... that feels really dirty, and it would be nice to fail fast and loud if the client tries to set a boost on an omitNorms field. Perhaps a better fix would be to leave DocumentBuilder exactly as it is today, and instead change FieldType.createField to (silently) ignore the boost if omitNorms==true for that SchemaField. If I'm thinking about this right, that would mean the error checking of the SolrInputDocument (and all its SolrInputFields) in DocumentBuilder.toDocument would still work as designed -- so you'd get an error if any client or "high level" plugin like an UpdateProcessor tried to use a field boost on an omitNorms field; but any fields added at a lower level (ie: by copyField or a poly field) would silently ignore those boosts if they were copied/cloned to a field where omitNorms==true. > Document boost fail if a field copy omit the norms > -- > > Key: SOLR-3241 > URL: https://issues.apache.org/jira/browse/SOLR-3241 > Project: Solr > Issue Type: Bug >Reporter: Tomás Fernández Löbbe > Fix For: 4.0 > > Attachments: SOLR-3241.patch > > > After https://issues.apache.org/jira/browse/LUCENE-3796, it is not possible > to set a boost to a field that has the "omitNorms" set to true.
This is > making Solr's document index-time boost to fail when a field that doesn't > omit norms is copied (with copyField) to a field that does omit them and > document boost is used. For example: > omitNorms="false"/> > omitNorms="true"/> > > I'm attaching a possible fix.
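The two-level behavior proposed in the comment above (fail loudly on client-supplied boosts at the DocumentBuilder level, silently drop boosts at the FieldType.createField level) can be sketched as follows. The class and method names here are hypothetical stand-ins for Solr's SchemaField, FieldType.createField, and DocumentBuilder.toDocument, not the actual API.

```java
// Illustrative sketch of the proposed split in boost handling; all names
// here are made up, mirroring the design described in the comment above.
public class BoostHandlingSketch {
    static class SimpleField {
        final boolean omitNorms;
        SimpleField(boolean omitNorms) { this.omitNorms = omitNorms; }
    }

    // Low-level path (copyField targets, poly-field subfields): silently
    // ignore the boost when norms are omitted, so internal copies never fail.
    static float effectiveBoost(SimpleField f, float boost) {
        return f.omitNorms ? 1.0f : boost;
    }

    // High-level path (client or UpdateProcessor supplied fields):
    // fail fast and loud, as DocumentBuilder-level checking would.
    static void checkClientBoost(SimpleField f, float boost) {
        if (f.omitNorms && boost != 1.0f) {
            throw new IllegalArgumentException(
                "cannot set index-time boost on a field with omitNorms=true");
        }
    }
}
```

Under this split, a client boost on an omitNorms field is still an error, while a boost carried along by a copyField into an omitNorms target is quietly normalized to 1.0.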
[jira] [Commented] (SOLR-3218) Range faceting support for CurrencyField
[ https://issues.apache.org/jira/browse/SOLR-3218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228644#comment-13228644 ] Hoss Man commented on SOLR-3218: bq. I updated CurrencyValue.toString() to return "3.14,USD" for $3.14 rather than "314,USD". My feeling is that it's more straightforward to return strings that look like the values that were passed in to parse(). That sounds right. The most important thing is that in the response from range faceting, where it gives you a (str) lower bound for a count, that lower bound should be a legal value when building a query against that field (ie: you can use it in a range query) ... I'm pretty sure (if I understand correctly) that for CurrencyField that means "3.14,USD" bq. I worry that relaxing the restriction on the gap may just be confusing without adding any real value. We may want to consider forcing gap to be the same as start and end so that things are more conceptually straightforward. I believe you -- I've got no objection to locking that down, I just want to make sure that if we doc "you can't do this" that: a) the code actually fails if you try; and b) we have a test proving that the code will fail if you try. (And if we decide later that it makes sense, we can relax things and change the test & docs) > Range faceting support for CurrencyField > > > Key: SOLR-3218 > URL: https://issues.apache.org/jira/browse/SOLR-3218 > Project: Solr > Issue Type: Improvement > Components: Schema and Analysis >Reporter: Jan Høydahl > Fix For: 4.0 > > Attachments: SOLR-3218-1.patch, SOLR-3218-2.patch, SOLR-3218.patch, > SOLR-3218.patch > > > Spinoff from SOLR-2202. Need to add range faceting capabilities for > CurrencyField
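The round-trip behavior being agreed on above (parse "3.14,USD", and have toString() produce the same human-readable form rather than "314,USD") can be sketched with a standalone class. This mirrors only the "amount,code" string format under discussion; it is not Solr's actual CurrencyValue implementation, and the class name is made up.

```java
// Standalone sketch of the "amount,code" round trip discussed above;
// not the real CurrencyValue class.
public class CurrencyValueSketch {
    final double amount;   // major units, e.g. 3.14 rather than 314 cents
    final String code;     // currency code, e.g. "USD"

    CurrencyValueSketch(double amount, String code) {
        this.amount = amount;
        this.code = code;
    }

    static CurrencyValueSketch parse(String s) {
        int comma = s.indexOf(',');
        if (comma < 0) {
            throw new IllegalArgumentException("expected amount,code but got: " + s);
        }
        return new CurrencyValueSketch(
            Double.parseDouble(s.substring(0, comma)), s.substring(comma + 1));
    }

    @Override public String toString() {
        // Same shape as the input to parse(), so facet range bounds in a
        // response can be reused directly in a range query.
        return amount + "," + code;
    }
}
```

The point of making toString() match parse() is exactly the one raised in the comment: a (str) facet range bound in a response stays a legal query value against the same field.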
[jira] [Commented] (SOLR-3159) Upgrade to Jetty 8
[ https://issues.apache.org/jira/browse/SOLR-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226349#comment-13226349 ] Hoss Man commented on SOLR-3159: bq. Weirdly, I don't see an announcement for v8.1.2 Sorry .. poor wording on my part: the issue is marked fixed in 8.1.2 but the jetty Jira system lists 8.1.2 as unreleased (ie: fixed on jetty's 8.1 branch for the next release I guess) --- Another little glitch I just noticed is that apparently with the new jetty configs JSP support isn't enabled? Loading http://localhost:8983/solr/ works fine, but http://localhost:8983/solr/admin/dataimport.jsp gives a 500 error "JSP support not configured" > Upgrade to Jetty 8 > -- > > Key: SOLR-3159 > URL: https://issues.apache.org/jira/browse/SOLR-3159 > Project: Solr > Issue Type: Task >Reporter: Ryan McKinley >Priority: Minor > Fix For: 4.0 > > Attachments: SOLR-3159-maven.patch > > > Solr is currently tested (and bundled) with a patched jetty-6 version. > Ideally we can release and test with a standard version. > Jetty-6 (at codehaus) is just maintenance now. New development and > improvements are now hosted at eclipse. Assuming performance is equivalent, > I think we should switch to Jetty 8.
[jira] [Commented] (SOLR-3159) Upgrade to Jetty 8
[ https://issues.apache.org/jira/browse/SOLR-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225675#comment-13225675 ] Hoss Man commented on SOLR-3159: https://jira.codehaus.org/browse/JETTY-1489 - apparently fixed in 8.1.2 ? > Upgrade to Jetty 8 > -- > > Key: SOLR-3159 > URL: https://issues.apache.org/jira/browse/SOLR-3159 > Project: Solr > Issue Type: Task >Reporter: Ryan McKinley >Priority: Minor > Fix For: 4.0 > > Attachments: SOLR-3159-maven.patch > > > Solr is currently tested (and bundled) with a patched jetty-6 version. > Ideally we can release and test with a standard version. > Jetty-6 (at codehaus) is just maintenance now. New development and > improvements are now hosted at eclipse. Assuming performance is equivalent, > I think we should switch to Jetty 8.
[jira] [Commented] (SOLR-3159) Upgrade to Jetty 8
[ https://issues.apache.org/jira/browse/SOLR-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225674#comment-13225674 ] Hoss Man commented on SOLR-3159: FWIW: on trunk, using an svn checkout, and running "java -jar start.jar" I'm getting the following error in the jetty logging after solr starts up... {noformat} 2012-03-08 15:16:09.382:WARN:oejw.WebAppContext:Failed startup of context o.e.j.w.WebAppContext{/.svn,file:/home/hossman/lucene/dev/solr/example/webapps/.svn/},/home/hossman/lucene/dev/solr/example/webapps/.svn java.lang.StringIndexOutOfBoundsException: String index out of range: 0 at java.lang.String.charAt(String.java:686) at org.eclipse.jetty.util.log.StdErrLog.condensePackageString(StdErrLog.java:210) at org.eclipse.jetty.util.log.StdErrLog.(StdErrLog.java:105) at org.eclipse.jetty.util.log.StdErrLog.(StdErrLog.java:97) at org.eclipse.jetty.util.log.StdErrLog.newLogger(StdErrLog.java:569) at org.eclipse.jetty.util.log.AbstractLogger.getLogger(AbstractLogger.java:21) at org.eclipse.jetty.util.log.Log.getLogger(Log.java:438) at org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:677) at org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:454) at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59) at org.eclipse.jetty.deploy.bindings.StandardStarter.processBinding(StandardStarter.java:36) at org.eclipse.jetty.deploy.AppLifeCycle.runBindings(AppLifeCycle.java:183) at org.eclipse.jetty.deploy.DeploymentManager.requestAppGoal(DeploymentManager.java:491) at org.eclipse.jetty.deploy.DeploymentManager.addApp(DeploymentManager.java:138) at org.eclipse.jetty.deploy.providers.ScanningAppProvider.fileAdded(ScanningAppProvider.java:142) at org.eclipse.jetty.deploy.providers.ScanningAppProvider$1.fileAdded(ScanningAppProvider.java:53) at org.eclipse.jetty.util.Scanner.reportAddition(Scanner.java:604) at
org.eclipse.jetty.util.Scanner.reportDifferences(Scanner.java:535) at org.eclipse.jetty.util.Scanner.scan(Scanner.java:398) at org.eclipse.jetty.util.Scanner.doStart(Scanner.java:332) at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59) at org.eclipse.jetty.deploy.providers.ScanningAppProvider.doStart(ScanningAppProvider.java:118) at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59) at org.eclipse.jetty.deploy.DeploymentManager.startAppProvider(DeploymentManager.java:552) at org.eclipse.jetty.deploy.DeploymentManager.doStart(DeploymentManager.java:227) at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59) at org.eclipse.jetty.util.component.AggregateLifeCycle.doStart(AggregateLifeCycle.java:58) at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:53) at org.eclipse.jetty.server.handler.HandlerWrapper.doStart(HandlerWrapper.java:91) at org.eclipse.jetty.server.Server.doStart(Server.java:263) at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59) at org.eclipse.jetty.xml.XmlConfiguration$1.run(XmlConfiguration.java:1215) at java.security.AccessController.doPrivileged(Native Method) at org.eclipse.jetty.xml.XmlConfiguration.main(XmlConfiguration.java:1138) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.eclipse.jetty.start.Main.invokeMain(Main.java:457) at org.eclipse.jetty.start.Main.start(Main.java:602) at org.eclipse.jetty.start.Main.main(Main.java:82) {noformat} ...solr is functioning just fine, but it seems like something has changed subtly in either how jetty handles the webapps dir, or how we have it configured to handle the webapps dir, such that it is trying to
load .svn as a webapp. > Upgrade to Jetty 8 > -- > > Key: SOLR-3159 > URL: https://issues.apache.org/jira/browse/SOLR-3159 > Project: Solr > Issue Type: Task >Reporter: Ryan McKinley >Priority: Minor > Fix For: 4.0 > > Attachments: SOLR-3159-maven.patch > > > Solr is currently tested (and bundled) with a patched jetty-6 version. > Ideally we can release and test with a standard version. > Jetty-6 (at codehaus) is just maintenance now. New development and > improvements are now hosted at eclipse. Assuming per
[jira] [Commented] (SOLR-3218) Range faceting support for CurrencyField
[ https://issues.apache.org/jira/browse/SOLR-3218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225639#comment-13225639 ] Hoss Man commented on SOLR-3218: bq. I believe start/end currency equality is enforced by MoneyType.compareTo which will throw an exception when end is compared to the first (start+gap). Ah, ok. And then ultimately start+gap is compared to end (even if hardend is false) so you'll get an exception then. Ok, fair enough. bq. As far as enforcing currency equality being a good idea or not, it would make sense and I would prefer if start/end/gap currencies didn't need to be equal. This patch doesn't allow for that given the tradeoff of the utility of being able to use different currencies versus the annoyance of keeping a handle open to an ExchangeRateProvider in the places we'd need it. I'm not completely understanding, but I don't need to: if it's easier/simpler (for now) to require that start/end/gap are all in the same currency that's fine -- we should just test/document that clearly .. we can always relax that restriction later if you think of a clean/easy way to do it. Like I said before: it's probably silly to do it anyway, I just didn't understand if/what the complication was > Range faceting support for CurrencyField > > > Key: SOLR-3218 > URL: https://issues.apache.org/jira/browse/SOLR-3218 > Project: Solr > Issue Type: Improvement > Components: Schema and Analysis >Reporter: Jan Høydahl > Fix For: 4.0 > > Attachments: SOLR-3218-1.patch, SOLR-3218-2.patch > > > Spinoff from SOLR-2202. Need to add range faceting capabilities for > CurrencyField
[jira] [Commented] (SOLR-2202) Money FieldType
[ https://issues.apache.org/jira/browse/SOLR-2202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225493#comment-13225493 ] Hoss Man commented on SOLR-2202: a) CurrencyField (and by extension "CurrencyValue") gets my vote b) I really only reviewed the facet stuff in SOLR-2202-solr-10.patch (I know Jan has already been reviewing the more core stuff about the type) ... it makes me realize that we really need to refactor the range faceting code to be easier to do in custom FieldTypes, but that's certainly no fault of this issue and can be done later. The facet code itself looks correct, but my one concern is that (if I'm understanding all of this MoneyValue conversion stuff correctly) it _should_ be possible to facet with start/end/gap values specified in any currency, as long as they are all consistent -- but there is no test of this situation. The negative test only looks at using an inconsistent gap, and the positive tests only use USD, or the "default" which is also USD. We should have at least one test that uses something like EUR for start/end/gap and verifies that the counts are correct given the conversion rates used in the test. Incidentally: I don't see anything actually enforcing that start/end are in the same currency -- just that gap is in the same currency as the values it's being added to, so essentially that start and gap use the same currency. But I'm actually not at all clear on why there is any attempt to enforce that the currencies used are the same, since the whole point of the type (as I understand it) is that you can do conversions on the fly -- it may seem silly for someone to say {{facet.range.start=0,USD & facet.range.gap=200,EUR & facet.range.end=1000,YEN}} but is there any technical reason why we can't let them do that?
> Money FieldType > --- > > Key: SOLR-2202 > URL: https://issues.apache.org/jira/browse/SOLR-2202 > Project: Solr > Issue Type: New Feature > Components: Schema and Analysis >Affects Versions: 1.5 >Reporter: Greg Fodor >Assignee: Jan Høydahl > Fix For: 3.6, 4.0 > > Attachments: SOLR-2022-solr-3.patch, SOLR-2202-lucene-1.patch, > SOLR-2202-solr-1.patch, SOLR-2202-solr-10.patch, SOLR-2202-solr-2.patch, > SOLR-2202-solr-4.patch, SOLR-2202-solr-5.patch, SOLR-2202-solr-6.patch, > SOLR-2202-solr-7.patch, SOLR-2202-solr-8.patch, SOLR-2202-solr-9.patch, > SOLR-2202.patch, SOLR-2202.patch, SOLR-2202.patch, SOLR-2202.patch > > > Provides support for monetary values to Solr/Lucene with query-time currency > conversion. The following features are supported: > - Point queries > - Range quries > - Sorting > - Currency parsing by either currency code or symbol. > - Symmetric & Asymmetric exchange rates. (Asymmetric exchange rates are > useful if there are fees associated with exchanging the currency.) > At indexing time, money fields can be indexed in a native currency. For > example, if a product on an e-commerce site is listed in Euros, indexing the > price field as "1000,EUR" will index it appropriately. By altering the > currency.xml file, the sorting and querying against Solr can take into > account fluctuations in currency exchange rates without having to re-index > the documents. > The new "money" field type is a polyfield which indexes two fields, one which > contains the amount of the value and another which contains the currency code > or symbol. The currency metadata (names, symbols, codes, and exchange rates) > are expected to be in an xml file which is pointed to by the field type > declaration in the schema.xml. > The current patch is factored such that Money utility functions and > configuration metadata lie in Lucene (see MoneyUtil and CurrencyConfig), > while the MoneyType and MoneyValueSource lie in Solr. 
This was meant to > mirror the work being done on the spatial field types. > This patch will be getting used to power the international search > capabilities of the search engine at Etsy. > Also see WIKI page: http://wiki.apache.org/solr/MoneyFieldType
[jira] [Commented] (SOLR-3210) 3.6 POST RELEASE TASK: update site tutorial.html to link to versioned tutorial
[ https://issues.apache.org/jira/browse/SOLR-3210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223882#comment-13223882 ] Hoss Man commented on SOLR-3210: Suggested contents for http://lucene.apache.org/solr/tutorial.html ... {noformat} A copy of the tutorial for each version of Solr is included in the documentation for that release. Copies of the tutorial for the most recent release of each major branch can also be found online: - 3.6 - 4x trunk (unreleased; for developers only): https://builds.apache.org/job/Solr-trunk/javadoc/doc-files/tutorial.html {noformat} ...and maybe at some point we can do a youtube embed of a walk through of the tutorial on that page as well > 3.6 POST RELEASE TASK: update site tutorial.html to link to versioned tutorial > -- > > Key: SOLR-3210 > URL: https://issues.apache.org/jira/browse/SOLR-3210 > Project: Solr > Issue Type: Improvement >Reporter: Hoss Man >Assignee: Hoss Man > Fix For: 3.6 > > > Unless we have an alternate strategy in place for dealing with versioned docs > by the time 3.6 is released, then as a post-release task, once the 3.6 > javadocs are snapshotted online (ie: http://lucene.apache.org/solr/api/) the > current "online" copy of the tutorial > (http://lucene.apache.org/solr/tutorial.html) should be pruned down so that > it is just a link to the snapshot version released with 3.6
[jira] [Commented] (LUCENE-3854) Non-tokenized fields become tokenized when a document is deleted and added back
[ https://issues.apache.org/jira/browse/LUCENE-3854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223402#comment-13223402 ] Hoss Man commented on LUCENE-3854: -- i tried arguing a long time ago that IndexReader.document(...) should return "Map" since none of the Document/Field object metadata makes sense at "read" time ... never got any buy in from anybody else. > Non-tokenized fields become tokenized when a document is deleted and added > back > --- > > Key: LUCENE-3854 > URL: https://issues.apache.org/jira/browse/LUCENE-3854 > Project: Lucene - Java > Issue Type: Bug > Components: core/index >Affects Versions: 4.0 >Reporter: Benson Margulies > > https://github.com/bimargulies/lucene-4-update-case is a JUnit test case that > seems to show a problem with the current trunk. It creates a document with a > Field typed as StringField.TYPE_STORED and a value with a "-" in it. A > TermQuery can find the value, initially, since the field is not tokenized. > Then, the case reads the Document back out through a reader. In the copy of > the Document that gets read out, the Field now has the tokenized bit turned > on. > Next, the case deletes and adds the Document. The 'tokenized' bit is > respected, so now the field gets tokenized, and the result is that the query > on the term with the - in it no longer works. > So I think that the defect here is in the code that reconstructs the Document > when read from the index, and which turns on the tokenized bit. > I have an ICLA on file so you can take this code from github, but if you > prefer I can also attach it here.
[jira] [Commented] (SOLR-3140) Make omitNorms default for all numeric field types
[ https://issues.apache.org/jira/browse/SOLR-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13219389#comment-13219389 ] Hoss Man commented on SOLR-3140: bq. Is there a better place to set this default than in init() in the new base class? probably not bq. Should StrField or other fields also have omitNorms as default? I don't think so? if you search on a multivalued string field like "keywords" or "tags" it's reasonable to want length normalization to be a factor to prevent keyword stuffing. > Make omitNorms default for all numeric field types > -- > > Key: SOLR-3140 > URL: https://issues.apache.org/jira/browse/SOLR-3140 > Project: Solr > Issue Type: Improvement > Components: Schema and Analysis >Reporter: Jan Høydahl > Labels: omitNorms > Fix For: 4.0 > > Attachments: SOLR-3140.patch > > > Today norms are enabled for all Solr field types by default, while in Lucene > norms are omitted for the numeric types. > Propose to make the Solr defaults the same as in Lucene, so that if someone > occasionally wants index-side boost for a numeric field type they must say > omitNorms="false". This lets us simplify the example schema too.
[jira] [Commented] (SOLR-3175) simplify & add test to ensure various query "escape" functions are in sync
[ https://issues.apache.org/jira/browse/SOLR-3175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13218892#comment-13218892 ] Hoss Man commented on SOLR-3175: Suggested approach: * replace the {{if (c == 'x' || c == 'y' || ... )}} meme with a Set lookup * make the Set used in each case public static final * add a unit test that asserts the maps are equivalent when they are supposed to be equivalent, or supersets when they are supposed to be supersets. > simplify & add test to ensure various query "escape" functions are in sync > -- > > Key: SOLR-3175 > URL: https://issues.apache.org/jira/browse/SOLR-3175 > Project: Solr > Issue Type: Improvement >Reporter: Hoss Man > > We have three query syntax escape related functions (that i know of) that > can't be refactored... > * QueryParser.escape > ** canonical > * ClientUtils.escapeQueryChars > ** part of solrj, doesn't depend directly on QueryParser so that Solr clients > don't need the query parser jar locally > * SolrPluginUtils.partialEscape > ** designed to be a negative subset of the full set (ie: all chars except > +/-/") > ...we should figure out a way to assert in our tests that these are all in > agreement (or at least as much as they are meant to be)
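The suggested refactoring can be sketched in plain Java. Note this is an illustration only: the class name, the set contents, and the helper method are hypothetical, not Solr's actual API; the real escape character lists live in QueryParser.escape, ClientUtils.escapeQueryChars, and SolrPluginUtils.partialEscape.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: replace chained (c == 'x' || c == 'y' || ...) checks
// with public static final Sets so a unit test can assert subset/superset
// relationships directly.
public class EscapeSets {
    // Illustrative stand-in for the canonical escape set.
    public static final Set<Character> FULL_ESCAPE_SET = new HashSet<>(
        Arrays.asList('\\', '+', '-', '!', '(', ')', ':', '^', '[', ']',
                      '"', '{', '}', '~', '*', '?', '|', '&'));

    // partialEscape is described as everything except +, -, and "
    public static final Set<Character> PARTIAL_ESCAPE_SET = new HashSet<>(FULL_ESCAPE_SET);
    static {
        PARTIAL_ESCAPE_SET.removeAll(Arrays.asList('+', '-', '"'));
    }

    // One shared escape loop driven by a Set lookup instead of chained ||s.
    public static String escape(String s, Set<Character> escapeSet) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (escapeSet.contains(c)) sb.append('\\');
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The "keep them in sync" unit test then reduces to set assertions:
        if (!FULL_ESCAPE_SET.containsAll(PARTIAL_ESCAPE_SET))
            throw new AssertionError("partial set must be a subset of the full set");
        if (PARTIAL_ESCAPE_SET.contains('+'))
            throw new AssertionError("'+' must not be in the partial set");
        System.out.println(escape("a+b(c)", FULL_ESCAPE_SET)); // a\+b\(c\)
    }
}
```

With the sets public, the test no longer needs to round-trip strings through each escape function; it just compares the sets themselves.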
[jira] [Commented] (SOLR-2368) Improve extended dismax (edismax) parser
[ https://issues.apache.org/jira/browse/SOLR-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13218845#comment-13218845 ] Hoss Man commented on SOLR-2368: bq. Are you saying it would be possible to define something like this in solrconfig.xml ...something like that would certainly be possible if we changed the QParsers to start doing interesting things with their init params (presumably defaults there would be the lowest possible level defaults, overridden by things like request handler defaults/invariants/appends ... and it would really make sense to allow invariants/appends in the qparser init). but that would only really help with the backcompat "locked down" dismax situation i'm concerned with if we made sure all of those init params were also used in the implicitly created instance of "dismax" what i had in mind was actually far simpler... * "dismax" is implicitly an instance of DismaxQParserPlugin (unless overridden in solrconfig.xml) .. just like today * "edismax" is implicitly an instance of ExtendedDismaxQParserPlugin (unless overridden in solrconfig.xml) ... just like today * ExtendedDismaxQParserPlugin works exactly as it does today but instead of all the hardcoded default param values sprinkled around ExtendedDismaxQParser we put them all in a static Map or SolrParams instance and add a constructor arg to override those defaults * DismaxQParserPlugin gets changed to look something like... {code} final static SolrParams REALLY_LIMITED_DEFAULTS = new SolrParams("uf", "-*", ...); public QParser createParser(String qstr, SolrParams localParams, SolrParams params, SolrQueryRequest req) { return new ExtendedDisMaxQParser(qstr, localParams, params, req, REALLY_LIMITED_DEFAULTS); } {code} ...using that new ExtendedDisMaxQParser constructor arg to override the defaults * DisMaxQParser.java gets svn removed because it's no longer needed. 
...all of which could then be enhanced with init based overrides of those defaults like you suggested. > Improve extended dismax (edismax) parser > > > Key: SOLR-2368 > URL: https://issues.apache.org/jira/browse/SOLR-2368 > Project: Solr > Issue Type: Improvement > Components: search >Reporter: Yonik Seeley > Labels: QueryParser > > This is a "mother" issue to track further improvements for eDismax parser. > The goal is to be able to deprecate and remove the old dismax once edismax > satisfies all usecases of dismax.
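The layering Hoss describes can be sketched in isolation. This is not Solr code: the class, the resolve helper, and the "uf" values are hypothetical stand-ins showing how a plugin-supplied default map would sit below request params, with the "dismax" persona just being edismax built with a more locked-down default map.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of default-param layering: a plugin holds a map
// of hard-coded defaults, a constructor arg can swap in a locked-down map,
// and explicit request params always win.
public class ParamLayering {
    static final Map<String, String> EDISMAX_DEFAULTS = new HashMap<>();
    static final Map<String, String> REALLY_LIMITED_DEFAULTS = new HashMap<>();
    static {
        EDISMAX_DEFAULTS.put("uf", "*");         // hypothetical: permissive user fields
        REALLY_LIMITED_DEFAULTS.put("uf", "-*"); // the "dismax" persona locks them down
    }

    // Request params take precedence; otherwise fall back to the defaults
    // the parser instance was constructed with.
    static String resolve(String name, Map<String, String> requestParams,
                          Map<String, String> pluginDefaults) {
        String v = requestParams.get(name);
        return v != null ? v : pluginDefaults.get(name);
    }

    public static void main(String[] args) {
        Map<String, String> req = new HashMap<>();
        System.out.println(resolve("uf", req, REALLY_LIMITED_DEFAULTS)); // -*
        req.put("uf", "title");
        System.out.println(resolve("uf", req, REALLY_LIMITED_DEFAULTS)); // title
    }
}
```

The point of the design is that only the default map differs between the two personas, so DisMaxQParser's duplicate implementation can be deleted.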
[jira] [Commented] (SOLR-3157) custom logging
[ https://issues.apache.org/jira/browse/SOLR-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13218816#comment-13218816 ] Hoss Man commented on SOLR-3157: {quote} IMO, it's actually higher importance that we have better logging for ourselves so we can more easily debug our tests. {quote} agreed ... but we shouldn't break the log format of the most important line Solr logs if we can avoid it (particularly since the format was chosen specifically to be easy to parse). i also don't understand why you think those changes to SolrCore.execute's "info" log make anything better? your change _removed_ really useful info and made the messages harder to parse because it removed the consistent key=val pattern for the path & param vals. {quote} Maybe there's a way I can log different things when using the test formatter and restore the production log format to its former glory. {quote} I don't understand why the structure of the _string_ used in this one info message has anything to do with your goal of better test logging using the SolrLogFormatter? why can't that _string_ format stay exactly identical regardless of whether it's in a test (and the _record_ gets formatted with all of your new thread and metadata goodness) or not ? as things stand now (with your latest commit #1294911) these messages are still broken compared to 3x because they don't include the SolrCore name (the "logid" used to init the StringBuilder before you changed it) ... that seems pretty damn important (although i realize now i somehow didn't mention that in my earlier comment) {noformat} 3x example... Feb 28, 2012 5:38:33 PM org.apache.solr.core.SolrCore execute INFO: [core0] webapp=/solr path=/select/ params={q=*:*&sort=score+desc} hits=0 status=0 QTime=11 trunk example... 
Feb 28, 2012 5:39:02 PM org.apache.solr.core.SolrCore execute INFO: webapp=/solr path=/select/ params={q=*:*&sort=score+desc} hits=0 status=0 QTime=40 {noformat} > custom logging > -- > > Key: SOLR-3157 > URL: https://issues.apache.org/jira/browse/SOLR-3157 > Project: Solr > Issue Type: Test >Reporter: Yonik Seeley >Priority: Blocker > Attachments: SOLR-3157.patch, jetty_threadgroup.patch > > > We need custom logging to decipher tests with multiple core containers, > cores, in a single JVM.
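Why the key=val convention matters for log consumers can be shown with a small parser against the 3.x example line quoted above. This is a rough sketch under the assumption that values are either a {...} group or whitespace-free tokens; it is not any actual Solr log-processing tool.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical consumer of SolrCore.execute's INFO line: one regex pulls
// every key=val pair out, treating a {...} params blob as a single value.
public class SolrLogLine {
    private static final Pattern KV = Pattern.compile("(\\w+)=(\\{[^}]*\\}|\\S+)");

    static Map<String, String> parse(String line) {
        Map<String, String> fields = new HashMap<>();
        Matcher m = KV.matcher(line);
        while (m.find()) {
            fields.put(m.group(1), m.group(2));
        }
        return fields;
    }

    public static void main(String[] args) {
        String line = "INFO: [core0] webapp=/solr path=/select/ "
                    + "params={q=*:*&sort=score+desc} hits=0 status=0 QTime=11";
        Map<String, String> f = parse(line);
        System.out.println(f.get("path"));  // /select/
        System.out.println(f.get("QTime")); // 11
    }
}
```

Dropping the "path=" and "webapp=" keys, as the trunk change did, breaks exactly this kind of one-regex consumer.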
[jira] [Commented] (SOLR-3157) custom logging
[ https://issues.apache.org/jira/browse/SOLR-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13218398#comment-13218398 ] Hoss Man commented on SOLR-3157: Yonik: your changes in r1294212 break the SolrCore.execute log format conventions we've had in place since SOLR-267, which breaks some log processing code i have (and since the whole point of SOLR-267 was to make it easy for people to write log parsers, i'm guessing i'm not the only one) Notably: * you changed the "path" and "params" key=val pairs so they no longer include the key, just the val -- this doesn't really make sense to me since the whole point of those log lines is that they are supposed to be consistent and easy to parse. * you removed the webapp key=val pair completely (comment says "multiple webaps are no longer best practise" but that doesn't mean people don't use them or that we should just suddenly stop logging them.) > custom logging > -- > > Key: SOLR-3157 > URL: https://issues.apache.org/jira/browse/SOLR-3157 > Project: Solr > Issue Type: Test >Reporter: Yonik Seeley >Priority: Blocker > Attachments: SOLR-3157.patch, jetty_threadgroup.patch > > > We need custom logging to decipher tests with multiple core containers, > cores, in a single JVM.
[jira] [Commented] (SOLR-3156) Check for locks on startup
[ https://issues.apache.org/jira/browse/SOLR-3156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217934#comment-13217934 ] Hoss Man commented on SOLR-3156: Luca: the change to SolrCore looks good to me ... the one thing i might suggest is adding an ERROR log just before you throw the exception (i'm in the "log early" camp) The test looks awesome, but *PLEASE* trim those solrconfig files down so that they only contain the 5-6 minimum lines they need in order for the test to be meaningful ... we have far too many big bloated test configs already, the goal is to stop adding new ones and make sure each test config has a specific and easy to see purpose. > Check for locks on startup > -- > > Key: SOLR-3156 > URL: https://issues.apache.org/jira/browse/SOLR-3156 > Project: Solr > Issue Type: Improvement >Reporter: Luca Cavanna > Attachments: SOLR-3156.patch > > > When using simple or native lockType and the application server is not > shutdown properly (kill -9), you don't notice problems until someone tries to > add or delete a document. In fact, you get errors every time Solr opens a new > IndexWriter on the "locked" index. I'm aware of the unlockOnStartup option, > but I'd prefer to know and act properly when there's a lock, and I think it > would be better to know on startup, since Solr is not going to work properly.
[jira] [Commented] (SOLR-3141) Deprecate OPTIMIZE command in Solr
[ https://issues.apache.org/jira/browse/SOLR-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217779#comment-13217779 ] Hoss Man commented on SOLR-3141: I don't have the energy to really get in depth with all of the discussion that's taken place so far, i'll try to keep my comments brief: 0) i'm a fan of the patch currently attached. 1) i largely agree with most of yonik's points -- this is a documentation problem first and foremost. Saying that all people who optimize are wrong is ridiculous, and breaking something that has use and value for a set of people just because some other set of people are using it foolishly seems really absurd. 2) changing the "optimize" command to be a no-op with a warning logged, or a failure, where the documented "fix" to regain old behavior for people who genuinely need it is to search & replace the string "optimize" with some new string "forceMerge" seems utterly absurd to me. this is not the first time we've had a param name that people later regretted giving the name that we did -- are we going to change _all_ of them for 4.0? Unlike a method renamed in java code where it's easy to see how the change affects you because of compilation failures, this kind of HTTP param change is a serious pain in the ass for people with client apps written using multiple languages/libraries ... naming consistency for existing users seems far more important than having perfect names. 3) Even if the goal is to force people to evaluate whether they really want to merge down to one segment, we have to consider how hard we make things for people when the answer is "yes". If someone is using a client library/app to talk to Solr it may not be easy/simple/possible for them to replace "optimize" with "forceMerge" or something like it w/o mucking in the internals of that library -- there's no reason to piss off users like that. 
4) any discussion about renaming/removing "optimize" in the Solr HTTP APIs should really consider how that will impact a few other user visible things... * {{}} hooks in solrconfig and the corresponding SolrEventListener.postOptimize method * SolrDeletionPolicy has options related to how many optimized indexes to keep * spellchecker has options relating to building on optimize (although if i remember correctly there is a bug about this being broken so it can probably die no problem) 5) Assuming that too many people optimize when they shouldn't, either out of ignorance or because their tools do it out of ignorance, and we want to help minimize that moving forward; and given my opinion that renaming "optimize" will only hurt people w/o actually helping the root problem -- here's my straw man proposal to try and improve the situation (similar to what jan suggested but taking into account that we already support a "maxSegments" option when doing optimize commands) ... * commit the attached patch as is (it's just plain a good idea, regardless of anything else we might do) * change CommitUpdateCommand.maxOptimizeSegments so it defaults to "-1" and document that when the value is less than 0 it means the UpdateHandler configuration determines the value. * add a new {{}} config option to {{}} - make the UpdateHandler use that value anytime CommitUpdateCommand.maxOptimizeSegments is less than 0, and for backcompat have it default to "1" if not specified. 
* update the example configs to include {{999}} with a comment warning against the evils of over-optimization * change the code in Solr which deals with {{}} formatted instructions so that any SolrParams in the request with names the same as xml attributes override the attributes -- ie: {{POST /update?maxSegments=4 with data: }} should result in a CommitUpdateCommand with maxOptimizeSegments=4 The end result being: * new users who start with new configs have an UpdateHandler that is going to effectively ignore "optimize" commands that don't specify a "maxSegments" * nothing breaks for existing users * existing users who only want to allow optimize commands when "maxSegments" is specified can cut/paste that one-line {{}} config * new and existing users who want Solr to ignore all optimize commands, even when they do have a "maxSegments", can configure an invariant maxSegments=999 param on the affected request handlers > Deprecate OPTIMIZE command in Solr > -- > > Key: SOLR-3141 > URL: https://issues.apache.org/jira/browse/SOLR-3141 > Project: Solr > Issue Type: Improvement > Components: update >Affects Versions: 3.5 >Reporter: Jan Høydahl > Labels: force, optimize > Fix For: 3.6 > > Attachments: SOLR-3141.patch, SOLR-3141.patch > > > Background: LUCENE-3454 renames optimize() as forceMerge(). Please read that > issue first. > Now th
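The precedence the straw man proposes can be captured as a tiny resolution function. The names below come from the comment (CommitUpdateCommand.maxOptimizeSegments, the maxSegments request param) but this class is an illustration of the proposed logic, not actual Solr code.

```java
// Hypothetical sketch of the proposed precedence: an explicit maxSegments
// request param wins; otherwise a non-negative value on the commit command
// is used; otherwise the updateHandler's configured default decides
// (back-compat default: 1, i.e. old-style full optimize).
public class OptimizeDefaults {
    static int resolveMaxSegments(Integer requestParam,    // ?maxSegments=N, or null
                                  int commandValue,        // CommitUpdateCommand.maxOptimizeSegments
                                  int configuredDefault) { // from updateHandler config
        if (requestParam != null) return requestParam; // request param overrides the attribute
        if (commandValue >= 0) return commandValue;    // value set on the command itself
        return configuredDefault;                      // -1 means "defer to config"
    }

    public static void main(String[] args) {
        // New-style config: a default of 999 effectively ignores bare optimizes.
        System.out.println(resolveMaxSegments(null, -1, 999)); // 999
        // Config absent: back-compat default of 1 preserves old behavior.
        System.out.println(resolveMaxSegments(null, -1, 1));   // 1
        // Explicit request, e.g. POST /update?maxSegments=4
        System.out.println(resolveMaxSegments(4, -1, 999));    // 4
    }
}
```

This shows why nothing breaks for existing users: with no config and no param, the resolved value is still 1 segment, exactly the old optimize.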
[jira] [Commented] (SOLR-3161) Use of 'qt' should be restricted to searching and should not start with a '/'
[ https://issues.apache.org/jira/browse/SOLR-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217382#comment-13217382 ] Hoss Man commented on SOLR-3161: -0 1) there are plenty of people who are happily using "qt" to dynamically pick their request handler who don't care about securing their solr instances -- we shouldn't break things for them if we can avoid it. 2) assuming qt should be allowed only if it is an instance of solr.SearchHandler seems narrow-minded to me -- it puts a totally arbitrary limitation on the ability for people to have their own request handlers that are treated as "first class citizens" and seems just as likely to lead to surprise and frustration as it is to appreciation for the "safety" of the feature (not to mention it precludes perfectly safe "query" type handlers like MLTHandler and AnalysisRequestHandler) if the root goal is to make solr safer for people who don't want/expect "qt" based requests then unless i'm overlooking something it seems like there is a far simpler and more straightforward solution... a) change the example solrconfig to use handleSelect="false" b) remove the (long ago deprecated) SolrServlet if handleSelect == false, then the request dispatcher won't look at "/select" requests at all (unless someone has a handler named "/select") and it would do dispatching based on the "qt" param. currently if that's false the logic falls through to the SolrServlet, but if that's been removed then the request will just fail. So new users who copy the example will have only path based request handlers by default, and will have to go out of their way to set handleSelect=true to get qt based dispatching. 
Bonus points: someone can write a DispatchingRequestHandler that can optionally be configured with some name (such as "/select") and does nothing but look for a "qt" param and forward to the handler with that name -- but it can have configuration options indicating which names are permitted (and any other names would be rejected) ...on the whole, compared to the original suggestion in this issue, that seems a lot safer for people who want safety, and a lot simpler to document. comments? > Use of 'qt' should be restricted to searching and should not start with a '/' > - > > Key: SOLR-3161 > URL: https://issues.apache.org/jira/browse/SOLR-3161 > Project: Solr > Issue Type: Improvement > Components: search, web gui >Reporter: David Smiley >Assignee: David Smiley > Fix For: 3.6, 4.0 > > > I haven't yet looked at the code involved for suggestions here; I'm speaking > based on how I think things should work and not work, based on intuitiveness > and security. In general I feel it is best practice to use '/' leading > request handler names and not use "qt", but I don't hate it enough when used > in limited (search-only) circumstances to propose its demise. But if someone > proposes its deprecation then I am +1 for that. > Here is my proposal: > Solr should error if the parameter "qt" is supplied with a leading '/'. > (trunk only) > Solr should only honor "qt" if the target request handler extends > solr.SearchHandler. > The new admin UI should only use 'qt' when it has to. For the query screen, > it could present a little pop-up menu of handlers to choose from, including > "/select?qt=mycustom" for handlers that aren't named with a leading '/'. This > choice should be positioned at the top. > And before I forget, me or someone should investigate if there are any > similar security problems with the shards.qt parameter. Perhaps shards.qt can > abide by the same rules outlined above. > Does anyone foresee any problems with this proposal? 
> On a related subject, I think the notion of a default request handler is bad > - the default="true" thing. Honestly I'm not sure what it does, since I > noticed Solr trunk redirects '/solr/' to the new admin UI at '/solr/#/'. > Assuming it doesn't do anything useful anymore, I think it would be clearer > to use instead of > what's there now. The delta is to put the leading '/' on this request handler > name, and remove the "default" attribute.
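The "bonus points" DispatchingRequestHandler idea can be sketched as plain Java. Everything here is hypothetical (there is no such Solr class): it just shows the core mechanism of forwarding on qt only when the name is on a configured allowlist.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a dispatching handler: it holds the full handler
// registry plus an allowlist of names that "qt" is permitted to select;
// any other name is rejected instead of silently forwarded.
public class DispatchingHandler {
    interface Handler { String handle(String query); }

    private final Map<String, Handler> registry; // all registered handlers by name
    private final Set<String> permitted;         // names qt may select

    DispatchingHandler(Map<String, Handler> registry, Set<String> permitted) {
        this.registry = registry;
        this.permitted = permitted;
    }

    String dispatch(String qt, String query) {
        if (!permitted.contains(qt) || !registry.containsKey(qt)) {
            throw new IllegalArgumentException("qt=" + qt + " not permitted");
        }
        return registry.get(qt).handle(query);
    }

    public static void main(String[] args) {
        Map<String, Handler> registry = Map.of(
            "mycustom", q -> "custom:" + q,
            "internal", q -> "secret:" + q);
        // Only "mycustom" is exposed; "internal" stays unreachable via qt.
        DispatchingHandler d = new DispatchingHandler(registry, Set.of("mycustom"));
        System.out.println(d.dispatch("mycustom", "foo")); // custom:foo
        // d.dispatch("internal", "foo") would throw: not on the allowlist
    }
}
```

The safety property falls out of the allowlist: handlers not named in the configuration simply cannot be reached through this dispatcher, regardless of what a client puts in qt.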
[jira] [Commented] (LUCENE-3804) Swap Features and News on the website.
[ https://issues.apache.org/jira/browse/LUCENE-3804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13212866#comment-13212866 ] Hoss Man commented on LUCENE-3804: -- There's a couple of problems here we need to address... 1) features are listed on /solr/index.html, but there is also a right nav link to /solr/features.html 2) duplicate content on both /solr/features.html and /solr/index.html that will only increase that confusion 3) "Title" metadata from features.mdtext appearing in the body of /solr/index.html 4) #1 & #2 both affect the /core/... urls as well (though that features.mdtext evidently doesn't use the "Title" attribute) The fixes i would suggest are... a) add a redirect for /solr/features.html -> /solr/ (and likewise for core) b) remove "Features" from the right nav c) either remove the "Title" metadata from pages being included, or stop doing this as an include and put the content directly in index.mdtext -- i would suggest the latter since it will make it more straightforward for editing in the future. > Swap Features and News on the website. > -- > > Key: LUCENE-3804 > URL: https://issues.apache.org/jira/browse/LUCENE-3804 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Mark Miller >Assignee: Mark Miller > > I think we can do even better, but that is a nice, easy incremental > improvement.
[jira] [Commented] (SOLR-3142) remove O(n^2) slow slow indexing defaults in DataImportHandler
[ https://issues.apache.org/jira/browse/SOLR-3142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13212761#comment-13212761 ] Hoss Man commented on SOLR-3142: agreed, was just noting why i _think_ the original default was true.. > remove O(n^2) slow slow indexing defaults in DataImportHandler > -- > > Key: SOLR-3142 > URL: https://issues.apache.org/jira/browse/SOLR-3142 > Project: Solr > Issue Type: Bug >Reporter: Robert Muir >Assignee: Robert Muir > Fix For: 3.6, 4.0 > > Attachments: SOLR-3142.patch > > > By default, dataimporthandler optimizes the entire index when it commits. > This is bad for performance, because it means by default its doing a very > heavy index-wide operation even for an incremental update... essentially > O(n^2) indexing. > All that is needed is to set optimize=false by default. If someone wants > to optimize, they can either set optimize=true or explicitly optimize > themselves.
[jira] [Commented] (SOLR-3142) remove O(n^2) slow slow indexing defaults in DataImportHandler
[ https://issues.apache.org/jira/browse/SOLR-3142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13212752#comment-13212752 ] Hoss Man commented on SOLR-3142: FWIW: I'm pretty sure the original assumption here was that in the (relatively common) usecase of doing a full-import rebuild on a regular basis (ie: nightly) it can be handy to have it auto-optimize when you are done. I think the real problem is that that assumption was never challenged regarding things like delta import. so an argument could be made that the default should still be optimize=true on full-import, and optimize=false on delta import ... but i'm not going to make that argument, i think it's silly to assume true in either case. (particularly since a parameterized full import might actually be a rapidly repeating incremental) > remove O(n^2) slow slow indexing defaults in DataImportHandler > -- > > Key: SOLR-3142 > URL: https://issues.apache.org/jira/browse/SOLR-3142 > Project: Solr > Issue Type: Bug >Reporter: Robert Muir >Assignee: Robert Muir > Fix For: 3.6, 4.0 > > Attachments: SOLR-3142.patch > > > By default, dataimporthandler optimizes the entire index when it commits. > This is bad for performance, because it means by default its doing a very > heavy index-wide operation even for an incremental update... essentially > O(n^2) indexing. > All that is needed is to set optimize=false by default. If someone wants > to optimize, they can either set optimize=true or explicitly optimize > themselves.
[jira] [Commented] (LUCENE-3792) Remove StringField
[ https://issues.apache.org/jira/browse/LUCENE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209630#comment-13209630 ] Hoss Man commented on LUCENE-3792: -- StrawMan suggestion off the top of my head: * rename NOT_ANALYZED to something like KEYWORD_ANALYZED * document KEYWORD_ANALYZED as being a convenience flag (and/or optimization?) for achieving equivalent behavior as using PerFieldAnalyzer with KeywordAnalyzer for this field, and keep using / re-word rmuir's patch warning to make it clear that if you use this at index time, any attempts to construct queries against it using the QueryParser will need KeywordAnalyzer. ...would that flag name == analyzer name equivalence help people remember not to get trapped by this? > Remove StringField > -- > > Key: LUCENE-3792 > URL: https://issues.apache.org/jira/browse/LUCENE-3792 > Project: Lucene - Java > Issue Type: Task >Affects Versions: 4.0 >Reporter: Robert Muir > Fix For: 4.0 > > Attachments: LUCENE-3792_javadocs_3x.patch, > LUCENE-3792_javadocs_3x.patch > > > Often on the mailing list there is confusion about NOT_ANALYZED. > Besides being useless (Just use KeywordAnalyzer instead), people trip up on > this > not being consistent at query time (you really need to configure > KeywordAnalyzer for the field > on your PerFieldAnalyzerWrapper so it will do the same thing at query time... > oh wait > once you've done that, you don't need NOT_ANALYZED). > So I think StringField is a trap too for the same reasons, just under a > different name, lets remove it.
[jira] [Commented] (SOLR-3005) Content-Type disappear
[ https://issues.apache.org/jira/browse/SOLR-3005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207236#comment-13207236 ] Hoss Man commented on SOLR-3005: Chris: +1, commit. > Content-Type disappear > -- > > Key: SOLR-3005 > URL: https://issues.apache.org/jira/browse/SOLR-3005 > Project: Solr > Issue Type: Bug > Components: Response Writers >Affects Versions: 3.5 > Environment: Solr 3.5.0 >Reporter: Gasol Wu >Assignee: Chris Male > Attachments: SOLR-3005.patch, SOLR-3005.patch > > > i expect that a query always returns a Content-Type, but after SOLR-1123 was > committed, it got a chance to return nothing if you leave out all of the > queryResponseWriter entries in solrconfig.xml. however, i attach a small patch that > will correct this situation. > It looks like DEFAULT_RESPONSE_WRITERS never calls the init method in > org.apache.solr.core.SolrCore
[jira] [Commented] (LUCENE-3750) Convert Versioned docs to Markdown/New CMS
[ https://issues.apache.org/jira/browse/LUCENE-3750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205186#comment-13205186 ] Hoss Man commented on LUCENE-3750: -- bq. I'll contact you on IRC and we can work through it. I realize now i didn't finish explaining the concerns i was trying to get to in that first para. my comment was not intended to mean "i can't get local doc building to work for the www site, therefore i'm leery of using local doc building for the versioned docs". my comment was meant to be "if 1 out of N committers who have tried doing local www site builds can't get it to work, then that doesn't bode well for asking non-committer contributors to be able to use the same markdown tools to do local doc building of versioned docs that they might want to contribute patches for." that smells like a big red flag to me. > Convert Versioned docs to Markdown/New CMS > -- > > Key: LUCENE-3750 > URL: https://issues.apache.org/jira/browse/LUCENE-3750 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Grant Ingersoll >Priority: Minor > > Since we are moving our main site to the ASF CMS (LUCENE-2748), we should > bring in any new versioned Lucene docs into the same format so that we don't > have to deal w/ Forrest anymore.
[jira] [Commented] (SOLR-3076) Solr should support block joins
[ https://issues.apache.org/jira/browse/SOLR-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205156#comment-13205156 ] Hoss Man commented on SOLR-3076: bq. Maybe there can be field aliases? Eg, book_page_count:[0 to 1000] and chapter_page_count[10:40], and the QP is told to map book_page_count -> parent:size and chapter_page_count -> child:size? Or maybe we let the user explicitly scope the field, eg chapter:size, book:size, book:title, etc. Not sure... Hmmm... i kind of understand what you're saying; but the part i'm not understanding is even if you had field aliasing like that, given some query string like... {code} book_page_count:[0 TO 1000] and chapter_page_count[10 TO 40] {code} ...how would the parser know whether the user was asking for the results to be "book documents" matching that criteria (1-1000 pages and containing at least one chapter child containing 10-40 pages), or "chapter documents" matching that criteria (10-40 pages contained in a book of 1-1000 pages) or "page documents" (all pages contained in a chapter of 10-40 total pages, contained in a book of 1-1000 total pages)? I mean: it seems possible, and a QParser like that could totally support configuring those types of field mappings / hierarchy definitions in init params, but perhaps we should focus on the more explicit, direct-mapping QParser approach Mikhail has already started on for now, and consider that as an enhancement later? (especially since it's not clear how the indexing side will be managed/enforced -- depending on how that shapes up, it might be possible for a QParser like you're describing, or perhaps _all_ QParsers, to infer the field rules from the schema or some other configuration) I think the syntax in Mikhail's BlockJoinParentQParserPlugin looks great as a straightforward baseline implementation.
The one straw man suggestion i might toss out there for consideration would be to invert the use of the "filter" and "v" local params, so instead of... {code} {!parent filter="parent:true"}child_name:b {!parent filter="parent:true"} {code} ...it might be... {code} {!parent of="child_names:b"}parent:true {!parent}parent:true {code} ...people may find that easier to read as a way to understand that the final query will return "parent documents" constrained such that those parent documents have children matching the "of" query. The one thing i don't like about this "of" idea is that (compared to the "filter" param Mikhail uses) it might be more tempting for people to use something like... {code} // WRONG! (i think) q={!parent of="child_names:b"}some_parent_field:foo {code} ...when they mean to write something like this... {code} q={!parent of="child_names:b"}some_query_that_identifies_the_set_of_all_parents fq=some_parent_field:foo {code} ...because as i understand it, it's important for the "parentFilter" to identify *all* of the parent documents, even ones you may not want returned, so that the ToParentBlockJoinQuery knows how to identify the parent of each document (correct?) This type of user confusion is still possible with the syntax Mikhail's got, but i suspect it will be less likely --- In any case, i wanted to put the idea out there. Given McCandless's supposition that the parent/child relationships are likely to be very consistent, not very deep, and not vary from query to query, one thing we could do to help mitigate this possible confusion would be: * make the "filter" param name much longer and verbose, ie: {{setOfAllParentsQuery}} * make the param optional, and have it default to something specified as an init param, ie: {{defaultSetOfAllParentsQuery}} * make the init param mandatory That way, in the common case people will configure things like... {code} type:parent {code} ...and their queries will be simple... 
{code} q={!parent} (all parent docs) q={!parent}foo:bar (all parent docs that contain kid docs matching foo:bar) {code} ...but it will still be possible for people with more complex usecases to do more complex things. Mikhail: some other minor feedback on the parts of your patch that i understood (note: my lack of understanding is not a fault of your patch, it's just that most of the block join stuff is very foreign to me)... * please prune down "solrconfig-bjqparser.xml" so it contains only the absolute minimum things you need for the test case, it makes it a lot easier for people to review the patch, and for users to understand what is necessary to utilize features demoed in the test (we have a lot of old bloated solrconfig files in the test dir, but we're trying to stop doing that) * the test would be a bit easier to follow if you used different letters for the parent fields vs the child fields (abcdef vs xyz, for example) * it would be good to have tests verifying that nested parent queries
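The point above about parentFilter needing to identify *all* parents can be sketched with a toy model (hypothetical code, not the actual ToParentBlockJoinQuery internals): in a block-indexed segment, children precede their parent, so the parent of a child doc is the next set bit in the parent filter. If any parent is missing from the filter, its children silently attach to the wrong parent:

```java
import java.util.BitSet;

// Toy model of block-join parent resolution: given the filter of ALL parent
// docs, the parent of a child is the first parent docID after the child
// (children precede their parent within a block).
public class ParentFilterModel {

    // Only meaningful for child docIDs; returns -1 if no parent follows.
    static int parentOf(int child, BitSet allParents) {
        return allParents.nextSetBit(child + 1);
    }

    public static void main(String[] args) {
        // Segment layout: docs 0-1 are children of parent 2,
        //                 docs 3-4 are children of parent 5.
        BitSet parents = new BitSet();
        parents.set(2);
        parents.set(5);
        System.out.println(parentOf(0, parents)); // 2
        System.out.println(parentOf(4, parents)); // 5

        // If the filter matches only SOME parents (say, parent 2 was filtered
        // out by an extra condition), child 0 wrongly resolves to parent 5.
        BitSet partial = new BitSet();
        partial.set(5);
        System.out.println(parentOf(0, partial)); // 5 -- wrong parent!
    }
}
```

This is why the "set of all parents" query must be kept separate from any per-request restriction on which parents you want returned (e.g. an fq).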
[jira] [Commented] (SOLR-2802) Toolkit of UpdateProcessors for modifying document values
[ https://issues.apache.org/jira/browse/SOLR-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205114#comment-13205114 ] Hoss Man commented on SOLR-2802: bq. In my opinion, I think if a user asks for min or max or some other computation, and this is not possible, it should return an error? otherwise why did they configure this in their chain? Agreed, i'm not sure what usecase i had in mind when i wrote min/max to "pass through" in this situation, but failing hard is definitely better -- at least by default. If someone comes up with a reason not to fail, we can always add an option later. I've committed this change in r1242625. bq. I think min/max should not extend this type-unsafe Subset base, as they should not return a subset anyway, but a singleton, and the input must be comparable... If you'd like to take a stab at refactoring, by all means be my guest. It's true, these instances don't need to return a subset, but even if we change them to not subclass that particular base class, I don't see any simple way to rewrite them such that they only accept a Collection. UpdateProcessors deal with SolrInputDocuments & SolrInputFields that are just bags of objects; the schema hasn't been consulted yet, so we don't have any hard type information about the types of these Objects (and even if we could we wouldn't want to consult the schema yet, because some of these "fields" might be for input purposes only -- some UpdateProcessor down the pipe might be copying/moving them to different fields). So if you want these Min/Max processors to have APIs that strictly enforce Collection<Comparable>, then some code somewhere needs to check that and cast appropriately -- at the moment, they delegate that responsibility to Collections.min and Collections.max, because that class does that check anyway as it does its computation. 
Personally i think the current impl is better anyway because in the common case of clients sending "clean data" we don't waste cycles checking the type of every Object sent before asking the Collections class to find the min/max and doing the check again anyway. if an exceptional case happens, we catch/log/wrap the exception. > Toolkit of UpdateProcessors for modifying document values > - > > Key: SOLR-2802 > URL: https://issues.apache.org/jira/browse/SOLR-2802 > Project: Solr > Issue Type: New Feature >Reporter: Hoss Man >Assignee: Hoss Man > Fix For: 4.0 > > Attachments: SOLR-2802_update_processor_toolkit.patch, > SOLR-2802_update_processor_toolkit.patch, > SOLR-2802_update_processor_toolkit.patch, > SOLR-2802_update_processor_toolkit.patch, > SOLR-2802_update_processor_toolkit.patch > > > Frequently users ask questions about things where the answer is "you > could do it with an UpdateProcessor", but the number of out-of-the-box > UpdateProcessors is generally lacking and there aren't even very good base > classes for the common case of manipulating field values when adding documents
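The delegation argument above can be seen with plain JDK collections: Collections.min already casts each element to Comparable as it scans, so mixed-type "dirty" input fails inside it with a ClassCastException without any separate up-front validation pass (minOf below is a hypothetical stand-in for what the processors delegate to, not Solr code):

```java
import java.util.Collections;
import java.util.List;

// Pure-JDK illustration: Collections.min does the Comparable cast itself,
// so clean input needs no extra type-checking pass and dirty input fails
// inside the call with a ClassCastException.
public class MinMaxDelegation {

    // Hypothetical stand-in for what a min/max processor delegates to.
    static Object minOf(List<?> values) {
        @SuppressWarnings({"rawtypes", "unchecked"})
        Object result = Collections.min((List) values);
        return result;
    }

    public static void main(String[] args) {
        // Clean input: min found directly, no validation pass needed.
        System.out.println(minOf(List.of(7, 2, 9)));       // 2
        System.out.println(minOf(List.of("b", "a", "c"))); // a

        // Dirty (mixed-type) input: the cast inside Collections.min throws.
        try {
            minOf(List.of(7, "two", 9));
        } catch (ClassCastException e) {
            System.out.println("mixed types rejected inside Collections.min");
        }
    }
}
```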
[jira] [Commented] (SOLR-3076) Solr should support block joins
[ https://issues.apache.org/jira/browse/SOLR-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204868#comment-13204868 ] Hoss Man commented on SOLR-3076: bq. Maybe instead of opening up end-user QP syntax to control block joins, there should be configuration that tells the query parser how to create the parent bits filter, which fields are in "child scope" vs "parent scope", etc.? wouldn't that still need to be a query time option to be useful to the full capability of block join?? from what i remember about block join: * you can have arbitrary depth of parent->child->grandchild->etc... * there's nothing preventing parents and children from having the same fields ...correct? so wouldn't it be kind of limiting if those types of options were configuration that couldn't be done per query/filter? (ie: in this fq i want only docs whose parents are size:[0 TO 1000], but in this other fq i want docs who are themselves size:[10 TO 40] ... if perhaps "parents" are books and the children being queried are "chapters" and "size" is number of pages) > Solr should support block joins > --- > > Key: SOLR-3076 > URL: https://issues.apache.org/jira/browse/SOLR-3076 > Project: Solr > Issue Type: New Feature >Reporter: Grant Ingersoll > Attachments: SOLR-3076.patch, bjq-vs-filters-backward-disi.patch, > bjq-vs-filters-illegal-state.patch, parent-bjq-qparser.patch, > parent-bjq-qparser.patch, solrconf-bjq-erschema-snippet.xml > > > Lucene has the ability to do block joins, we should add it to Solr.
[jira] [Commented] (LUCENE-3750) Convert Versioned docs to Markdown/New CMS
[ https://issues.apache.org/jira/browse/LUCENE-3750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204827#comment-13204827 ] Hoss Man commented on LUCENE-3750: -- FWIW: while i love the ability to do quick edits using the CMS bookmarklet, i have had absolutely zero success doing a full site build on my local machine -- i'm not even getting errors, i'm just getting no output. this may just be something really flaky about my machine, but it raises the question: for our _versioned_ per release documentation, is there really any advantage to using markdown over simply editing HTML directly? we used forrest for the versioned docs solely because once upon a time the entire site used to be versioned, and it was easy to just extract the versioned docs from the non-versioned docs and keep using forrest for both and have a similar look/feel, but is there really any reason why the versioned docs we ship in the release need to be in markdown and/or have the same look/feel as the website? yes, we'll archive them on the website for reference, but we also archive the javadocs and we've never tried to make them look like the forrest docs (or the new site style). why not just leave the versioned docs looking very simple, and very distinctive from the website, so when people browsing them online see them it's very obvious that they are *not* just part of the website, they are a snapshot of per version documentation? > Convert Versioned docs to Markdown/New CMS > -- > > Key: LUCENE-3750 > URL: https://issues.apache.org/jira/browse/LUCENE-3750 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Grant Ingersoll >Priority: Minor > > Since we are moving our main site to the ASF CMS (LUCENE-2748), we should > bring in any new versioned Lucene docs into the same format so that we don't > have to deal w/ Forrest anymore.
[jira] [Commented] (SOLR-3105) Add analysis configurations for different languages to the example
[ https://issues.apache.org/jira/browse/SOLR-3105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13203115#comment-13203115 ] Hoss Man commented on SOLR-3105: I'm with robert ... this issue is about coming up with good example configs for as many languages as we can. at the moment we have one big fat kitchen-sink set of example configs, so let's use what we've got. If people care strongly, we can track cleaning up and re-organizing the examples (to use xinclude, or add multiple more specifically targeted sets of example configs, etc...) in a different issue. > Add analysis configurations for different languages to the example > -- > > Key: SOLR-3105 > URL: https://issues.apache.org/jira/browse/SOLR-3105 > Project: Solr > Issue Type: Improvement >Reporter: Robert Muir > Fix For: 3.6, 4.0 > > Attachments: SOLR-3105.patch > > > I think we should have good baseline configurations for our supported > analyzers > so that it's easy for people to get started.
[jira] [Commented] (SOLR-3095) update processor chain should check for "enable" attribute on all processors
[ https://issues.apache.org/jira/browse/SOLR-3095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200268#comment-13200268 ] Hoss Man commented on SOLR-3095: I started trying to hack together a patch for this and then realized: i'm pretty sure this already works because of how PluginInfo.loadSubPlugins works (it ignores "children" for which isEnabled() is false), so all we need is a test case to verify and future-proof it. > update processor chain should check for "enable" attribute on all processors > > > Key: SOLR-3095 > URL: https://issues.apache.org/jira/browse/SOLR-3095 > Project: Solr > Issue Type: Improvement >Reporter: Hoss Man > > many types of plugins in Solr allow you to specify an "enabled" boolean when > configuring them, so you can use system properties in the configuration file > to determine at run time if they are actually used -- we should add low level > support for this type of setting on the individual processor declarations in > the UpdateRequestProcessorChain as well, so individual update processor > factories don't have to deal with this.
[jira] [Commented] (SOLR-3017) Allow edismax stopword filter factory implementation to be specified
[ https://issues.apache.org/jira/browse/SOLR-3017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200264#comment-13200264 ] Hoss Man commented on SOLR-3017: bq. Nope. I've just committed this change in trunk. There wasn't a good reason to use a more specific type (and it was not used anywhere). FWIW: I'm pretty sure the only reason any of these factories are declared to return specific types (instead of just TokenStream) was SOLR-396 -- which isn't really that important now that lucene & solr development is in a single repo and people can easily commit factories at the same time that new impls are added. > Allow edismax stopword filter factory implementation to be specified > > > Key: SOLR-3017 > URL: https://issues.apache.org/jira/browse/SOLR-3017 > Project: Solr > Issue Type: Improvement >Affects Versions: 4.0 >Reporter: Michael Dodsworth >Priority: Minor > Fix For: 4.0 > > Attachments: SOLR-3017-without-guava-alternative.patch, > SOLR-3017.patch, SOLR-3017.patch, edismax_stop_filter_factory.patch > > > Currently, the edismax query parser assumes that stopword filtering is being > done by StopFilter: the removal of the stop filter is performed by looking > for an instance of 'StopFilterFactory' (hard-coded) within the associated > field's analysis chain. > We'd like to be able to use our own stop filters whilst keeping the edismax > stopword removal goodness. The supplied patch allows the stopword filter > factory class to be supplied as a param, "stopwordFilterClassName". If no > value is given, the default (StopFilterFactory) is used. > Another option I looked into was to extend StopFilterFactory to create our > own filter. Unfortunately, StopFilterFactory's 'create' method returns > StopFilter, not TokenStream. StopFilter is also final.
[jira] [Commented] (SOLR-3047) DisMaxQParserPlugin drops my field in the phrase field list (pf) if it uses KeywordTokenizer instead of StandardTokenizer or Whitespace
[ https://issues.apache.org/jira/browse/SOLR-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200130#comment-13200130 ] Hoss Man commented on SOLR-3047: bq. Hoss, is there a way I can send you the example privately? [I'd rather not|https://people.apache.org/~hossman/#private_q] if you can't share the configs you are using, can't you at least add a quick example of something demonstrating your problem to the example schema.xml and post that? I just tried this example from Solr 3.5.0 (alphaNameSort uses KeywordTokenizer) and got exactly what i expected... {code} http://localhost:8983/solr/select?debugQuery=true&defType=dismax&qf=name&pf=alphaNameSort&q=foo%20bar%20baz +((DisjunctionMaxQuery((name:foo)) DisjunctionMaxQuery((name:bar)) DisjunctionMaxQuery((name:baz)) )~3 ) DisjunctionMaxQuery((alphaNameSort:foobarbaz)) {code} > DisMaxQParserPlugin drops my field in the phrase field list (pf) if it uses > KeywordTokenizer instead of StandardTokenizer or Whitespace > --- > > Key: SOLR-3047 > URL: https://issues.apache.org/jira/browse/SOLR-3047 > Project: Solr > Issue Type: Bug >Reporter: Antony Stubbs > > Has this got something to do with the minimum clause = 2 part in the code? It > drops it without warning - IMO it should error out if the field isn't > compatible. > If it is on purpose - i don't see why. I split with the ngram token filter, > so there is def more than 1 clause in the indexed field.
[jira] [Commented] (SOLR-3028) Support for additional query operators (feature parity request)
[ https://issues.apache.org/jira/browse/SOLR-3028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200119#comment-13200119 ] Hoss Man commented on SOLR-3028: #1) maybe i'm misunderstanding SOLR-2866 ... it talks about synonyms, but the crux of it is really indexing multiple variants of a stemmed word with information about whether it is a stem or not, and then being able to query on both -- your request seems to heavily overlap with that -- in Victor's case he may be using a dictionary based stemmer, and in your case you may want a heuristic stemmer, but the underlying plumbing should probably all be the same. #2) sorry, yeah i missed your label and only looked at the example. quorum search is definitely possible using the dismax parser with the mm param, but there is no explicit syntax for it in any parser i know of at the moment. #3) the curly braces in that example were just me being explicit about which parser was in use via local params -- that's not the query syntax. you could just as easily do... {code} defType=surround&q=(this W that) AND (other W next) {code} In general, my suggestion for moving forward would be to break these individual requests out into 3 distinct issues since they are largely unrelated (or only open two issues and ask about #1 in SOLR-2866 .. make an offshoot issue as needed). individual issues with more direct issue summaries are easier to track and more likely to encourage patches from people who see the summaries and realize it's something they are interested in. 
> Support for additional query operators (feature parity request) > --- > > Key: SOLR-3028 > URL: https://issues.apache.org/jira/browse/SOLR-3028 > Project: Solr > Issue Type: Improvement > Components: search >Affects Versions: 4.0 >Reporter: Mike > Labels: operator, queryparser > Original Estimate: 6h > Remaining Estimate: 6h > > I'm migrating my system from Sphinx Search, and there are a couple of > operators that are not available to Solr, which are available in Sphinx. > I would love to see the following added to the Dismax parser: > 1. Exact match. This might be tricky to get right, since it requires work on > the index side as well[1], but in Sphinx, you can do a query such as [ > =running walking ], and running will have stemming off, while walking will > have it on. > 2. Term quorum. In Sphinx and some commercial search engines (like Recommind, > Westlaw and Lexis), you can do a search such as [ (cat dog goat)/15 ], and > find the three words within 15 terms of each other. I think this is possible > in the backend via the span query, but there's no front end option for it, so > it's quite hard to reveal to users. > 3. Word order. Being able to say, "this term before that one, and this other > term before the next" is something else in Sphinx that span queries support, > but is missing in the query parser. Would be great to get this in too. > These seem like the three biggest missing operators in Solr to me. I would > love to help move these forward if there is any way I can help. > [1] At least, *I* think it does. There's some discussion of one way of doing > exact match like support in SOLR-2866.
[jira] [Commented] (SOLR-2649) MM ignored in edismax queries with operators
[ https://issues.apache.org/jira/browse/SOLR-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200102#comment-13200102 ] Hoss Man commented on SOLR-2649: bq. Counting multiple terms as 1 because they are in parenthesis together doesn't seem like a good idea to me. I disagree, but it definitely just seems like a matter of opinion -- i don't know that we could ever come up with something that makes sense in all use cases. personally i think the sanest change would be to say that "mm" applies to all top level SHOULD clauses in the query (regardless of whether they have an explicit OR or not) -- exactly as it always has in dismax. if a top level clause is a nested boolean query, then "mm" shouldn't apply to those because it doesn't make sense to blur the "count" of how many SHOULD clauses there are at the various levels. what would mm=5 mean for a query like "q=X AND Y (a b) (c d) (e f) (g h)" if you looked at all the nested subqueries? that only 5 of those 8 (lowercase) leaf level clauses are required? how would that be implemented on the underlying BooleanQuery objects w/o completely flattening the query (which would break the intent of the user when they grouped them)? ... it seems like mm=5 (or mm=100%) should mean 5 (or 100%) of the top level SHOULD clauses are required ... the default query op should determine how any top level clauses that are BooleanQueries are dealt with. ...but that's just my opinion. > MM ignored in edismax queries with operators > > > Key: SOLR-2649 > URL: https://issues.apache.org/jira/browse/SOLR-2649 > Project: Solr > Issue Type: Bug > Components: search >Affects Versions: 3.3 >Reporter: Magnus Bergmark >Priority: Minor > > Hypothetical scenario: > 1. User searches for "stocks oil gold" with MM set to "50%" > 2. User adds "-stockings" to the query: "stocks oil gold -stockings" > 3. 
User gets no hits since MM was ignored and all terms were AND-ed > together > The behavior seems to be intentional, although the reason why is never > explained: > // For correct lucene queries, turn off mm processing if there > // were explicit operators (except for AND). > boolean doMinMatched = (numOR + numNOT + numPluses + numMinuses) == 0; > (lines 232-234 taken from > tags/lucene_solr_3_3/solr/src/java/org/apache/solr/search/ExtendedDismaxQParserPlugin.java) > This makes edismax unsuitable as a replacement for dismax; mm is one of the > primary features of dismax.
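The "mm applies only to top-level SHOULD clauses" semantics argued for above can be sketched with toy arithmetic (hypothetical code; Solr's real mm spec also supports absolute counts and conditional clauses, which this ignores): q=X AND Y (a b) (c d) (e f) (g h) has 4 top-level SHOULD clauses regardless of how many leaf terms the groups contain.

```java
// Toy sketch of percentage-based minimum-should-match: mm counts top-level
// SHOULD clauses only, never the leaf clauses inside nested BooleanQueries.
public class MinShouldMatch {

    // mmPercent in [0, 100]; fractional results round down, matching the
    // usual "at least N%" reading of the parameter.
    static int requiredClauses(int topLevelShouldClauses, int mmPercent) {
        return topLevelShouldClauses * mmPercent / 100;
    }

    public static void main(String[] args) {
        // The 4 groups (a b) (c d) (e f) (g h) are the SHOULD clauses;
        // X AND Y are MUST clauses and are unaffected by mm.
        System.out.println(requiredClauses(4, 100)); // 4 (all groups)
        System.out.println(requiredClauses(4, 50));  // 2
        System.out.println(requiredClauses(3, 50));  // 1
    }
}
```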
[jira] [Commented] (SOLR-2368) Improve extended dismax (edismax) parser
[ https://issues.apache.org/jira/browse/SOLR-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200079#comment-13200079 ] Hoss Man commented on SOLR-2368: bq. If/when eDismax can be configured to fill the original role of DisMax, why should we maintain the old one? my chief concerns -- as i mentioned -- are that _currently_ edismax has behavior dismax doesn't support that people may actively *not* want, and that edismax may have quirks dismax doesn't that we have yet to discover and don't realize, because the overall test coverage is low, the EDismaxQParser is significantly more complex, and there are so many weird edge cases. But sure: if SOLR-3086 makes it possible to configure EDisMaxQParser to behave the same as DisMaxQParser, and if we feel confident through testing that (when configured as such) they behave the same, i won't have any objections whatsoever to retiring the DisMaxQParser class to simplify code maintenance. bq. Personally I don't think we should worry about the added features after edismax becomes dismax. this part i don't understand ... even if all of the functionality ultimately merges and only the EDisMaxQParser remains, why should defType=dismax and defType=edismax suddenly become the same thing? why not offer two instances by default, "edismax" which is open and everything defaults to on, and "dismax" where it's more locked down like it is today? ... what is gained by changing the default behavior when people use "defType=dismax"? (as i said before, in a slightly diff way above: would you suggest that defType=lucene should now be an EDisMaxQParser instance as well? 
with a CHANGES.txt note telling people that if they only want features LuceneQParser supported, they have to add invariant params to disable them) > Improve extended dismax (edismax) parser > > > Key: SOLR-2368 > URL: https://issues.apache.org/jira/browse/SOLR-2368 > Project: Solr > Issue Type: Improvement > Components: search >Reporter: Yonik Seeley > Labels: QueryParser > > This is a "mother" issue to track further improvements for the eDismax parser. > The goal is to be able to deprecate and remove the old dismax once edismax > satisfies all usecases of dismax.
[jira] [Commented] (SOLR-3026) eDismax: Locking down which fields can be explicitly queried (user fields aka uf)
[ https://issues.apache.org/jira/browse/SOLR-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200067#comment-13200067 ] Hoss Man commented on SOLR-3026: bq. If we aim to let edismax replace dismax, people may want it to behave like dismax out of the box I don't think that should be the goal. plenty of people are using "edismax" already because they like the fact that it is a superset of the dismax & lucene features, and the defaults for "edismax" should embrace that. if/when EDisMaxQParser reaches the point that it can be configured to work exactly the same as DisMaxQParser, then it may be worth considering defaulting "dismax" => an EDisMaxQParser instance configured that way, but that doesn't mean "edismax" shouldn't expose all of its bells and whistles by default. uf=* as a default should be fine -- the only reason to question it would be if it was hard to disable, but the "-*" syntax is so easy it's not worth worrying about it. > eDismax: Locking down which fields can be explicitly queried (user fields aka > uf) > - > > Key: SOLR-3026 > URL: https://issues.apache.org/jira/browse/SOLR-3026 > Project: Solr > Issue Type: Improvement > Components: search >Affects Versions: 3.1, 3.2, 3.3, 3.4, 3.5 >Reporter: Jan Høydahl >Assignee: Jan Høydahl > Fix For: 3.6, 4.0 > > Attachments: SOLR-3026.patch, SOLR-3026.patch, SOLR-3026.patch, > SOLR-3026.patch, SOLR-3026.patch, SOLR-3026.patch, SOLR-3026.patch > > > We need a way to specify exactly what fields should be available to the end > user as fielded search. > In the original SOLR-1553, there's a patch implementing "user fields", but it > was never committed even if that issue was closed. -- This message is automatically generated by JIRA. 
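[Editor's note: the "uf" param and "-*" subtraction syntax discussed in the comment above might be configured roughly like this -- a sketch only; the handler name and field names are hypothetical, not from the issue:]

```xml
<!-- Sketch (hypothetical config): locking down edismax user fields.
     uf="* -price" would allow fielded search on every field except price;
     the permissive default discussed above is simply uf="*". -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="qf">title body</str>
    <str name="uf">* -price</str>
  </lst>
</requestHandler>
```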
[jira] [Commented] (SOLR-3028) Support for additional query operators (feature parity request)
[ https://issues.apache.org/jira/browse/SOLR-3028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1310#comment-1310 ] Hoss Man commented on SOLR-3028: #1) you can either index both stemmed and non-stemmed in diff fields, and then specify the appropriate field name at query time for each input word to control what gets queried, or something like SOLR-2866 would be needed along with additional filters to record in the terms whether it's stemmed/unstemmed (possible with the payload?) so it's available at query time #2) already possible with the standard lucene syntax: "cat dog goat"~15 #3) is already possible on trunk with the surround parser (SOLR-2703) -- although there isn't a lot of documentation out there about the syntax... {code} {!surround}(this W that) AND (other W next) {code} ...it seems like the only real missing piece is some query side support for SOLR-2866, and it seems like that would best be tracked in SOLR-2866 right? ... make sure everything works all the way through the system? > Support for additional query operators (feature parity request) > --- > > Key: SOLR-3028 > URL: https://issues.apache.org/jira/browse/SOLR-3028 > Project: Solr > Issue Type: Improvement > Components: search >Affects Versions: 4.0 >Reporter: Mike > Labels: operator, queryparser > Original Estimate: 6h > Remaining Estimate: 6h > > I'm migrating my system from Sphinx Search, and there are a couple of > operators that are not available to Solr, which are available in Sphinx. > I would love to see the following added to the Dismax parser: > 1. Exact match. This might be tricky to get right, since it requires work on > the index side as well[1], but in Sphinx, you can do a query such as [ > =running walking ], and running will have stemming off, while walking will > have it on. > 2. Term quorum. 
In Sphinx and some commercial search engines (like Recommind, > Westlaw and Lexis), you can do a search such as [ (cat dog goat)/15 ], and > find the three words within 15 terms of each other. I think this is possible > in the backend via the span query, but there's no front end option for it, so > it's quite hard to reveal to users. > 3. Word order. Being able to say, "this term before that one, and this other > term before the next" is something else in Sphinx that span queries support, > but is missing in the query parser. Would be great to get this in too. > These seem like the three biggest missing operators in Solr to me. I would > love to help move these forward if there is any way I can help. > [1] At least, *I* think it does. There's some discussion of one way of doing > exact match like support in SOLR-2866. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
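[Editor's note: the two syntaxes mentioned in the comment above can be sketched as follows. Field values and distances are placeholders; the surround parser's W (ordered) and N (unordered) operators also accept a prefixed distance:]

```text
# three words near each other (standard lucene proximity/slop syntax):
q="cat dog goat"~15

# surround parser: W = ordered ("this before that"), N = unordered
q={!surround}(this W that) AND (other W next)

# surround with an explicit distance, ordered, e.g. within 3 positions:
q={!surround}3W(cat, dog)
```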
[jira] [Commented] (SOLR-3093) Remove unused features and
[ https://issues.apache.org/jira/browse/SOLR-3093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13199930#comment-13199930 ] Hoss Man commented on SOLR-3093: I think yonik's point is that unlike things in SOLR-1052 where existing users would have a reasonable expectation that the syntax would definitively do something (ie: use specific classes/settings), the config in this issue was _always_ just an optimization hint, and the system ultimately works fine even if/when it is ignored. Personally i think that in these cases, it would be sufficient to WARN that these optimization hints are no longer used and being ignored so people can clean up if/when they want, but since they don't *have* to change anything to have a working solr instance (that still externally behaves the way it would in older versions of solr) there's no reason to FAIL and annoy them. > Remove unused features and > --- > > Key: SOLR-3093 > URL: https://issues.apache.org/jira/browse/SOLR-3093 > Project: Solr > Issue Type: Improvement >Reporter: Jan Høydahl > Fix For: 3.6, 4.0 > > > SolrConfig.java still tries to parse > But the only user of this param was SolrIndexSearcher.java line 366-381 which > is commented out. > Probably the whole logic should be ripped out, and we fail hard if we find > this config option in solrconfig.xml > Also, the config option is old and no longer used or needed? > There is some code which tries to use it but I believe that since 1.4 there > are more efficient ways to do the same. Should we also fail-fast if found in > config or only print a warning? -- This message is automatically generated by JIRA. 
[jira] [Commented] (SOLR-3047) DisMaxQParserPlugin drops my field in the phrase field list (pf) if it uses KeywordTokenizer instead of StandardTokenizer or Whitespace
[ https://issues.apache.org/jira/browse/SOLR-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13199076#comment-13199076 ] Hoss Man commented on SOLR-3047: I can't make heads or tails of this bug report ... at a minimum we need to see... * what the full request params look like for an example request * what the debugQuery output looks like for an example request (including the echoParams and query parsing info) * how the requesthandler in use is configured * the field and fieldtype information for every field used by dismax (ie: mentioned in the request params or request handler defaults) > DisMaxQParserPlugin drops my field in the phrase field list (pf) if it uses > KeywordTokenizer instead of StandardTokenizer or Whitespace > --- > > Key: SOLR-3047 > URL: https://issues.apache.org/jira/browse/SOLR-3047 > Project: Solr > Issue Type: Bug >Reporter: Antony Stubbs > > Has this got something to do with the minimum clause = 2 part in the code? It > drops it without warning - IMO it should error out if the field isn't > compatible. > If it is on purpose - i don't see why. I split with the ngram token filter, > so there is def more than 1 clause in the indexed field. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
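[Editor's note: a sketch of the kind of debug request asked for in the comment above; host, port, core, and all param values are placeholders:]

```text
http://localhost:8983/solr/select
    ?defType=dismax
    &q=some+query
    &qf=title+body
    &pf=my_keyword_field
    &debugQuery=true
    &echoParams=all
```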
[jira] [Commented] (SOLR-3085) Fix the dismax/edismax stopwords mm issue
[ https://issues.apache.org/jira/browse/SOLR-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13198361#comment-13198361 ] Hoss Man commented on SOLR-3085: bq. Then, one way may be to subtract the MM count accordingly, so that in our case above, when we detect that the DisMax clause for "the" does not contain "title_en", we do mm=mm-1 which will give us an MM of 1 instead of 2 and we'll get hits. This is probably the easiest solution. that wouldn't make any sense ... in your example that would result in the query matching every doc containing "alltags:the" (or "title_en:contract", or "alltags:contract") which hardly seems like what the user is likely to expect if they used mm=100% (with or w/o a "mm.sw=false" param) bq. Another way would be to keep mm as is, and move the affected clause out of the BooleanQuery and add it as a BoostQuery instead? something like that might work ... but i haven't thought it through very hard ... i have a nagging feeling that there are non-stopword cases that would be indistinguishable (to the parser) from this type of stopword case, and thus would also trigger this logic undesirably, but i can't articulate what they might be off the top of my head. > Fix the dismax/edismax stopwords mm issue > - > > Key: SOLR-3085 > URL: https://issues.apache.org/jira/browse/SOLR-3085 > Project: Solr > Issue Type: Bug > Components: search >Reporter: Jan Høydahl > Labels: MinimumShouldMatch, dismax, stopwords > Fix For: 3.6, 4.0 > > > As discussed here http://search-lucene.com/m/Wr7iz1a95jx and here > http://search-lucene.com/m/Yne042qEyCq1 and here > http://search-lucene.com/m/RfAp82nSsla DisMax has an issue with stopwords if > not all fields used in QF have exactly same stopword lists. > Typical solution is to not use stopwords or harmonize stopword lists across > all fields in your QF, or relax the MM to a lower percentage. Sometimes these > are not acceptable workarounds, and we should find a better solution. 
[jira] [Commented] (SOLR-2368) Improve extended dismax (edismax) parser
[ https://issues.apache.org/jira/browse/SOLR-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13198340#comment-13198340 ] Hoss Man commented on SOLR-2368: bq. are there any blockers left for retiring the old dismax parser? As i've mentioned before, I don't think DismaxQParser should ever be retired ... i'm still not convinced that the (default) parser you get when using "defType=dismax" should change to an ExtendedDismaxQParser instance My three main reasons for (still) feeling this way are: * I see no advantage to changing what QParser you get (by default) when asking for "dismax" ... not when it's so easy for new users (or old users who want to switch) to just use "edismax" by name. (or explicitly declare their own instance of ExtendedDismaxQParser with the name "dismax" if that's what they always want) * ExtendedDismaxQParser is a significantly more complex beast than DismaxQParser, and likely to have a lot of little quirks (and bugs) that no one has really noticed yet. For people who are happy with DismaxQParser, we should leave well enough alone. * Even with things like SOLR-3026 allowing you to disable field specific queries, ExtendedDismaxQParser still supports more types of queries/syntax than DismaxQParser (ie: fuzzy queries, prefix queries, wildcard queries, etc...) which may have performance impacts on existing dismax users, many of whom probably don't want to start allowing them from their users -- particularly considering that limited syntax w/o metacharacters was a major advertised advantage of using dismax from day 1. Please note: i have no tangible objection to smiley's suggestion that... bq. defType should default to ... [edismax] in Solr 4 ...if folks think that the ExtendedDismaxQParser would make a better default than the LuceneQParser moving forward, i've got no objection to that -- but if someone explicitly asks for "defType=dismax" by name, that should be the DismaxQParser (and its limited syntax) ... 
ExtendedDismaxQParser is a completely different animal. saying defType=dismax should return an ExtendedDismaxQParser makes as much sense to me as saying that defType=lucene should return an ExtendedDismaxQParser -- just because the legal syntax of edismax is a superset of dismax/lucene doesn't mean they are equivalent or that we should assume "it's better" for people who ask for a specific QParser by name. > Improve extended dismax (edismax) parser > > > Key: SOLR-2368 > URL: https://issues.apache.org/jira/browse/SOLR-2368 > Project: Solr > Issue Type: Improvement > Components: search >Reporter: Yonik Seeley > Labels: QueryParser > > This is a "mother" issue to track further improvements for eDismax parser. > The goal is to be able to deprecate and remove the old dismax once edismax > satisfies all usecases of dismax. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3085) Fix the dismax/edismax stopwords mm issue
[ https://issues.apache.org/jira/browse/SOLR-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13198104#comment-13198104 ] Hoss Man commented on SOLR-3085: bq. So we get a required DisMax Query for alltags:the which does not match any docs. I think you are misreading that output... {code} +( ( DisjunctionMaxQuery((alltags:the)~0.01) DisjunctionMaxQuery((title_en:contract | alltags:contract)~0.01) )~2 ) {code} The "DisjunctionMaxQuery((alltags:the)~0.01)" clause is not required in that query. it is one of two SHOULD clauses in a boolean query, and becomes subject to the same "mm" rule. both clauses in that BooleanQuery are already SHOULD clauses, so i don't know what it would mean to make them more "optional". > Fix the dismax/edismax stopwords mm issue > - > > Key: SOLR-3085 > URL: https://issues.apache.org/jira/browse/SOLR-3085 > Project: Solr > Issue Type: Bug > Components: search >Reporter: Jan Høydahl > Labels: MinimumShouldMatch, dismax, stopwords > Fix For: 3.6, 4.0 > > > As discussed here http://search-lucene.com/m/Wr7iz1a95jx and here > http://search-lucene.com/m/Yne042qEyCq1 and here > http://search-lucene.com/m/RfAp82nSsla DisMax has an issue with stopwords if > not all fields used in QF have exactly same stopword lists. > Typical solution is to not use stopwords or harmonize stopword lists across > all fields in your QF, or relax the MM to a lower percentage. Sometimes these > are not acceptable workarounds, and we should find a better solution. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
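[Editor's note: the mm semantics discussed above can be sketched outside Solr. This is a toy model, not Solr code; the field and term values are hypothetical, mirroring the "alltags:the" example:]

```python
# Toy model of a BooleanQuery of SHOULD clauses with minimum-should-match,
# illustrating why a stopword-only clause plus mm=100% matches nothing.

def matches(doc_terms_by_field, clauses, mm):
    """Each clause is a list of (field, term) alternatives (a dismax clause);
    the doc satisfies the query if at least mm clauses match."""
    hits = sum(
        1 for alternatives in clauses
        if any(term in doc_terms_by_field.get(field, ())
               for field, term in alternatives)
    )
    return hits >= mm

# "the" was stripped from title_en at index time, so the parser emitted a
# clause against alltags only; if alltags strips "the" at index time too,
# that clause can never match any doc.
doc = {"title_en": {"contract"}, "alltags": {"contract"}}
clauses = [
    [("alltags", "the")],                                 # stopword-only clause
    [("title_en", "contract"), ("alltags", "contract")],
]
print(matches(doc, clauses, mm=2))  # False -- mm=100% finds nothing
print(matches(doc, clauses, mm=1))  # True
```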
[jira] [Commented] (SOLR-3033) "numberToKeep" on replication handler does not work with "backupAfter"
[ https://issues.apache.org/jira/browse/SOLR-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13198076#comment-13198076 ] Hoss Man commented on SOLR-3033: bq. Without a way to declare a default in solrconfig.xml, the user has no way to use this parameter should a backup be triggered by "backupAfter". right -- my point is we already have a convention for specifying "default params for a request handler" but your patch doesn't use that convention. bq. We don't have a section for request parameters, do we? Any request handler that subclasses RequestHandlerBase automatically gets defaults applied when handleRequest is called if they are specified in the configs (the syntax isn't "" it's "") bq. If we kept it as a request-param only, but then let the user specify defaults, would that create a legal and section nested within and , so users can specify defaults for each? I'm not sure that would really make sense ... what if an instance was acting as a repeater so it's both a master and a slave? if you told it to create a backup, how many would it keep if there was a different value specified in the master/slave sections? I think maybe you've hit the nail on the head here... {quote} And looking at the available request parameters, we probably wouldn't want defaults for any of them ... This makes me wonder if my first try was a mistake. Possibly this should only be an init-param. {quote} So perhaps the way forward is... 
* keep the "numberToKeep" request param around for backcompat with Solr 3.5 for people who want to manually specify it when triggering command=backup * add a new init param for ReplicationHandler to specify how many backups to keep when backups are made -- the name for this new param should probably _not_ be numberToKeep (suggestion: "maxNumberOfBackups") because: ** we need a name that clarifies it's specific to backups ** we want a name that is distinct from the request param so in docs it's clear which one is being referred to * document clearly the interaction between the maxNumberOfBackups init param and the numberToKeep request param (suggestion: "the numberToKeep request param can be used with the backup command unless the maxNumberOfBackups init param has been specified on the handler -- in which case maxNumberOfBackups is always used and attempts to use the numberToKeep request param will cause an error") what do you think? > "numberToKeep" on replication handler does not work with "backupAfter" > -- > > Key: SOLR-3033 > URL: https://issues.apache.org/jira/browse/SOLR-3033 > Project: Solr > Issue Type: Bug > Components: replication (java) >Affects Versions: 3.5 > Environment: openjdk 1.6, linux 3.x >Reporter: Torsten Krah > Attachments: SOLR-3033.patch > > > Configured my replication handler like this: > <requestHandler name="/replication" class="solr.ReplicationHandler"> > <lst name="master"> > <str name="replicateAfter">startup</str> > <str name="replicateAfter">commit</str> > <str name="replicateAfter">optimize</str> > <str name="confFiles">elevate.xml,schema.xml,spellings.txt,stopwords.txt,stopwords_de.txt,stopwords_en.txt,synonyms_de.txt,synonyms.txt</str> > <str name="backupAfter">optimize</str> > <str name="numberToKeep">1</str> > </lst> > </requestHandler> > > So after optimize a snapshot should be taken, this works. But numberToKeep is > ignored, snapshots are increasing with each call to optimize and are kept > forever. Seems this setting has no effect. -- This message is automatically generated by JIRA. 
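[Editor's note: Hoss's proposal above might end up looking something like this in solrconfig.xml. This is a sketch of a *proposed* init param, not an existing option at the time of this thread:]

```xml
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
    <str name="backupAfter">optimize</str>
  </lst>
  <!-- proposed init param: cap the number of retained backups; per the
       suggestion above, supplying the numberToKeep request param when
       this is set would be an error -->
  <str name="maxNumberOfBackups">1</str>
</requestHandler>
```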
[jira] [Commented] (SOLR-1052) Deprecate/Remove <indexDefaults> in favor of <mainIndex> in solrconfig.xml
[ https://issues.apache.org/jira/browse/SOLR-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197487#comment-13197487 ] Hoss Man commented on SOLR-1052: bq. This is how config deprecations should work in my opinion. No need to advertise to new users the use of a syntax that we want to go away. It would be more confusing for the C) people to see deprecation warnings being printed OOTB from their brand new search engine without knowing how to fix it +1 Patch looks good to me. my one suggestion, since there seems to be consensus that solr should complain louder when there are config errors, is that instead of removing the existing "warn" calls on those already deprecated Solr 1.x legacy conf syntax, why not leave in those checks but replace the "warn(...)" calls with "throw new SolrException(...)"? > Deprecate/Remove <indexDefaults> in favor of <mainIndex> in solrconfig.xml > -- > > Key: SOLR-1052 > URL: https://issues.apache.org/jira/browse/SOLR-1052 > Project: Solr > Issue Type: Improvement >Reporter: Grant Ingersoll >Assignee: Jan Høydahl >Priority: Minor > Labels: solrconfig.xml > Fix For: 3.6, 4.0 > > Attachments: SOLR-1052-3x.patch > > > Given that we now handle multiple cores via the solr.xml and the discussion > around <indexDefaults> and <mainIndex> at > http://www.lucidimagination.com/search/p:solr?q=mainIndex+vs.+indexDefaults > We should deprecate/remove the use of indexDefaults and just rely on > mainIndex, as it doesn't seem to serve any purpose and is confusing to > explain. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3033) numberToKeep on replication handler does not work - snapshots are increasing beyond configured maximum
[ https://issues.apache.org/jira/browse/SOLR-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197387#comment-13197387 ] Hoss Man commented on SOLR-3033: James: two things to think about. 1) when adding new test configs, try to keep them as minimal as possible, so the only things in them are things that *have* to be there for the purposes of the test. 2) there are really two types of "params" when dealing with request handlers -- init params (ie: things in the body of the requestHandler tag in solrconfig.xml) and request params (things passed to the handler when it is executed). via RequestHandlerBase many request handlers support the idea of _init_ params named "defaults", "invariants" and "appends" which can contain sub-params that are consulted when parsing/processing _request_ params in handleRequest. In the case of "numberToKeep", this is already a _request_ param, and ReplicationHandler already subclasses RequestHandlerBase, which means people can define a "defaults" section in their ReplicationHandler config so any requests to "http://master_host:port/solr/replication?command=backup" get that value automatically. but your patch seems to add support for an _init_ param with the same name, which raises questions like "what happens if i specify different values for numberToKeep in init params and in invariant params?" it seems like the crux of the problem is that if you use the "backupAfter" option, the code path to create the backup bypasses a lot of the logic that is normally used when a backup command is processed via handleRequest. So instead of adding an init param version of numberToKeep, perhaps it would be better if the "backupAfter" codepath followed the same code path as handleRequest as much as possible? perhaps it could be something as straightforward as changing the meat of getEventListener to look like... 
{code} SolrQueryRequest req = new LocalSolrQueryRequest(core, ...); try { ReplicationHandler.this.handleRequest(req, new SolrQueryResponse()); } finally { req.close(); } {code} what do you think? > numberToKeep on replication handler does not work - snapshots are increasing > beyond configured maximum > -- > > Key: SOLR-3033 > URL: https://issues.apache.org/jira/browse/SOLR-3033 > Project: Solr > Issue Type: Bug > Components: replication (java) >Affects Versions: 3.5 > Environment: openjdk 1.6, linux 3.x >Reporter: Torsten Krah > Attachments: SOLR-3033.patch > > > Configured my replication handler like this: > <requestHandler name="/replication" class="solr.ReplicationHandler"> > <lst name="master"> > <str name="replicateAfter">startup</str> > <str name="replicateAfter">commit</str> > <str name="replicateAfter">optimize</str> > <str name="confFiles">elevate.xml,schema.xml,spellings.txt,stopwords.txt,stopwords_de.txt,stopwords_en.txt,synonyms_de.txt,synonyms.txt</str> > <str name="backupAfter">optimize</str> > <str name="numberToKeep">1</str> > </lst> > </requestHandler> > > So after optimize a snapshot should be taken, this works. But numberToKeep is > ignored, snapshots are increasing with each call to optimize and are kept > forever. Seems this setting has no effect. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3060) add highlighter support to SurroundQParserPlugin
[ https://issues.apache.org/jira/browse/SOLR-3060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197168#comment-13197168 ] Hoss Man commented on SOLR-3060: This patch is straightforward and includes tests (thank you so much for the tests). The meat of the change is that getHighlightQuery is overridden to attempt query rewriting, which gives me two concerns... 1) at a minimum i'm pretty sure super.getHighlightQuery still needs to be called. 2) is this rewriting of the query done in the SurroundQParser going to cause any problems or unexpected behavior in conjunction with the highlighter component logic that already decides if/when to rewrite the query? If the crux of the problem is that HighlightComponent rewrites the query automatically _except_ when using the phrase highlighter with the multi-term option (assuming i'm reading the code correctly) then shouldn't that code path of the highlighter be modified to do something sane with any type of Query object? ... why isn't it responsible for calling rewrite on any sub-query of a type it doesn't understand? (Highlighting is one of the areas of Lucene/Solr that frequently makes my head hurt, so forgive me if these are silly questions) > add highlighter support to SurroundQParserPlugin > - > > Key: SOLR-3060 > URL: https://issues.apache.org/jira/browse/SOLR-3060 > Project: Solr > Issue Type: Improvement > Components: search >Affects Versions: 4.0 >Reporter: Ahmet Arslan >Priority: Minor > Fix For: 4.0 > > Attachments: SOLR-3060.patch, SOLR-3060.patch > > > Highlighter does not recognize SrndQuery family. > http://search-lucene.com/m/FuDsU1sTjgM > http://search-lucene.com/m/wD8c11gNTb61 -- This message is automatically generated by JIRA. 
[jira] [Commented] (SOLR-3034) replicateAfter optimize not working
[ https://issues.apache.org/jira/browse/SOLR-3034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197119#comment-13197119 ] Hoss Man commented on SOLR-3034: FWIW: replicateAfter=commit is a superset of replicateAfter=optimize, because every optimize command is also a commit command ... if you are replicating after every commit then you are automatically replicating after every optimize. that's the reason the UI only shows "commit, startup", it's "deduping" the list. However: that wouldn't explain why you aren't seeing replication happen after an optimize. Are you seeing replication happen at all? I just tried a quick sanity check using trunk r1237878 with the example modified to act as a master with replicateAfter commit & optimize and i was definitely seeing http://localhost:8983/solr/replication?command=indexversion return a new indexversion after every commit or optimize. when i changed the config to *only* use replicateAfter=optimize, then indexversion would return a new version after every optimize command, but not after every commit ...so things are working exactly as expected on the master side from what i can see. > replicateAfter optimize not working > --- > > Key: SOLR-3034 > URL: https://issues.apache.org/jira/browse/SOLR-3034 > Project: Solr > Issue Type: Bug >Affects Versions: 4.0 >Reporter: Antony Stubbs > > I have: > {noformat} > optimize > commit > startup > {noformat} > But the UI only shows: > {noformat} > replicateAfter:commit, startup > {noformat} > And sure enough, optimizing does not cause a replication to happen. > Also, replicating an optimized index, does not seem to keep it "optimized" on > the slave. Is that really the case, or is it a bug? I would expect if an > index is optimized on master, when it is then replicated to slaves, the > slaves would receive the optimized index. -- This message is automatically generated by JIRA. 
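[Editor's note: the "commit subsumes optimize" point made above means a master config like this sketch replicates after optimizes as well, which is why the UI dedupes the list:]

```xml
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <!-- every optimize is also a commit, so "commit" alone covers both;
         listing "optimize" too is harmless but redundant -->
    <str name="replicateAfter">startup</str>
    <str name="replicateAfter">commit</str>
  </lst>
</requestHandler>
```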