[jira] Commented: (SOLR-1163) Solr Explorer - A generic GWT client for Solr
[ https://issues.apache.org/jira/browse/SOLR-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855079#action_12855079 ] Hoss Man commented on SOLR-1163: I vote for a separate war: primarily because i think it would be a good way to encourage/force us to make sure that all the functionality needed to power a good GUI is exposed via RequestHandlers (which will help make it easy for other people to write their own custom tools for controlling solr in non-standard ways). If it lives in the same war, it's too easy to just directly access public Java level APIs that don't have an HTTP corollary. That said: i don't see any downside to it being a contrib living right in the solr code base. Solr Explorer - A generic GWT client for Solr - Key: SOLR-1163 URL: https://issues.apache.org/jira/browse/SOLR-1163 Project: Solr Issue Type: New Feature Components: web gui Affects Versions: 1.3 Reporter: Uri Boness Attachments: graphics.zip, solr-explorer.patch, solr-explorer.patch The attached patch is a GWT generic client for solr. It is currently standalone, meaning that once built, one can open the generated HTML file in a browser and communicate with any deployed solr. It is configured with its own configuration file, where one can configure the solr instance/core to connect to. Since it's currently standalone and completely client side based, it uses JSON with padding (cross-site scripting) to connect to remote solr servers. Some of the supported features: - Simple query search - Sorting - one can dynamically define new sort criteria - Search results are rendered very much like Google search results are rendered. It is also possible to view all stored field values for every hit. - Custom hit rendering - It is possible to show thumbnails (images) per hit and also customize a view for a hit based on html templates - Faceting - one can dynamically define field and query facets via the UI. 
it is also possible to pre-configure these facets in the configuration file. - Highlighting - you can dynamically configure highlighting. it can also be pre-configured in the configuration file - Spellchecking - you can dynamically configure spell checking. Can also be done in the configuration file. Supports collation. It is also possible to send build and reload commands. - Data import handler - if used, it is possible to send a full-import and status command (delta-import is not implemented yet, but it's easy to add) - Console - For development time, there's a small console which can help to better understand what's going on behind the scenes. One can use it to: ** view the client logs ** browse the solr schema ** View a break down of the current search context ** View a break down of the query URL that is sent to solr ** View the raw JSON response returning from Solr This client is actually a platform that can be greatly extended for more things. The goal is to have a client where the explorer part is just one view of it. Other future views include: Monitoring, Administration, Query Builder, DataImportHandler configuration, and more... To get a better view of what's currently possible, we've set up a public version of this client at: http://search.jteam.nl/explorer. This client is configured with one solr instance where crawled YouTube movies were indexed. You can also check out a screencast for this deployed client: http://search.jteam.nl/help The patch creates a new folder in the contrib directory. Since the patch doesn't contain binaries, an additional zip file is provided that needs to be extracted to add all the required graphics. This module is maven2 based and is configured in such a way that all GWT related tools/libraries are automatically downloaded when the module is compiled. One of the artifacts of the build is a war file which can be deployed in any servlet container. 
NOTE: this client works best on WebKit based browsers (for performance reasons) but also works on firefox and ie 7+. That said, it should be taken into account that it is still under development. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1801) Delete record by id which is inserted using dedupe processor
[ https://issues.apache.org/jira/browse/SOLR-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853445#action_12853445 ] Hoss Man commented on SOLR-1801: I'm really not sure what the 'bug' here is. There are lots of use cases where delete by Id isn't practical -- this is just one more of those cases. Perhaps you should start a thread on solr-user explaining your full use case a little better, and clarifying what it is you want/need/expect to do when using deduplication that you don't feel you can do right now. Delete record by id which is inserted using dedupe processor Key: SOLR-1801 URL: https://issues.apache.org/jira/browse/SOLR-1801 Project: Solr Issue Type: Bug Components: update Affects Versions: 1.4 Reporter: Subroto Sanyal Fix For: 1.5 A record added with a unique key generated by the dedupe processor can't be deleted using delete by id, as the id is generated by hashing and is unknown to the user.
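[Editor's note] When delete-by-id isn't practical, the usual alternative is Solr's delete-by-query update message, matching on a field whose value the client does know. A minimal sketch, assuming a hypothetical stored field named url that uniquely identifies the source record:

```xml
<!-- POSTed to the /update handler; the query selects the document whose
     dedupe-generated uniqueKey is unknown, via a field we do know. -->
<delete>
  <query>url:"http://example.com/page-1"</query>
</delete>
```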
[jira] Commented: (SOLR-1553) extended dismax query parser
[ https://issues.apache.org/jira/browse/SOLR-1553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853447#action_12853447 ] Hoss Man commented on SOLR-1553: Jonathan: looking at the code, it seems completely plausible, and that's the direction i was going down with that previous patch -- but i got hung up on the fact that for reasons i couldn't identify, clauses referring to fieldnames that don't exist in the schema are getting dropped -- need to track down where that is happening and stop it, so the new code can look at those field names and treat them as aliases (to resolve to other fields) I just need to find time to dig into it more -- but if you want to take a swing at fixing edismax.unescapedcolon.bug.test.patch and then improving on edismax.userFields.patch, by all means be my guest. extended dismax query parser Key: SOLR-1553 URL: https://issues.apache.org/jira/browse/SOLR-1553 Project: Solr Issue Type: New Feature Reporter: Yonik Seeley Fix For: 1.5 Attachments: edismax.unescapedcolon.bug.test.patch, edismax.userFields.patch, SOLR-1553.patch, SOLR-1553.pf-refactor.patch An improved user-facing query parser based on dismax
[jira] Commented: (SOLR-1860) improve stopwords list handling
[ https://issues.apache.org/jira/browse/SOLR-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853679#action_12853679 ] Hoss Man commented on SOLR-1860: bq. I like the idea of an export - it's transparent and neatly handles back compat concerns. that's the same conclusion robert and i came to on IRC ... being able to load directly sounds less redundant, but as soon as a user wants to customize (and let's face it: stop words can easily be domain specific) we need a way of exporting that's convenient even for novice users who don't know anything about jars and wars. bq. Not sure at this point if it makes more sense trying to put a text_fr, etc, in the normal schema.xml or in a separate schema_intl.xml. The idea robert pitched on IRC was to create a new example solr-instance directory with a barebones solrconfig.xml file, and a schema.xml file that *only* demonstrated fields using various tricks for various languages. All the language specific stopword files would then live in this new instancedir. The idea being that people interested in non-english fields could find a recommended fieldtype declaration in this schema.xml file, and cut/paste it to their schema.xml (probably copied from the main example) The key here being that we don't want an entire clone of the example (all the numeric fields, and multiple request handler declarations, etc...) this will just show the syntax for declaring all the various languages that we can provide suggestions for. bq. As far as file format: I think we should also support the snowball stopword format. Agreed, but it's a trivially minor chicken/egg choice. Either we can setup a simple export and conversion to the format Solr currently supports now, and if/when someone updates StopFilterFactory to support the new format, then we can stop converting when we export; or we can modify StopFilter to support both formats first, and then setup the simple export w/o worrying about conversion. 
Frankly: If Robert's planning on doing the work either way, I'm happy to let him decide which approach makes the most sense. improve stopwords list handling --- Key: SOLR-1860 URL: https://issues.apache.org/jira/browse/SOLR-1860 Project: Solr Issue Type: Improvement Components: Schema and Analysis Affects Versions: 3.1 Reporter: Robert Muir Assignee: Robert Muir Priority: Minor Currently Solr makes it easy to use english stopwords for StopFilter or CommonGramsFilter. Recently in lucene, we added stopwords lists (mostly, but not all from snowball) to all the language analyzers. So it would be nice if a user can easily specify that they want to use a french stopword list, and use it for StopFilter or CommonGrams. The ones from snowball are, however, formatted in a different manner than the others (although in Lucene we have parsers to deal with this). Additionally, we abstract this from Lucene users by adding a static getDefaultStopSet to all analyzers. There are two approaches, the first one I think I prefer the most, but I'm not sure it matters as long as we have good examples (maybe a foreign language example schema?) 1. The user would specify something like: <filter class="solr.StopFilterFactory" fromAnalyzer="org.apache.lucene.analysis.FrenchAnalyzer" .../> This would just grab the CharArraySet from the FrenchAnalyzer's getDefaultStopSet method, who cares where it comes from or how it's loaded. 2. We add support for snowball-formatted stopwords lists, and the user could specify something like: <filter class="solr.StopFilterFactory" words="org/apache/lucene/analysis/snowball/french_stop.txt" format="snowball" ... /> The disadvantage to this is they have to know where the list is, what format it's in, etc. For example: snowball doesn't provide Romanian or Turkish stopword lists to go along with their stemmers, so we had to add our own. Let me know what you guys think, and I will create a patch.
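[Editor's note] The snowball stopword format referred to above is a plain word list in which `|` begins a comment and a single line may carry several words. A minimal sketch of a parser for that format, in plain Java (a hypothetical helper, not the actual Lucene/Solr parser):

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of a snowball-format stopword parser:
// "|" starts a comment, and a line may hold several whitespace-separated words.
public class SnowballStopwords {
    public static List<String> parse(String text) {
        List<String> words = new ArrayList<>();
        for (String line : text.split("\n")) {
            int bar = line.indexOf('|');              // strip trailing comment
            if (bar >= 0) line = line.substring(0, bar);
            for (String w : line.trim().split("\\s+")) {
                if (!w.isEmpty()) words.add(w);       // skip blank/comment-only lines
            }
        }
        return words;
    }

    public static void main(String[] args) {
        String sample = "au aux avec  | some French stopwords\n | a full-comment line\nce ces\n";
        System.out.println(parse(sample)); // [au, aux, avec, ce, ces]
    }
}
```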
[jira] Commented: (SOLR-1842) DataImportHandler ODBC keeps lock on the source table while optimisation is being run...
[ https://issues.apache.org/jira/browse/SOLR-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851945#action_12851945 ] Hoss Man commented on SOLR-1842: Yeah, i don't know anything about ODBC, but it seems odd that DIH wouldn't commit any transactions it opens to release the table locks. (unless this is something to do with auto generated transactions in the ODBC Connector) Marcin: it would be helpful if you could provide a specific example of a DIH config in which you see this problem (the simpler the better) ... perhaps you are using some feature of DIH in a way that is unexpected and that's why the table locks are living longer than they should. DataImportHandler ODBC keeps lock on the source table while optimisation is being run... -- Key: SOLR-1842 URL: https://issues.apache.org/jira/browse/SOLR-1842 Project: Solr Issue Type: Bug Components: contrib - DataImportHandler Affects Versions: 1.5 Reporter: Marcin Hi Guys, I don't know if it's really a bug but I think this is quite a good place for it. The problem is with dataImportHandler and DB queries. For example: Let's have a big table which keeps docs to be indexed; we run a query against it in dataImportHandler, and the query locks the table, which is quite obvious and desired behaviour from the SQL point of view. But while optimisation is being done it should not be possible to issue the query, because in that case the table stays locked until the optimisation process finishes, which can take time... As a workaround you can use a select SQL_BUFFER_RESULT... statement, which will move everything into a temp table and release all locks, but dataImportHandler will still be waiting for optimisation to finish. Which means you will at least be able to insert new docs into the main table. cheers
[jira] Commented: (SOLR-1848) Add example Query page to the example
[ https://issues.apache.org/jira/browse/SOLR-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851092#action_12851092 ] Hoss Man commented on SOLR-1848: I'm confused by yonik's comment... bq. What's the motivation for including them in the solr webapp? I agree, adding things to solr.war just for the purpose of the example/tutorial is a bad idea, but from what i can tell Grant's commit didn't do that -- it just added configuration so that people running java -jar start.jar had both the solr webapp running as well as a static webapp containing a form. if they copied the solr.war file, or the example/solr home, they wouldn't be affected at all. I suppose for people who copy the *entire* example directory there might be some unnecessary stuff -- but that's going to always be true (unless we get rid of exampledocs) Frankly though: the queries.html is so simple, i really don't understand why we wouldn't just expand the tutorial to include those links. Add example Query page to the example - Key: SOLR-1848 URL: https://issues.apache.org/jira/browse/SOLR-1848 Project: Solr Issue Type: Improvement Reporter: Grant Ingersoll Priority: Trivial I've wired up a static jetty context and hooked in a simple HTML page that shows off a bunch of the different types of queries people can do w/ the Example data. Browse to it at http://localhost:8983/example/queries.html Will commit shortly.
[jira] Commented: (SOLR-1848) Add example Query page to the example
[ https://issues.apache.org/jira/browse/SOLR-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851131#action_12851131 ] Hoss Man commented on SOLR-1848: Agreed: as we start adding sections, it will make a lot of sense to split the tutorial out into multiple pages: the (existing) intro page showing how easy it is to load data and do basic queries w/faceting and highlighting, a second page showing off spatial queries, a third page showing spell check (and maybe more like this), DIH should have a page, etc... With the possible exception of distributed search (where multiple ports need to be up and running) there's no reason all of these things can't be demoed from a single example. Add example Query page to the example - Key: SOLR-1848 URL: https://issues.apache.org/jira/browse/SOLR-1848 Project: Solr Issue Type: Improvement Reporter: Grant Ingersoll Priority: Trivial I've wired up a static jetty context and hooked in a simple HTML page that shows off a bunch of the different types of queries people can do w/ the Example data. Browse to it at http://localhost:8983/example/queries.html Will commit shortly.
[jira] Reopened: (SOLR-1672) RFE: facet reverse sort count
[ https://issues.apache.org/jira/browse/SOLR-1672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man reopened SOLR-1672: reopening ... not sure why it was marked resolved RFE: facet reverse sort count - Key: SOLR-1672 URL: https://issues.apache.org/jira/browse/SOLR-1672 Project: Solr Issue Type: Improvement Components: search Affects Versions: 1.4 Environment: Java, Solrj, http Reporter: Peter Sturge Priority: Minor Attachments: SOLR-1672.patch Original Estimate: 0h Remaining Estimate: 0h As suggested by Chris Hostetter, I have added an optional Comparator to the BoundedTreeSet<Long> in the UnInvertedField class. This optional comparator is used when a new (and also optional) field facet parameter called 'facet.sortorder' is set to the string 'dsc' (e.g. f.facetname.facet.sortorder=dsc for per field, or facet.sortorder=dsc for all facets). Note that this parameter has no effect if facet.method=enum. Any value other than 'dsc' (including no value) reverts the BoundedTreeSet to its default behaviour. This change affects 2 source files: UnInvertedField.java [line 438] The getCounts() method signature is modified to add the 'facetSortOrder' parameter value to the end of the argument list. DIFF UnInvertedField.java: - public NamedList getCounts(SolrIndexSearcher searcher, DocSet baseDocs, int offset, int limit, Integer mincount, boolean missing, String sort, String prefix) throws IOException { + public NamedList getCounts(SolrIndexSearcher searcher, DocSet baseDocs, int offset, int limit, Integer mincount, boolean missing, String sort, String prefix, String facetSortOrder) throws IOException { [line 556] The getCounts() method is modified to create an overridden BoundedTreeSet<Long>(int, Comparator) if the 'facetSortOrder' parameter equals 'dsc'. DIFF UnInvertedField.java: - final BoundedTreeSet<Long> queue = new BoundedTreeSet<Long>(maxsize); + final BoundedTreeSet<Long> queue = (sort.equals("count") || sort.equals("true")) ? (facetSortOrder.equals("dsc") ? new BoundedTreeSet<Long>(maxsize, new Comparator() { @Override public int compare(Object o1, Object o2) { if (o1 == null || o2 == null) return 0; int result = ((Long) o1).compareTo((Long) o2); return (result != 0 ? (result > 0 ? -1 : 1) : 0); //lowest number first sort }}) : new BoundedTreeSet<Long>(maxsize)) : null; SimpleFacets.java [line 221] A getFieldParam(field, "facet.sortorder", "asc"); is added to retrieve the new parameter, if present. 'asc' is used as a default value. DIFF SimpleFacets.java: + String facetSortOrder = params.getFieldParam(field, "facet.sortorder", "asc"); [line 253] The call to uif.getCounts() in the getTermCounts() method is modified to pass the 'facetSortOrder' value string. DIFF SimpleFacets.java: - counts = uif.getCounts(searcher, base, offset, limit, mincount, missing, sort, prefix); + counts = uif.getCounts(searcher, base, offset, limit, mincount, missing, sort, prefix, facetSortOrder); Implementation Notes: I have noted in testing that I was not able to retrieve any '0' counts as I had expected. I believe this could be because there appear to be some optimizations in SimpleFacets/count caching such that zero counts are not iterated (at least not by default) as a performance enhancement. I could be wrong about this, and zero counts may appear under some other as yet untested circumstances. Perhaps an expert familiar with this part of the code can clarify. In fact, this is not such a bad thing (at least for my requirements), as a whole bunch of zero counts is not necessarily useful (for my requirements, starting at '1' is just right). There may, however, be instances where someone *will* want zero counts - e.g. searching for zero product stock counts (e.g. 'what have we run out of'). I was envisioning the facet.mincount field being the preferred place to set where the 'lowest value' begins (e.g. 0 or 1 or possibly higher), but because of the caching/optimization, the behaviour is somewhat different than expected. 
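[Editor's note] The mechanism the patch describes — a bounded set whose Comparator decides which N entries survive — can be sketched in a few lines of plain Java. This toy stand-in (not Solr's actual BoundedTreeSet class) shows how flipping the comparator turns "top counts" into "lowest counts first":

```java
import java.util.Comparator;
import java.util.TreeSet;

// Toy sketch of the bounded-set idea: a TreeSet that keeps only the "best"
// maxSize entries under whatever Comparator it was constructed with.
public class BoundedSetDemo {
    public static void boundedAdd(TreeSet<Long> set, long value, int maxSize) {
        set.add(value);
        if (set.size() > maxSize) set.pollLast(); // evict the "worst" entry
    }

    public static void main(String[] args) {
        // Descending count (the default): reversed order, so the biggest survive.
        TreeSet<Long> top = new TreeSet<>(Comparator.reverseOrder());
        // Ascending count (the patch's 'dsc' mode): natural order, so the smallest survive.
        TreeSet<Long> bottom = new TreeSet<>();
        for (long c : new long[] {5, 1, 9, 3, 7}) {
            boundedAdd(top, c, 3);
            boundedAdd(bottom, c, 3);
        }
        System.out.println(top);    // [9, 7, 5]
        System.out.println(bottom); // [1, 3, 5]
    }
}
```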
[jira] Commented: (SOLR-1672) RFE: facet reverse sort count
[ https://issues.apache.org/jira/browse/SOLR-1672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12849755#action_12849755 ] Hoss Man commented on SOLR-1672: Some old notes on this patch that i just found on my laptop (presumably from the last time i was on a plane) ... * The existing patch is in a weird format that i couldn't apply * re-reading the patch, and comparing to the SimpleFacets and UnInvertedField source, i'm noticing that several code paths for facet counts aren't being accounted for * I think what we should do conceptually is refactor all of the code that looks at the existing FacetParams.FACET_SORT param (or any of the constant values for it) into a helper function that parses the new legal values we want to support and returns a Comparator, and then start passing that comparator around to the various strategies (termenum, fieldcache, uninverted) for collecting facet constraints, instead of just passing around the sort string value... ** true,count,count desc = a comparator that does descending count sort ** count asc = a comparator that does ascending count sort ** false,index,index asc = null (by returning a null comparator we would be signalling that no sorting or bounded collection is needed, terms can be processed in order) ** index desc = a comparator that does descending term sort (not requested in this Jira, but recently asked about on the mailing list) * The problem with that conceptual solution is that UnInvertedField doesn't maintain a BoundedTreeSet of CountPairs like all of the other code paths, it uses a single Long to encode both the count and the index of the term, so it would need some special logic. ** Side question: I wonder if that Long encoded format would work for the field cache based faceting as well? 
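[Editor's note] The helper Hoss sketches above — parse the legal facet.sort values and return a Comparator (or null for index order) — could look roughly like this in plain Java. CountPair here is a hypothetical stand-in for Solr's internal pair type:

```java
import java.util.Comparator;

// Sketch of the proposed helper: map facet.sort values to a Comparator over
// (term, count) pairs, or null when plain index order needs no sorting at all.
public class FacetSortHelper {
    public static class CountPair {
        public final String term;
        public final int count;
        public CountPair(String term, int count) { this.term = term; this.count = count; }
    }

    public static Comparator<CountPair> comparatorFor(String sort) {
        switch (sort) {
            case "true":
            case "count":
            case "count desc":
                return (a, b) -> Integer.compare(b.count, a.count); // descending count
            case "count asc":
                return (a, b) -> Integer.compare(a.count, b.count); // ascending count
            case "false":
            case "index":
            case "index asc":
                // null signals: no sorting or bounded collection needed,
                // terms can be processed in index order
                return null;
            case "index desc":
                return (a, b) -> b.term.compareTo(a.term);          // descending term
            default:
                throw new IllegalArgumentException("unknown facet.sort value: " + sort);
        }
    }
}
```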
[jira] Created: (SOLR-1846) Remove support for (broken) abortOnConfigurationError
Remove support for (broken) abortOnConfigurationError - Key: SOLR-1846 URL: https://issues.apache.org/jira/browse/SOLR-1846 Project: Solr Issue Type: Improvement Reporter: Hoss Man Setting abortOnConfigurationError==false has not worked for some time, and based on a POLL of existing users, no one seems to need/want it, so we should remove support for it completely to make error handling and reporting work more cleanly.
[jira] Updated: (SOLR-1846) Remove support for (broken) abortOnConfigurationError
[ https://issues.apache.org/jira/browse/SOLR-1846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-1846: --- Attachment: SOLR-1846.patch Attached patch should get us to a good point to tackle some of the related issues. It updates all code paths (unless i missed one) that put something into SolrConfig.severeErrors so that that code path also explicitly throws the corresponding exception. This seems to be working well and is a good base for building up better per-core error reporting in SolrDispatchFilter (because now all the exceptions can be propagated up to CoreContainer and tracked per core) As is, this patch breaks BadIndexSchemaTest ... and i'm not really sure what the 'right' fix is ... the test explicitly expects a bad schema.xml to be loaded properly, and then looks for 3 errors in SolrConfig.severeErrors -- errors are still added to severeErrors before getting thrown, but the test still errors out during setUp because the SolrCore can't be inited (because the IndexSchema doesn't finish initing) my best suggestion: split the test into three tests, each using a different config (one per type of error tested) and assert that we get an exception during setUp. Remove support for (broken) abortOnConfigurationError - Key: SOLR-1846 URL: https://issues.apache.org/jira/browse/SOLR-1846 Project: Solr Issue Type: Improvement Reporter: Hoss Man Attachments: SOLR-1846.patch Setting abortOnConfigurationError==false has not worked for some time, and based on a POLL of existing users, no one seems to need/want it, so we should remove support for it completely to make error handling and reporting work more cleanly.
[jira] Updated: (SOLR-1846) Remove support for (broken) abortOnConfigurationError
[ https://issues.apache.org/jira/browse/SOLR-1846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-1846: --- Description: Setting abortOnConfigurationError==false has not worked for some time, and based on a POLL of existing users, no one seems to need/want it, so we should remove support for it completely to make error handling and reporting work more cleanly. http://n3.nabble.com/POLL-Users-of-abortOnConfigurationError-tt484030.html#a484030 was: Setting abortOnConfigurationError==false has not worked for some time, and based on a POLL of existing users, no one seems to need/want it, so we should remove support for it completely to make error handling and reporting work more cleanly. Remove support for (broken) abortOnConfigurationError - Key: SOLR-1846 URL: https://issues.apache.org/jira/browse/SOLR-1846 Project: Solr Issue Type: Improvement Reporter: Hoss Man Attachments: SOLR-1846.patch Setting abortOnConfigurationError==false has not worked for some time, and based on a POLL of existing users, no one seems to need/want it, so we should remove support for it completely to make error handling and reporting work more cleanly. http://n3.nabble.com/POLL-Users-of-abortOnConfigurationError-tt484030.html#a484030
[jira] Resolved: (SOLR-1832) abortOnConfigurationError=false no longer works for most plugin types
[ https://issues.apache.org/jira/browse/SOLR-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-1832. Resolution: Won't Fix abortOnConfigurationError=false no longer works for most plugin types - Key: SOLR-1832 URL: https://issues.apache.org/jira/browse/SOLR-1832 Project: Solr Issue Type: Bug Affects Versions: 1.4 Reporter: Hoss Man In 1.4 setting the abortOnConfigurationError config option to false only affects RequestHandlers and schema related classes (ie: FieldType and Token*Factories). ValueSourceParsers, QParserPlugins, and ResponseWriters which fail to initialize properly in Solr 1.4 will cause the entire SolrCore to fail to initialize. This changed from previous versions: In Solr 1.3 a failure to init any of these types of plugins when abortOnConfigurationError=false would result in errors being logged on init, but the SolrCore itself would still work and only attempts to use those plugins would result in an error.
[jira] Commented: (SOLR-1834) Document level security
[ https://issues.apache.org/jira/browse/SOLR-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12848946#action_12848946 ] Hoss Man commented on SOLR-1834: Anders: I only had a few moments to skim your patch, but it seems like a very cool feature, thank you (and Findwise) for contributing this. One thing i noticed was that there didn't seem to be a lot of documentation (javadoc or otherwise) ... i see that the demo application you cited seems to have some good overview documentation on how all the pieces fit together, and what configuration should look like -- if your intention is that this documentation can also be used by Solr, would you mind attaching it to the Jira issue (as HTML, or in javadoc comments on the java files themselves) with the Grant ... Apache License ... box checked off so there's a clear audit log that the documentation can be reproduced within Solr? Document level security --- Key: SOLR-1834 URL: https://issues.apache.org/jira/browse/SOLR-1834 Project: Solr Issue Type: New Feature Components: SearchComponents - other Affects Versions: 1.4 Reporter: Anders Rask Attachments: SOLR-1834.patch Attached to this issue is a patch that includes a framework for enabling document level security in Solr as a search component. I did this as a Master thesis project at Findwise in Stockholm and Findwise has now decided to contribute it back to the community. The component was developed in spring 2009 and has been in use at a customer since autumn the same year. There is a simple demo application up at http://demo.findwise.se:8880/SolrSecurity/ which also explains more about the component and how to set it up.
[jira] Resolved: (SOLR-1823) XMLWriter throws ClassCastException on writing maps other than Map<String,?>
[ https://issues.apache.org/jira/browse/SOLR-1823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-1823. Resolution: Fixed Fix Version/s: 1.5 Assignee: Hoss Man Nice catch Frank. FWIW: the original intent was that any of those types of objects could be used as the *value* of a Map, not the key -- but that's still no excuse to just cast the key instead of using stringification (i could have sworn it was already doing that) The one subtlety that your patch broke however is that if someone uses null as a key in the Map, that has always been written out as an entry w/o a key -- but by using String.valueOf your patch always produces a non-null string value (ie: the 4 character string "null") so i modified your patch to just use toString() with an explicit null check. Committed revision 925031. XMLWriter throws ClassCastException on writing maps other than Map<String,?> - Key: SOLR-1823 URL: https://issues.apache.org/jira/browse/SOLR-1823 Project: Solr Issue Type: Improvement Components: documentation, Response Writers Reporter: Frank Wesemann Assignee: Hoss Man Fix For: 1.5 Attachments: SOLR-1823.patch http://lucene.apache.org/solr/api/org/apache/solr/response/SolrQueryResponse.html#returnable_data says that a Map containing any of the items in this list may be contained in a SolrQueryResponse and will be handled by QueryResponseWriters. This is not true for (at least) Keys in Maps. XMLWriter tries to cast keys to Strings.
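[Editor's note] The subtlety Hoss describes can be shown in a few lines of plain Java (a standalone illustration, not the actual XMLWriter code): String.valueOf turns a null key into the literal four-character string "null", while an explicit null check preserves the null so the entry can still be written without a key.

```java
// String.valueOf(null) yields the String "null"; an explicit
// null check keeps null distinguishable from the word "null".
public class NullKeyDemo {
    public static String valueOfStyle(Object key)  { return String.valueOf(key); }
    public static String toStringStyle(Object key) { return key == null ? null : key.toString(); }

    public static void main(String[] args) {
        System.out.println(valueOfStyle(null));          // the 4-character String "null"
        System.out.println(valueOfStyle(null).length()); // 4
        System.out.println(toStringStyle(null) == null); // true -- null survives
    }
}
```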
[jira] Commented: (SOLR-1824) partial field types created on error
[ https://issues.apache.org/jira/browse/SOLR-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12847166#action_12847166 ] Hoss Man commented on SOLR-1824: Scratch that -- I get it now: * IndexSchema uses anonymous subclasses of AbstractPluginLoader to instantiate a variety of different things * AbstractPluginLoader processes things in a loop, recording errors in SolrConfig.severeErrors when a particular instance can't be inited, but creating the rest of the objects just fine. * when abortOnConfigurationError=false this results in solr using a schema with missing filters (or missing fields, etc...) .. the only thing that protects people when abortOnConfigurationError=true is that SolrDispatchFilter pays attention to both abortOnConfigurationError and SolrConfig.severeErrors (someone using embedded Solr might never notice the error at all, even if the config did say abortOnConfigurationError=true) partial field types created on error Key: SOLR-1824 URL: https://issues.apache.org/jira/browse/SOLR-1824 Project: Solr Issue Type: Bug Affects Versions: 1.1.0 Reporter: Yonik Seeley Priority: Minor When abortOnConfigurationError=false, and there is a typo in one of the filters in a chain, the field type is still created by omitting that particular filter. This is particularly dangerous since it will result in incorrect indexing. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (SOLR-1832) abortOnConfigurationError=false no longer works for most plugin types
abortOnConfigurationError=false no longer works for most plugin types - Key: SOLR-1832 URL: https://issues.apache.org/jira/browse/SOLR-1832 Project: Solr Issue Type: Bug Affects Versions: 1.4 Reporter: Hoss Man In 1.4, setting the abortOnConfigurationError config option to false only affects RequestHandlers and schema related classes (ie: FieldType and Token*Factories). ValueSourceParsers, QParserPlugins, and ResponseWriters which fail to initialize properly in Solr 1.4 will cause the entire SolrCore to fail to initialize. This changed from previous versions: in Solr 1.3 a failure to init any of these types of plugins when abortOnConfigurationError=false would result in errors being logged on init, but the SolrCore itself would still work and only attempts to use those plugins would result in an error. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1832) abortOnConfigurationError=false no longer works for most plugin types
[ https://issues.apache.org/jira/browse/SOLR-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12847170#action_12847170 ] Hoss Man commented on SOLR-1832: This seems to be a result of switching away from the (Map|NamedList)PluginLoader classes when the PluginInfo API was added. The PluginLoaders would loop over multiple plugins, recording any errors in SolrConfig.severeErrors but then proceeding -- SolrCore.initPlugins, on the other hand, fails fast. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
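The contrast described in that comment can be sketched in a few lines (hypothetical names, not Solr's actual loader code): the old loaders collected each plugin's init failure in an error list and kept loading the rest, while a fail-fast loop aborts the whole core on the first bad plugin.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

public class PluginInit {
    // Lenient style: record each init failure, but keep loading the rest,
    // roughly what the old *PluginLoader classes did with severeErrors.
    static List<Throwable> initLenient(List<Callable<Object>> plugins) {
        List<Throwable> severeErrors = new ArrayList<>();
        for (Callable<Object> p : plugins) {
            try {
                p.call(); // "initialize" the plugin
            } catch (Exception e) {
                severeErrors.add(e); // remember the failure, move on
            }
        }
        return severeErrors;
    }

    // Fail-fast style: the first bad plugin aborts initialization entirely,
    // which is the SolrCore.initPlugins behavior described above.
    static void initFailFast(List<Callable<Object>> plugins) throws Exception {
        for (Callable<Object> p : plugins) {
            p.call();
        }
    }
}
```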
[jira] Commented: (SOLR-1817) Fix Solr error reporting to work correctly with multicore
[ https://issues.apache.org/jira/browse/SOLR-1817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12847174#action_12847174 ] Hoss Man commented on SOLR-1817: I started looking a little more closely at the singleton SolrConfig.severeErrors, since eliminating usage of it is really the key to being able to support abortOnConfigurationError=true/false for multiple cores independently. The more I looked at it, the more this whole thing seems futile. For starters: I discovered SOLR-1832 ... in a nutshell, a failure to init most types of plugins in 1.4 causes SolrCore to fail to init, regardless of whether abortOnConfigurationError=false. The only types of plugins where initialization failures are logged but the remaining instances are loaded anyway are RequestHandlers and schema related classes (ie: FieldType and Token*Factories) ... but as noted in SOLR-1824, it's actually a really, really, REALLY bad thing for IndexSchema to ignore when a FieldType or analysis factory can't be initialized, because it could result in incorrect values getting indexed. So we could: * fix SOLR-1824 so that any init error in IndexSchema causes a hard fail. * fix SOLR-1832 so that SolrCore.initPlugins skips any instances that failed to init and records the exceptions directly with the SolrCore. * officially deprecate the *PluginLoader classes, and remove the spots where they add to SolrConfig.severeErrors ... so then we wouldn't have anyone writing to SolrConfig.severeErrors anymore. But should we even bother? I'm starting to think the whole idea of abortOnConfigurationError is a bad idea ... especially if no one noticed SOLR-1832 before now. Maybe we should just kill the whole concept, and have SolrCore initialization fail fast and propagate any type of Exception up to CoreContainer?
Fix Solr error reporting to work correctly with multicore - Key: SOLR-1817 URL: https://issues.apache.org/jira/browse/SOLR-1817 Project: Solr Issue Type: Bug Affects Versions: 1.4 Reporter: Mark Miller Priority: Minor Fix For: 1.5 Attachments: SOLR-1817.patch, SOLR-1817.patch, SOLR-1817.patch Here is a rough patch that attempts to fix how error reporting works with multi-core (not in terms of logs, but what you see on an http request). The patch is not done - there is more to consider, and I haven't worked out how this changes solrconfig's abortOnConfigurationError, but the basics are here. If you attempt to access the path of a core that could not load, you are shown the errors that kept the core from properly loading. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1817) Fix Solr error reporting to work correctly with multicore
[ https://issues.apache.org/jira/browse/SOLR-1817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12847177#action_12847177 ] Hoss Man commented on SOLR-1817: Tangential comment... If we *do* decide that it's worth keeping abortOnConfigurationError, then my earlier suggestion of how it should work was overly complicated... {quote} a) SolrCore should itself maintain a list of Severe Initialization Exceptions that it was able to get past when initializing itself, specifically: when a plugin could not be initialized, and it is therefore ignoring that plugin declaration. b) SolrCore should expose an easy way of asking it for its list of initialization exceptions c) SolrCore should pay attention to whether its solrconfig.xml file indicates if the core should be usable if there were severe initialization exceptions. d) SolrCore should refuse to execute any requests if (a) contains Exceptions and (c) is true {quote} There's really no reason for SolrCore to maintain/expose a special list of Exceptions and fail to execute if solrconfig says it should. Instead: SolrCore can maintain a list of Exceptions during its initialization, and then if solrconfig.xml says abortOnConfigurationError=true, the last line of the SolrCore constructor can check if the list is empty, and throw a nice fat SolrException wrapping that List if it's not, which CoreContainer can keep track of (just like any solrconfig.xml or schema.xml parse exceptions that it might encounter before that.)
(this would change the behavior of new SolrCore when abortOnConfigurationError=true for embedded users constructing it themselves -- but frankly I think it changes it in a good way) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (SOLR-1743) error reporting is rendering 404 missing core name in path for all type of errors
[ https://issues.apache.org/jira/browse/SOLR-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-1743. Resolution: Fixed Committed revision 923909. NOTE: since this bug was introduced after 1.4, and since I expect it to get superseded by SOLR-1817 prior to the next release, I didn't bother with a CHANGES.txt entry error reporting is rendering 404 missing core name in path for all type of errors --- Key: SOLR-1743 URL: https://issues.apache.org/jira/browse/SOLR-1743 Project: Solr Issue Type: Bug Components: Build Environment: all Reporter: Marcin Assignee: Mark Miller Fix For: 1.5 Attachments: SOLR-1743.patch, SOLR-1743.patch, SOLR-1743.patch, SOLR-1743.patch, SOLR-1743.patch, SOLR-1743.restore14behavior.patch despite the error in schema syntax or any other type of error you will always get: 404 missing core name in path communicate. cheers, /Marcin -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1817) Fix Solr error reporting to work correctly with multicore
[ https://issues.apache.org/jira/browse/SOLR-1817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12846061#action_12846061 ] Hoss Man commented on SOLR-1817: Hey mark: I've only had a chance to skim your patch so far, and I'm still not sure if I have jury duty today, so I don't know if I'll have any time to really test it out this afternoon, but here are my quick impressions (mixed with my thoughts on how to do this before I saw your patch): 1) fundamentally we have two different kinds of initialization exceptions -- the ones SolrCore can deal with and keep going, and the ones that are complete show stoppers. Regardless of what the abortOnServerConfError configuration looks like, it seems like these exceptions should be tracked separately. We should let SolrCore catch and keep track of any exceptions that it can ignore while still providing functionality; but if anything it can't deal with occurs it should just throw it and then let the caller (ie: CoreContainer) keep track of it. That way SolrCore (and the errors it's tracking) are still usable by embedded users who may not even be using CoreContainer (I think there's an NPE possibility there in your current patch ... if people construct a SolrCore without a CoreDescriptor) 2) It looks like you still have SolrDispatchFilter looking at SolrConfig.severeErrors. It seems like the logic there should be something like...
{code} SolrCore core = coreContainer.getCoreByName(corepath); if (null == core) { Throwable err = coreContainer.getCoreInitError(corepath); if (null != err) { write_init_errors_for_core(corepath, err); } } if (core.abortOnConfigError() && 0 < core.getSevereErrors().size()) { write_init_errors_for_core(corepath, core.getSevereErrors()); } {code} 3) we should think about how the no-arg behavior of the CoreAdminHandler should deal with reporting about cores that couldn't be initialized -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1817) Fix Solr error reporting to work correctly with multicore
[ https://issues.apache.org/jira/browse/SOLR-1817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12846217#action_12846217 ] Hoss Man commented on SOLR-1817: Some more comments now that I've read things a little more in depth... * I should have read your comments more carefully, you already noted the remaining usages of SolrConfig.severeErrors * two of the places you removed adds to SolrConfig.severeErrors are in IndexSchema, where exceptions are logged, but not thrown (so your new code never sees them). ** Personally I think this is fine, because I don't think those two situations really fit the definition of a severe init error as it was designed (ie: a plugin like a request handler which might not be used in many situations can't be initialized). ** I think errors in IndexSchema init should either be fatal (ie: thrown by the constructor and prevent the core from ever working) or just logged as being bad news ** FWIW, the two code places I'm talking about are... *** when a field is declared twice *** when a dynamicfield is declared twice * why is admin suddenly a magic alias in SolrDispatchFilter? (line 196) * the big comment about servlet container behavior if you throw an error during init doesn't make sense where you copied it (in doFilter, line 292) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1817) Fix Solr error reporting to work correctly with multicore
[ https://issues.apache.org/jira/browse/SOLR-1817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12846227#action_12846227 ] Hoss Man commented on SOLR-1817: bq. this is part of the big open issue I think is left here - how to properly deal with abortOnServerConfError. Here's what I think makes the most sense in a multi-core world, and is the most in the spirit of what that option was meant to do when it was added for single cores. a) SolrCore should itself maintain a list of Severe Initialization Exceptions that it was able to get past when initializing itself, specifically: when a plugin could not be initialized, and it is therefore ignoring that plugin declaration. b) SolrCore should expose an easy way of asking it for its list of initialization exceptions c) SolrCore should pay attention to whether its solrconfig.xml file indicates if the core should be usable if there were severe initialization exceptions. d) SolrCore should refuse to execute any requests if (a) contains Exceptions and (c) is true e) SolrCore should throw any exceptions it can't get past f) CoreContainer should keep track of which core names completely failed to initialize, and what exception was encountered while trying (ie: Map&lt;SolrCore,Throwable&gt; ... no List needed). This should be the first exception involved -- even if it came from trying to instantiate the IndexSchema, or parse the solrconfig.xml file before it ever got to the SolrCore. CoreContainer shouldn't know/care about (a) or (c) g) CoreContainer should provide an easy way to query for (f) by core name h) If SolrDispatchFilter asks CoreContainer for a corename, and no SolrCore is found with that name, it should then use (g) to generate an error message i) SolrDispatchFilter shouldn't know/care about (a) or (c) ...
it should just ask SolrCore to execute a request, and SolrCore should fail as needed based on its settings (this will potentially allow things like SOLR-141 to work even with init errors, as long as the ResponseWriter was initialized successfully) j) SolrConfig.severeErrors should be deprecated, but for back-compat SolrCore and CoreContainer can add to it whenever they add an exception to their own internal state. k) CoreContainer.Initializer.*AbortOnConfigurationError should be deprecated, but can still continue to provide the same behavior they do on the trunk (ie: influence the default value for each core prior to init, and return true if any of the cores have a value of true for that property after init) l) we could conceivably make solr.xml have its own abortOnConfigError type property, but frankly I think if there are *any* errors in solr.xml, that should just be a stop-the-world type situation, where CoreContainer.Initializer.initialize() just throws a big fat error and CoreContainer deals with it ... I can't think of any good that could possibly come from letting solr proceed when it encounters an error in solr.xml -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
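Points (f) and (g) above can be sketched as follows (all names hypothetical; this is not Solr's actual CoreContainer API): the container remembers only the first exception per core name, so the dispatch filter can later report why a core is missing.

```java
import java.util.HashMap;
import java.util.Map;

public class CoreInitErrors {
    // Per-core-name record of the FIRST exception seen during init,
    // whether it came from solrconfig.xml parsing, IndexSchema, or SolrCore.
    private final Map<String, Throwable> initFailures = new HashMap<>();

    void recordInitFailure(String coreName, Throwable t) {
        // putIfAbsent keeps the first exception and ignores later ones
        initFailures.putIfAbsent(coreName, t);
    }

    // What a dispatch filter would call when no core answers to this name.
    Throwable getCoreInitError(String coreName) {
        return initFailures.get(coreName);
    }
}
```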
[jira] Commented: (SOLR-1817) Fix Solr error reporting to work correctly with multicore
[ https://issues.apache.org/jira/browse/SOLR-1817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12846230#action_12846230 ] Hoss Man commented on SOLR-1817: Ugh... lots of cross talk, sorry, still processing some of your earlier comments... {quote} That check is actually just there so that if you ask for solr/admin, you will end up getting the core - so it makes sense to only allow it in if the corename is admin anyway. Though I have never really liked that logic where it looks for the admin core and when it can't find it it drops to the core. {quote} Hmmm... so then this is special behavior needed to make the admin/*.jsp type URLs work with the default core? then why was this check only added as part of your patch for this issue? how do the admin JSPs work on the current trunk? In either case: why do we need to specifically test for admin? shouldn't that code path just fall through regardless of the patch? ie: {code} if (core == null && errors == null) { corename = ""; core = cores.getCore(corename); } {code} ? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1824) partial field types created on error
[ https://issues.apache.org/jira/browse/SOLR-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12846235#action_12846235 ] Hoss Man commented on SOLR-1824: Can someone point me to a method name and/or line number? ... I'm not following what exactly is the current bug. (particularly with regards to abortOnConfigurationError=false ... nothing in IndexSchema has ever looked at that config option, so if it has any problem initing a field/fieldtype it should be throwing an exception and completely failing to initialize -- so I don't see how the problem could be any better/worse depending on the value of that option) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1817) Fix Solr error reporting to work correctly with multicore
[ https://issues.apache.org/jira/browse/SOLR-1817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12846237#action_12846237 ] Hoss Man commented on SOLR-1817: bq. That is the issue where that possible NPE is - getting access to the core name. Just to be clear: it's not just the core name -- you've got code that assumes SolrCore.getCoreDescriptor() will always be non null, but that's not always going to be true. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1803) ExtractingRequestHandler does not propagate multiple values to a multi-valued field
[ https://issues.apache.org/jira/browse/SOLR-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12845691#action_12845691 ] Hoss Man commented on SOLR-1803: Lance: I agree that the current semantics are either poorly defined, or not very useful, but your suggestion seems like it overlooks what are probably the two most common cases: * to have literal values that overwrite/replace extracted values * to have literal values that act as defaults unless extracted values are found ...those seem like they should both be possible for single and multivalued fields ExtractingRequestHandler does not propagate multiple values to a multi-valued field --- Key: SOLR-1803 URL: https://issues.apache.org/jira/browse/SOLR-1803 Project: Solr Issue Type: Bug Components: contrib - Solr Cell (Tika extraction) Reporter: Lance Norskog Priority: Minor Attachments: display-extracting-bug.patch When multiple values for one field are extracted from a document, only the last value is stored in the document. If one or more values are given as parameters, those values are all stored. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1743) error reporting is rendering 404 missing core name in path for all type of errors
[ https://issues.apache.org/jira/browse/SOLR-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-1743: --- Attachment: SOLR-1743.restore14behavior.patch Ok, I've been doing some more testing... First off: a lot of my early comments on this issue were inaccurate -- in some cases I was trying to test the behavior of trunk using a single core example with some errors in the solrconfig.xml, but I was using the example/solr dir on the trunk, and I completely forgot that it has a solr.xml file in it now. From what I can tell, the only real difference between the behavior of trunk and the behavior of Solr 1.4 is that: in 1.4, when using legacy single core mode (ie: no solr.xml), you would get good error messages if a low-level error happened that completely prevented the core from loading (ie: schema init problem, or xml parsing problem with solrconfig.xml). This is because the default behavior of abortOnConfigurationError was true for legacy single core mode, and that boolean drives SolrDispatchFilter's decision about what type of error message to display. The latest attached patch (SOLR-1743.restore14behavior.patch) should get us back to the error reporting behavior of Solr 1.4 -- I think we should go ahead and commit this to the trunk as a temporary fix for the current bug, while we flesh out improvements to the entire concept of abortOnConfigurationError in another issue. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1815) SolrJ doesn't preserve the order of facet queries returned from solr
[ https://issues.apache.org/jira/browse/SOLR-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-1815: --- Description: Using Solrj, I wanted to sort the response of a range query based on some specific labels. For instance, using the query: {noformat} facet=true facet.query={!key="Less than 100"}[* TO 99] facet.query={!key="100 - 200"}[100 TO 200] facet.query={!key="200 +"}[201 TO *] {noformat} I wanted to display the response in the following order: {noformat} Less than 100 (x) 100 - 200 (y) 201 + (z) {noformat} independently of the values of x, y, z, which are the numbers of the retrieved documents for each range. While Solr itself correctly produces the desired order (as specified in my query), SolrJ doesn't preserve it. RE: Yonik, a solution could be just to change {code} _facetQuery = new HashMap&lt;String, Integer&gt;(); ...to... _facetQuery = new LinkedHashMap&lt;String, Integer&gt;(); {code} was: Using Solrj, I wanted to sort the response of a range query based on some specific labels. For instance, using the query: facet=true facet.query={!key="Less than 100"}[* TO 99] facet.query={!key="100 - 200"}[100 TO 200] facet.query={!key="200 +"}[201 TO *] I wanted to display the response in the following order: Less than 100 (x) 100 - 200 (y) 201 + (z) independently of the values of x, y, z, which are the numbers of the retrieved documents for each range. While Solr itself correctly produces the desired order (as specified in my query), SolrJ doesn't preserve it.
RE: Yonik, a solution could be just to change _facetQuery = new HashMap&lt;String, Integer&gt;(); to _facetQuery = new LinkedHashMap&lt;String, Integer&gt;(); Issue Type: Bug (was: Improvement) Summary: SolrJ doesn't preserve the order of facet queries returned from solr (was: Sorting range queries: SolrJ doesn't preserve the order produced by Solr) revising summary to clarify the problem, reclassifying as a bug, reformatting description to include noformat/code tags so it doesn't try to render emoticons. SolrJ doesn't preserve the order of facet queries returned from solr Key: SOLR-1815 URL: https://issues.apache.org/jira/browse/SOLR-1815 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 1.4 Reporter: Steve Radhouani Original Estimate: 24h Remaining Estimate: 24h -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
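The suggested one-line fix works because LinkedHashMap iterates its keys in insertion order, while HashMap iterates in hash order. A small self-contained demonstration (labels borrowed from the report; the method name is made up for illustration):

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class FacetOrder {
    // Insert the facet-query labels in the order the response delivers them,
    // then return the map's iteration order so the map choice is visible.
    static String labelsInIterationOrder(Map<String, Integer> map) {
        map.put("Less than 100", 5);
        map.put("100 - 200", 3);
        map.put("200 +", 7);
        return String.join(", ", map.keySet());
    }

    public static void main(String[] args) {
        // LinkedHashMap: insertion order, matching the response order.
        System.out.println(labelsInIterationOrder(new LinkedHashMap<>()));
        // HashMap: hash order, which may scramble the labels.
        System.out.println(labelsInIterationOrder(new HashMap<>()));
    }
}
```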
[jira] Resolved: (SOLR-1808) When IndexReader.reopen is called, old reader is not properly closed
[ https://issues.apache.org/jira/browse/SOLR-1808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-1808. Resolution: Not A Problem As mark said: IndexReaders are refcounted (regardless of whether they come from open or reopen) so that they aren't closed until they are no longer in use. I'm not seeing any evidence of a bug here, please reopen if you can point to a concrete example of where an IndexReader is being leaked. When IndexReader.reopen is called, old reader is not properly closed Key: SOLR-1808 URL: https://issues.apache.org/jira/browse/SOLR-1808 Project: Solr Issue Type: Bug Components: search Affects Versions: 1.4 Reporter: John Wang According to Lucene documentation: If the index has not changed since this instance was (re)opened, then this call is a NOOP and returns this instance. Otherwise, a new instance is returned. The old instance is not closed and remains usable. In SolrCore.java: if (newestSearcher != null && solrConfig.reopenReaders && indexDirFile.equals(newIndexDirFile)) { IndexReader currentReader = newestSearcher.get().getReader(); IndexReader newReader = currentReader.reopen(); if (newReader == currentReader) { currentReader.incRef(); } tmp = new SolrIndexSearcher(this, schema, "main", newReader, true, true); } When currentReader != newReader, currentReader seems to be leaking. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
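The refcounting contract behind the resolution can be illustrated with a toy class (this is only a sketch of the pattern, not Lucene's IndexReader implementation): each holder incRef()s while using the reader and decRef()s when done, and the reader is actually closed only when the count reaches zero, so an old reader handed out by reopen() is not leaked.

```java
public class RefCounted {
    private int refCount = 1; // the opener holds the first reference
    private boolean closed = false;

    synchronized void incRef() {
        refCount++;
    }

    synchronized void decRef() {
        // Only the LAST release actually closes the resource.
        if (--refCount == 0) {
            closed = true; // free underlying resources here
        }
    }

    synchronized boolean isClosed() {
        return closed;
    }
}
```

With this contract, a searcher that incRef()ed the old reader keeps it usable until it decRef()s, even after reopen() returned a new instance.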
[jira] Commented: (SOLR-1807) UpdateHandler plugin is not fully supported
[ https://issues.apache.org/jira/browse/SOLR-1807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12843356#action_12843356 ] Hoss Man commented on SOLR-1807: Not all plugin APIs are created equal ... some like TokenizerFactories are designed to be extended by lots of people, others weren't particularly well thought out abstractions in the first place, and your mileage may vary when implementing them -- feel free to post doc patches/suggestions to help make this more clear. as to the specific problem... Even if UpdateHandler had been an abstract class, at best we could have added a version of {{forceOpenWriter()}} that just threw an UnsupportedOperationException -- there's no default impl we could have provided that would have worked for any possible UpdateHandler subclass people might have written. The best conceivable solution we probably could have come up with at the time would be to introduce a marker interface that UpdateHandlers could optionally implement containing the APIs needed to support replication, and make the ReplicationHandler test the registered UpdateHandler on startup to see if it implements that API, and if not then throw an error. This type of solution could still be implemented today, in place of the instanceof DirectUpdateHandler2 check ... particularly now that the code has been vetted a little bit by users and we have a pretty good idea of what type of functionality an UpdateHandler needs to support in order to play nice with ReplicationHandler. 
UpdateHandler plugin is not fully supported --- Key: SOLR-1807 URL: https://issues.apache.org/jira/browse/SOLR-1807 Project: Solr Issue Type: Bug Components: update Affects Versions: 1.4 Reporter: John Wang UpdateHandler is published as a supported Plugin, but code such as the following: {code} if (core.getUpdateHandler() instanceof DirectUpdateHandler2) { ((DirectUpdateHandler2) core.getUpdateHandler()).forceOpenWriter(); } else { LOG.warn("The update handler being used is not an instance or sub-class of DirectUpdateHandler2. " + "Replicate on Startup cannot work."); } {code} suggests that it is really not fully supported. Must all implementations of UpdateHandler be subclasses of DirectUpdateHandler2 for it to work with replication? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
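The marker-interface approach Hoss sketches could look roughly like this (the interface and class names below are hypothetical, not actual Solr APIs): handlers opt in to replication support by implementing a capability interface, and the replication side tests for that capability at startup instead of hard-coding a concrete class.

```java
// Hypothetical capability interface; the names here are illustrative
// stand-ins, not real Solr classes.
interface ReplicationCapable {
    void forceOpenWriter();
}

class UpdateHandlerStub {}

// A custom handler opts in by implementing the interface.
class CustomUpdateHandler extends UpdateHandlerStub implements ReplicationCapable {
    public void forceOpenWriter() {
        System.out.println("writer opened");
    }
}

public class MarkerInterfaceDemo {
    // ReplicationHandler would test the capability at startup instead of
    // the current "instanceof DirectUpdateHandler2" check.
    static void checkAtStartup(UpdateHandlerStub handler) {
        if (handler instanceof ReplicationCapable) {
            ((ReplicationCapable) handler).forceOpenWriter();
        } else {
            throw new IllegalStateException(
                "update handler does not support replication");
        }
    }

    public static void main(String[] args) {
        checkAtStartup(new CustomUpdateHandler());  // prints "writer opened"
    }
}
```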
[jira] Commented: (SOLR-1743) error reporting is rendering 404 missing core name in path for all type of errors
[ https://issues.apache.org/jira/browse/SOLR-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12843373#action_12843373 ] Hoss Man commented on SOLR-1743: bq. Okay, I sat down and thought about what we should do before really reading through your suggestion - and I came up with practically the exact same thing - so I think this is what we should attempt. I know i brought it up here in the issue comments, but I think we should probably track this type of change in a separate issue as an Improvement. For the scope of this issue, let's start by getting a simpler patch committed that at least restores the behavior from 1.4 -- w/solr.xml you always get missing core name on config error, w/o solr.xml you get good error messages even if solrconfig.xml can't be parsed. It won't help new users who start with the current example from the trunk (since it has a solr.xml) but it will get things back to where they were for existing users who try upgrading. As i recall one of the patches already posted does this just fine (i just can't remember which one) so that part should be fairly straightforward. error reporting is rendering 404 missing core name in path for all type of errors --- Key: SOLR-1743 URL: https://issues.apache.org/jira/browse/SOLR-1743 Project: Solr Issue Type: Bug Components: Build Environment: all Reporter: Marcin Assignee: Mark Miller Fix For: 1.5 Attachments: SOLR-1743.patch, SOLR-1743.patch, SOLR-1743.patch, SOLR-1743.patch, SOLR-1743.patch despite the error in schema syntax or any other type of error you will always get: 404 missing core name in path communicate. cheers, /Marcin -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (SOLR-1805) Possible perf improvement in UnInvertedField by synchronizing on CreationPlaceholder
Possible perf improvement in UnInvertedField by synchronizing on CreationPlaceholder - Key: SOLR-1805 URL: https://issues.apache.org/jira/browse/SOLR-1805 Project: Solr Issue Type: Improvement Components: search Reporter: Hoss Man UnInvertedField.getUnInvertedField could probably see some performance improvements in the creation of new UnInvertedField instances if it started synchronizing on a CreationPlaceholder object akin to what FieldCacheImpl does... http://old.nabble.com/Why-synchronized-access-to-FieldValueCache-in-getUninvertedField.java-to27672399.html -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
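The FieldCacheImpl-style pattern referenced here inserts a placeholder under a short global lock, then computes the expensive value while synchronizing only on that placeholder, so creation of one entry doesn't block access to others. A rough sketch of the idea (simplified; not the actual Solr/Lucene code):

```java
import java.util.HashMap;
import java.util.Map;

// Simplified sketch of the CreationPlaceholder pattern used by Lucene's
// FieldCacheImpl: a per-key creation lock instead of one global lock
// held for the whole (expensive) uninversion.
public class PlaceholderCache {
    static class CreationPlaceholder {
        Object value;
    }

    private final Map<String, Object> cache = new HashMap<>();

    public Object get(String key) {
        Object entry;
        synchronized (cache) {               // short global lock
            entry = cache.get(key);
            if (entry == null) {
                entry = new CreationPlaceholder();
                cache.put(key, entry);
            }
        }
        if (entry instanceof CreationPlaceholder) {
            CreationPlaceholder ph = (CreationPlaceholder) entry;
            synchronized (ph) {              // per-key lock during creation
                if (ph.value == null) {
                    ph.value = expensiveCreate(key);
                    synchronized (cache) {
                        cache.put(key, ph.value);  // replace the placeholder
                    }
                }
                return ph.value;
            }
        }
        return entry;                        // already created
    }

    private Object expensiveCreate(String key) {
        return "uninverted:" + key;          // stand-in for the real work
    }

    public static void main(String[] args) {
        PlaceholderCache c = new PlaceholderCache();
        System.out.println(c.get("price"));  // prints uninverted:price
    }
}
```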
[jira] Commented: (SOLR-1772) UpdateProcessor to prune empty values
[ https://issues.apache.org/jira/browse/SOLR-1772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12839812#action_12839812 ] Hoss Man commented on SOLR-1772: bq. I'd almost rather see the default behavior changed rather than to put another configurable component in the chain that would slow things down (slightly) for everyone. That seems backwards -- if FieldType(s) start checking for the empty string, that's a few extra cycles of cost that everyone spends even if their indexing clients are already well behaved and only send real values. Adding it as an optional UpdateProcessor makes it something that only people who need hand holding have to spend cycles on. bq. ... confused that the empty string was being indexed at all, for fields that aren't even numbers. They thought this was equivalent to not sending it any value. I haven't verified this first hand but I believe it. Nope: there are many use cases for both strings and numbers where you may need to skip a value in a multiValued field -- parallel arrays and such. ... it's actually one of the main situations where IntField still comes in handy (besides just supporting completely legacy Lucene indexes) UpdateProcessor to prune empty values --- Key: SOLR-1772 URL: https://issues.apache.org/jira/browse/SOLR-1772 Project: Solr Issue Type: Wish Reporter: Hoss Man Users seem to frequently get confused when some FieldTypes (typically the numeric ones) complain about invalid field values when they inadvertently index an empty string. It would be cool to provide an UpdateProcessor that makes it easy to strip out any fields being added as empty values ... it could be configured using field (and/or field type) names or globs to select/ignore certain fields -- i haven't thought it through all that hard -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
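The pruning step wished for here boils down to dropping field values that are empty strings before the document reaches the FieldTypes; a standalone sketch of that core logic (plain collections standing in for a document, not the actual UpdateProcessor API):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

public class EmptyValuePruner {
    // Remove values that are empty (or whitespace-only) strings; drop a
    // field entirely when no values remain.
    static void prune(Map<String, List<Object>> doc) {
        Iterator<Map.Entry<String, List<Object>>> fields = doc.entrySet().iterator();
        while (fields.hasNext()) {
            List<Object> values = fields.next().getValue();
            values.removeIf(v -> v instanceof String && ((String) v).trim().isEmpty());
            if (values.isEmpty()) {
                fields.remove();
            }
        }
    }

    public static void main(String[] args) {
        Map<String, List<Object>> doc = new HashMap<>();
        // numeric field types reject "" -- the exact confusion described above
        doc.put("price", new ArrayList<>(List.of("")));
        doc.put("name", new ArrayList<>(List.of("widget", "")));
        prune(doc);
        System.out.println(doc);  // prints {name=[widget]}
    }
}
```

Note the trade-off Hoss raises: unconditional pruning like this would break multiValued parallel-array use cases, which is why configuring it per field name or glob matters.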
[jira] Updated: (SOLR-1553) extended dismax query parser
[ https://issues.apache.org/jira/browse/SOLR-1553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-1553: --- Attachment: edismax.unescapedcolon.bug.test.patch On the train this past weekend i started trying to tackle the issue of making support for field based queries (ie: fieldA:valueB) configurable so that it could be turned on/off for certain fields (or left off completely for back-compat with dismax) Based on yonik's description of edismax, and my initial reading of the code (particularly the use of clause.field and getFieldName in ExtendedDismaxQParser) i was under the impression that if a clause consisting of FOO:BAR was encountered, and FOO was not a known field, that the clause would be treated as a literal, and the colon would be escaped before passing it on to ExtendedSolrQueryParser ... essentially that FOO:BAR and FOO\:BAR would be equivalent if FOO is not the name of a real field according to the IndexSchema. For reasons I don't fully understand yet, this isn't the case -- as the attached test shows, the queries are parsed differently, and (evidently) FOO:BAR is parsed as an empty query if FOO is not a real field. Before I try digging into this too much, I wanted to sanity check: * is this expected? ... was this done intentionally? * is this desired? ... is this logical default behavior to have if the field isn't defined? should we have tests to assert this before i start adding more config options to change the behavior? extended dismax query parser Key: SOLR-1553 URL: https://issues.apache.org/jira/browse/SOLR-1553 Project: Solr Issue Type: New Feature Reporter: Yonik Seeley Fix For: 1.5 Attachments: edismax.unescapedcolon.bug.test.patch, SOLR-1553.patch, SOLR-1553.pf-refactor.patch An improved user-facing query parser based on dismax -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
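The behavior Hoss expected -- that FOO:BAR with an unknown field FOO is treated like the escaped literal FOO\:BAR -- can be sketched as a preprocessing step over a clause (purely illustrative; this is not the actual ExtendedDismaxQParser logic, which evidently parses such clauses to an empty query instead):

```java
import java.util.Set;

public class ColonEscapeDemo {
    // If the text before the first colon is not a known schema field,
    // escape the colon so the clause is treated as a literal term.
    static String normalizeClause(String clause, Set<String> knownFields) {
        int colon = clause.indexOf(':');
        if (colon > 0) {
            String maybeField = clause.substring(0, colon);
            if (!knownFields.contains(maybeField)) {
                return maybeField + "\\:" + clause.substring(colon + 1);
            }
        }
        return clause;
    }

    public static void main(String[] args) {
        Set<String> schema = Set.of("title", "body");
        System.out.println(normalizeClause("title:solr", schema));  // prints title:solr
        System.out.println(normalizeClause("FOO:BAR", schema));     // prints FOO\:BAR
    }
}
```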
[jira] Updated: (SOLR-1553) extended dismax query parser
[ https://issues.apache.org/jira/browse/SOLR-1553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-1553: --- Attachment: edismax.userFields.patch FWIW: initial steps towards adding a uf param to let users specify what field names can be specified explicitly in the query string, with optional default boosts to apply to those clauses ... not finished. extended dismax query parser Key: SOLR-1553 URL: https://issues.apache.org/jira/browse/SOLR-1553 Project: Solr Issue Type: New Feature Reporter: Yonik Seeley Fix For: 1.5 Attachments: edismax.unescapedcolon.bug.test.patch, edismax.userFields.patch, SOLR-1553.patch, SOLR-1553.pf-refactor.patch An improved user-facing query parser based on dismax -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (SOLR-1796) Lucene -dev versions should be in the SNAPSHOT apache maven repo.
[ https://issues.apache.org/jira/browse/SOLR-1796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-1796. Resolution: Not A Problem Lucene 2.9.1-dev jars were used by Solr trunk temporarily during the vote of Lucene 2.9.2, which is now official, and the jars have been switched. Any similar future instances would probably need to be dealt with by filing a Lucene issue to get the release candidate jars published to the maven repository. Lucene -dev versions should be in the SNAPSHOT apache maven repo. --- Key: SOLR-1796 URL: https://issues.apache.org/jira/browse/SOLR-1796 Project: Solr Issue Type: Task Components: Build Reporter: David Smiley Lucene 2.9.1 is out of course and in maven repos but the 2.9.1-dev as found in Solr's source control right now is not. This is pretty frustrating and I can only expect it will be a recurring problem. If Solr is going to use lucene -dev versions then I think Solr needs to put them in a repo somewhere. Apache's snapshot repo would make the most sense. FYI the repo manager is now managed by Nexus at this URL: https://repository.apache.org/index.html#nexus-search;quick~lucene -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1752) SolrJ fails with exception when passing document ADD and DELETEs in the same request using XML request writer (but not binary request writer)
[ https://issues.apache.org/jira/browse/SOLR-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12839909#action_12839909 ] Hoss Man commented on SOLR-1752: Long term, we could evolve the Solr XML Update format to allow both adds and deletes (and we probably should) but that seems like a separate issue. Given the current state of the XML syntax allowed, it does seem like there is a bug here in that SolrJ will attempt to send illegal XML when it gets an UpdateRequest that contains both adds and deletes. At a minimum SolrJ should notice when it's configured to use XML and the UpdateRequest contains mixed commands, and generate a more specific error message before ever attempting to format the commands as XML and send them to a server. It might conceivably make sense to convert the UpdateRequest into multiple server calls -- but i haven't thought that through very far and i'm not sure what that would entail (the error handling would probably be a bit tricky) SolrJ fails with exception when passing document ADD and DELETEs in the same request using XML request writer (but not binary request writer) - Key: SOLR-1752 URL: https://issues.apache.org/jira/browse/SOLR-1752 Project: Solr Issue Type: Bug Components: clients - java, update Affects Versions: 1.4 Reporter: Jayson Minard Assignee: Shalin Shekhar Mangar Priority: Blocker Add this test to SolrExampleTests.java and it will fail when using the XML Request Writer (now default), but not if you change the SolrExampleJettyTest to use the BinaryRequestWriter. 
{code} public void testAddDeleteInSameRequest() throws Exception { SolrServer server = getSolrServer(); SolrInputDocument doc3 = new SolrInputDocument(); doc3.addField( "id", "id3", 1.0f ); doc3.addField( "name", "doc3", 1.0f ); doc3.addField( "price", 10 ); UpdateRequest up = new UpdateRequest(); up.add( doc3 ); up.deleteById("id001"); up.setWaitFlush(false); up.setWaitSearcher(false); up.process( server ); } {code} terminates with exception: {code} Feb 3, 2010 8:55:34 AM org.apache.solr.common.SolrException log SEVERE: org.apache.solr.common.SolrException: Illegal to have multiple roots (start tag in epilog?). at [row,col {unknown-source}]: [1,125] at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:72) at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089) at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365) at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181) at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712) at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139) at org.mortbay.jetty.Server.handle(Server.java:285) at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502) at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835) at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:723) at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202) at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378) at 
org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226) at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442) Caused by: com.ctc.wstx.exc.WstxParsingException: Illegal to have multiple roots (start tag in epilog?). at [row,col {unknown-source}]: [1,125] at com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:630) at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:461) at com.ctc.wstx.sr.BasicStreamReader.handleExtraRoot(BasicStreamReader.java:2155) at com.ctc.wstx.sr.BasicStreamReader.nextFromProlog(BasicStreamReader.java:2070) at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2647) at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1019) at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:90) at
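The fail-fast check Hoss suggests -- detect a mixed add/delete UpdateRequest before serializing it to XML -- could look like this (a hypothetical helper over stand-in lists, not actual SolrJ code):

```java
import java.util.List;

public class MixedUpdateCheck {
    // Hypothetical stand-ins for the adds and deletes an UpdateRequest carries.
    static void validateForXml(List<String> docsToAdd, List<String> idsToDelete) {
        if (!docsToAdd.isEmpty() && !idsToDelete.isEmpty()) {
            // The XML update format allows only one root command element, so
            // fail with a clear message instead of sending illegal XML.
            throw new IllegalArgumentException(
                "XML request writer cannot send adds and deletes in one request; "
                + "split the request or use the binary request writer");
        }
    }

    public static void main(String[] args) {
        validateForXml(List.of("doc3"), List.of());   // ok: adds only
        try {
            validateForXml(List.of("doc3"), List.of("id001"));
        } catch (IllegalArgumentException expected) {
            System.out.println("rejected: " + expected.getMessage());
        }
    }
}
```

This is the minimum Hoss describes; splitting into multiple server calls would be the more ambitious alternative, with the error-handling complications he notes.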
[jira] Commented: (SOLR-1553) extended dismax query parser
[ https://issues.apache.org/jira/browse/SOLR-1553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12839914#action_12839914 ] Hoss Man commented on SOLR-1553: bq. What does u in uf stand for? user fields ... as in field names a user may refer to ... but it's not something i thought through too hard, as i said: work in progress. extended dismax query parser Key: SOLR-1553 URL: https://issues.apache.org/jira/browse/SOLR-1553 Project: Solr Issue Type: New Feature Reporter: Yonik Seeley Fix For: 1.5 Attachments: edismax.unescapedcolon.bug.test.patch, edismax.userFields.patch, SOLR-1553.patch, SOLR-1553.pf-refactor.patch An improved user-facing query parser based on dismax -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1750) SystemStatsRequestHandler - replacement for stats.jsp
[ https://issues.apache.org/jira/browse/SOLR-1750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12839928#action_12839928 ] Hoss Man commented on SOLR-1750: Committed revision 917812. I went ahead and committed the most recent attachment under the name SystemInfoRequestHandler with slightly generalized javadocs. Leaving the issue open so we make sure to settle the remaining issues before we release... * decide if we want to change the name * add default registration as part of the AdminRequestHandler (ie: /admin/info ?) * add some docs (didn't want to make a wiki page until we're certain of the name) * decide if we want to modify the response structure (should all of the top level info be encapsulated in a container?) SystemStatsRequestHandler - replacement for stats.jsp - Key: SOLR-1750 URL: https://issues.apache.org/jira/browse/SOLR-1750 Project: Solr Issue Type: Improvement Components: web gui Reporter: Erik Hatcher Assignee: Erik Hatcher Priority: Trivial Fix For: 1.5 Attachments: SystemStatsRequestHandler.java, SystemStatsRequestHandler.java, SystemStatsRequestHandler.java stats.jsp is cool and all, but suffers from escaping issues, and also is not accessible from SolrJ or other standard Solr APIs. Here's a request handler that emits everything stats.jsp does. For now, it needs to be registered in solrconfig.xml like this: {code} <requestHandler name="/admin/stats" class="solr.SystemStatsRequestHandler" /> {code} But will register this in AdminHandlers automatically before committing. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1750) SystemInfoRequestHandler - replacement for stats.jsp and registry.jsp
[ https://issues.apache.org/jira/browse/SOLR-1750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-1750: --- Summary: SystemInfoRequestHandler - replacement for stats.jsp and registry.jsp (was: SystemStatsRequestHandler - replacement for stats.jsp) SystemInfoRequestHandler - replacement for stats.jsp and registry.jsp - Key: SOLR-1750 URL: https://issues.apache.org/jira/browse/SOLR-1750 Project: Solr Issue Type: Improvement Components: web gui Reporter: Erik Hatcher Assignee: Erik Hatcher Priority: Trivial Fix For: 1.5 Attachments: SystemStatsRequestHandler.java, SystemStatsRequestHandler.java, SystemStatsRequestHandler.java stats.jsp is cool and all, but suffers from escaping issues, and also is not accessible from SolrJ or other standard Solr APIs. Here's a request handler that emits everything stats.jsp does. For now, it needs to be registered in solrconfig.xml like this: {code} <requestHandler name="/admin/stats" class="solr.SystemStatsRequestHandler" /> {code} But will register this in AdminHandlers automatically before committing. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1797) ConcurrentModificationException
[ https://issues.apache.org/jira/browse/SOLR-1797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12839943#action_12839943 ] Hoss Man commented on SOLR-1797: NOTE: Initial thread where Yonik had some more comments about how/where the concurrent modification can come from... http://old.nabble.com/ConcurrentModificationException-to27722422.html ConcurrentModificationException --- Key: SOLR-1797 URL: https://issues.apache.org/jira/browse/SOLR-1797 Project: Solr Issue Type: Bug Components: Build Affects Versions: 1.4, 1.5 Environment: Centos 5, Tomcat 6 Reporter: Dan Hertz Priority: Blocker SOLR 1.4 (final) and 1.5 nightly work fine on a Windows box, but on our Centos 5 box, we're getting a ConcurrentModificationException when starting Tomcat 6. Yonik Seeley asked me to start a JIRA bug report, mentioning that, It looks like resourceLoader.newInstance() is fundamentally not thread safe. (SOLR-USER) = = = Log Below: = = = INFO | jvm 1| 2010/02/24 21:27:04 | SEVERE: java.util.ConcurrentModificationException INFO | jvm 1| 2010/02/24 21:27:04 | at java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372) INFO | jvm 1| 2010/02/24 21:27:04 | at java.util.AbstractList$Itr.next(AbstractList.java:343) INFO | jvm 1| 2010/02/24 21:27:04 | at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:507) INFO | jvm 1| 2010/02/24 21:27:04 | at org.apache.solr.core.SolrCore.init(SolrCore.java:606) INFO | jvm 1| 2010/02/24 21:27:04 | at org.apache.solr.core.CoreContainer.create(CoreContainer.java:429) INFO | jvm 1| 2010/02/24 21:27:04 | at org.apache.solr.core.CoreContainer.load(CoreContainer.java:285) INFO | jvm 1| 2010/02/24 21:27:04 | at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:117) INFO | jvm 1| 2010/02/24 21:27:04 | at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:86) INFO | jvm 1| 2010/02/24 21:27:04 | at 
org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:275) INFO | jvm 1| 2010/02/24 21:27:04 | at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:397) INFO | jvm 1| 2010/02/24 21:27:04 | at org.apache.catalina.core.ApplicationFilterConfig.init(ApplicationFilterConfig.java:108) INFO | jvm 1| 2010/02/24 21:27:04 | at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3800) INFO | jvm 1| 2010/02/24 21:27:04 | at org.apache.catalina.core.StandardContext.start(StandardContext.java:4450) INFO | jvm 1| 2010/02/24 21:27:04 | at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:791) INFO | jvm 1| 2010/02/24 21:27:04 | at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:771) INFO | jvm 1| 2010/02/24 21:27:04 | at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:526) INFO | jvm 1| 2010/02/24 21:27:04 | at org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:630) INFO | jvm 1| 2010/02/24 21:27:04 | at org.apache.catalina.startup.HostConfig.deployDescriptors(HostConfig.java:556) INFO | jvm 1| 2010/02/24 21:27:04 | at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:491) INFO | jvm 1| 2010/02/24 21:27:04 | at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1206) INFO | jvm 1| 2010/02/24 21:27:04 | at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:314) INFO | jvm 1| 2010/02/24 21:27:04 | at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:119) INFO | jvm 1| 2010/02/24 21:27:04 | at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1053) INFO | jvm 1| 2010/02/24 21:27:04 | at org.apache.catalina.core.StandardHost.start(StandardHost.java:722) INFO | jvm 1| 2010/02/24 21:27:04 | at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1045) INFO | jvm 1| 2010/02/24 21:27:04 | at 
org.apache.catalina.core.StandardEngine.start(StandardEngine.java:443) INFO | jvm 1| 2010/02/24 21:27:04 | at org.apache.catalina.core.StandardService.start(StandardService.java:516) INFO | jvm 1| 2010/02/24 21:27:04 | at org.apache.catalina.core.StandardServer.start(StandardServer.java:710) INFO | jvm 1| 2010/02/24 21:27:04 | at org.apache.catalina.startup.Catalina.start(Catalina.java:583) INFO | jvm 1| 2010/02/24 21:27:04 | at
[jira] Commented: (SOLR-1743) error reporting is rendering 404 missing core name in path for all type of errors
[ https://issues.apache.org/jira/browse/SOLR-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12839948#action_12839948 ] Hoss Man commented on SOLR-1743: Mark: i'm confused by your comments/patch I applied your patch along with the schema.xml typo patch i posted above to Solr trunk (r917814) and still got missing core name in path when hitting http://localhost:8983/solr/admin/ I thought that since example/solr/conf/solrconfig.xml uses {{<abortOnConfigurationError>${solr.abortOnConfigurationError:true}</abortOnConfigurationError>}} it would fall into the situation you described as being fixed? (Did you maybe attach a different version of the patch than you meant to?) error reporting is rendering 404 missing core name in path for all type of errors --- Key: SOLR-1743 URL: https://issues.apache.org/jira/browse/SOLR-1743 Project: Solr Issue Type: Bug Components: Build Environment: all Reporter: Marcin Assignee: Mark Miller Fix For: 1.5 Attachments: SOLR-1743.patch despite the error in schema syntax or any other type of error you will always get: 404 missing core name in path communicate. cheers, /Marcin -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1802) Make Solr work with IndexReaderFactory implementations that return MultiReader
[ https://issues.apache.org/jira/browse/SOLR-1802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-1802: --- Description: When an IndexReaderFactory returns an instance of MultiReader, various places in Solr try to call reader.directory() and reader.getVersion(), which results in an UnsupportedOperationException. was: When an IndexReaderFactory returns an instance of MultiReader, Solr tries to call reader.directory() and reader.getVersion(), which results in an UnsupportedOperationException. Custom IndexReaderFactory implementations that return MultiReader instances are common, and I don't think there is documentation that discourages this. Issue Type: Improvement (was: Bug) Summary: Make Solr work with IndexReaderFactory implementations that return MultiReader (was: Solr is not friendly to IndexReaderFactory implementations that return MultiReader) editing issue summary to reflect that this is an improvement, not a bug. It was noted when IndexReaderFactory was added that using custom factories was incompatible with a lot of Solr features precisely because of the assumption about reader.directory()... CHANGES.txt when the API was introduced... {noformat} 59. SOLR-243: Add configurable IndexReaderFactory so that alternate IndexReader implementations can be specified via solrconfig.xml. Note that using a custom IndexReader may be incompatible with ReplicationHandler (see comments in SOLR-1366). This should be treated as an experimental feature. (Andrzej Bialecki, hossman, Mark Miller, John Wang) {noformat} example solrconfig.xml (the only place the feature is advertised)... {code} <!-- Use the following format to specify a custom IndexReaderFactory - allows for alternate IndexReader implementations. ** Experimental Feature ** Please note - Using a custom IndexReaderFactory may prevent certain other features from working. 
The API to IndexReaderFactory may change without warning or may even be removed from future releases if the problems cannot be resolved. ** Features that may not work with custom IndexReaderFactory ** The ReplicationHandler assumes a disk-resident index. Using a custom IndexReader implementation may cause incompatibility with ReplicationHandler and may cause replication to not work correctly. See SOLR-1366 for details. --> <indexReaderFactory name="IndexReaderFactory" class="package.class"> Parameters as required by the implementation </indexReaderFactory> {code} Make Solr work with IndexReaderFactory implementations that return MultiReader -- Key: SOLR-1802 URL: https://issues.apache.org/jira/browse/SOLR-1802 Project: Solr Issue Type: Improvement Components: search Affects Versions: 1.4 Reporter: John Wang When an IndexReaderFactory returns an instance of MultiReader, various places in Solr try to call reader.directory() and reader.getVersion(), which results in an UnsupportedOperationException. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1743) error reporting is rendering 404 missing core name in path for all type of errors
[ https://issues.apache.org/jira/browse/SOLR-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12840014#action_12840014 ] Hoss Man commented on SOLR-1743: No worries dude .. i don't really even understand how it worked before, let alone with your patch. Your latest version deals with my typo in the schema.xml example, but when testing out some other use cases it looks like the default assumption was that abortOnConfigurationError=true unless the solrconfig.xml can be parsed cleanly and sets it to false ... which means that in 1.4 a single-core malformed solrconfig.xml (ie: garbage in the prolog) would generate a good error message -- and with your latest patch it still generates the missing core name error. It seems like in order to preserve that behavior we need to use a three-valued state for CoreContainer.abortOnConfigurationError ... null assumes true until at least one solrconfig.xml is parsed cleanly, then false unless at least one config sets it to true. I'm also wondering if your patch breaks the purpose of CoreContainer.Initializer.setAbortOnConfigurationError ... i think the idea there was that prior to initializing the CoreContainer, Embedded Solr users could call that method to force abortOnConfigurationError even if it wasn't set in any of the solrconfig.xml files. error reporting is rendering 404 missing core name in path for all type of errors --- Key: SOLR-1743 URL: https://issues.apache.org/jira/browse/SOLR-1743 Project: Solr Issue Type: Bug Components: Build Environment: all Reporter: Marcin Assignee: Mark Miller Fix For: 1.5 Attachments: SOLR-1743.patch, SOLR-1743.patch, SOLR-1743.patch despite the error in schema syntax or any other type of error you will always get: 404 missing core name in path communicate. cheers, /Marcin -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
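The three-valued state Hoss proposes can be modeled with a nullable Boolean (a sketch of the proposed logic only, not the actual CoreContainer code): null means no solrconfig.xml has parsed cleanly yet, so assume abort; once any config parses, the flag is false unless some config explicitly asked for true.

```java
public class AbortFlagDemo {
    // null = nothing parsed cleanly yet (assume abort on error);
    // after that, true only if at least one config requested it.
    private Boolean abortOnConfigurationError = null;

    void onConfigParsed(boolean configRequestsAbort) {
        if (abortOnConfigurationError == null) {
            abortOnConfigurationError = configRequestsAbort;
        } else {
            abortOnConfigurationError = abortOnConfigurationError || configRequestsAbort;
        }
    }

    boolean shouldAbort() {
        // Before any config parses cleanly, err on the side of aborting
        // so the user sees a real error message instead of a 404.
        return abortOnConfigurationError == null || abortOnConfigurationError;
    }

    public static void main(String[] args) {
        AbortFlagDemo cc = new AbortFlagDemo();
        System.out.println(cc.shouldAbort());   // prints true: nothing parsed yet
        cc.onConfigParsed(false);
        System.out.println(cc.shouldAbort());   // prints false: parsed, none asked to abort
        cc.onConfigParsed(true);
        System.out.println(cc.shouldAbort());   // prints true: one config asked to abort
    }
}
```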
[jira] Updated: (SOLR-1743) error reporting is rendering 404 missing core name in path for all type of errors
[ https://issues.apache.org/jira/browse/SOLR-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-1743: --- Attachment: SOLR-1743.patch Ok, here's my attempt at making sense of this. As far as i can tell this restores all of the useful behavior that Solr 1.4 had with abortOnConfigurationError in single core mode ... some quick multicore testing makes me think it's improved the error reporting in some situations there as well, but i'm sure i haven't tried all of the edge cases -- it may have broken something. error reporting is rendering 404 missing core name in path for all type of errors --- Key: SOLR-1743 URL: https://issues.apache.org/jira/browse/SOLR-1743 Project: Solr Issue Type: Bug Components: Build Environment: all Reporter: Marcin Assignee: Mark Miller Fix For: 1.5 Attachments: SOLR-1743.patch, SOLR-1743.patch, SOLR-1743.patch, SOLR-1743.patch despite the error in schema syntax or any other type of error you will always get: 404 missing core name in path communicate. cheers, /Marcin -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1743) error reporting is rendering 404 missing core name in path for all type of errors
[ https://issues.apache.org/jira/browse/SOLR-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12840021#action_12840021 ] Hoss Man commented on SOLR-1743:

G... ignore that last patch, it changes the default behavior to be like abortOnConfigurationError=true for multicores even if no core ever asked for it ... which would be bad (in 1.4 those cores will all still load, but with this patch they won't). Still thinking about it.
[jira] Updated: (SOLR-1743) error reporting is rendering 404 missing core name in path for all type of errors
[ https://issues.apache.org/jira/browse/SOLR-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-1743:

Attachment: SOLR-1743.patch

I think i give up. First off: sorry mark, this comment was way off base...

bq. I'm also wondering if your patch breaks the purpose of CoreContainer.Initializer.setAbortOnConfigurationError ...

digging through the history i realized that this is how Initializer has always worked: you can set the default behavior for legacy single core mode, but whenever it sees a solr.xml file it overwrites that default value with false. This is fundamentally what's bitch slapping me at the moment ... the attached patch tries to mimic the historical behavior, and i think i saw it work (but i'm kinda cross-eyed right now so i can honestly say you shouldn't take my word for it -- i wouldn't) but it doesn't really address the fact that since the example now contains a solr.xml, anybody who starts with the Solr 1.5 example and makes a typo in their solrconfig.xml so that it's not well formed won't get a useful error message in the browser like they would in Solr 1.4.
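The tri-state default discussed in these comments can be sketched roughly like this. This is a hypothetical illustration (the class name and methods are invented, not Solr's actual CoreContainer code): null means "no solrconfig.xml has parsed cleanly yet" and is treated as true.

```java
// Hypothetical sketch of a tri-state abortOnConfigurationError default:
// null = nothing parsed cleanly yet (assume true); once configs parse,
// the flag is false unless at least one config asks for abort.
class AbortFlagTracker {
    private Boolean abortOnConfigError = null; // null = no clean parse yet

    // Called once per core whose solrconfig.xml parsed cleanly.
    void onConfigParsed(boolean coreWantsAbort) {
        if (abortOnConfigError == null) {
            abortOnConfigError = coreWantsAbort;
        } else {
            abortOnConfigError = abortOnConfigError || coreWantsAbort;
        }
    }

    // Effective value: assume true while no config has parsed cleanly,
    // so a malformed solrconfig.xml still produces a visible error.
    boolean shouldAbort() {
        return abortOnConfigError == null || abortOnConfigError;
    }
}
```

The short-circuit in `shouldAbort()` is what makes the garbage-in-the-prolog case described above fail loudly rather than silently.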
[jira] Commented: (SOLR-1743) error reporting is rendering 404 missing core name in path for all type of errors
[ https://issues.apache.org/jira/browse/SOLR-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12840045#action_12840045 ] Hoss Man commented on SOLR-1743:

Okay now i'm just going to rant... abortOnConfigurationError feels like it's just devolved into nonsense at this point ... the original purpose was to let people configure whether they wanted Solr to try to keep running even if something like a request handler couldn't be loaded -- set it to true and Solr wouldn't start up and the admin screen would tell you why; set it to false and Solr would work, but requests for that request handler would fail. Once we added multicore support, the usage of abortOnConfigurationError just stopped making sense ... if your solr.xml refers to just core1, and core1's solrconfig.xml sets it to false and has a request handler that can't be loaded, things keep working -- but if you also have a core2 whose solrconfig.xml sets it to true then the whole server won't start up ... that's just silly. Maybe it's just time to rethink the whole damn thing...
* deprecate the SolrConfig.SEVERE_ERRORS singleton - make SolrCore start keeping a personal list of exceptions it was able to get past (ie: a plugin it couldn't load)
* eliminate Initializer.isAbortOnConfigurationError - instead make each SolrCore keep track of that itself
* if initializing a core throws an exception (either from parsing the config, or from instantiating the SolrCore or IndexSchema) CoreContainer should keep track of that exception as being specific to that core name (Map<String,Exception>)
** removing a core, or recreating a core with the same name, should clear any corresponding entry from this map
* when SolrDispatchFilter processes a path, it should generate a useful error message in either case:
** CoreContainer says it has an init exception for the core name that corresponds to that path
** the SolrCore exists, has isAbortOnConfigurationError()=true, and has a non-empty list of exceptions

...thoughts?
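The per-core failure tracking proposed in that list could look roughly like the following. This is a hypothetical sketch (class and method names invented, not the eventual patch): the container remembers an init exception per core name and clears it when the core is removed or successfully re-created, so a dispatch filter could report something better than "404 missing core name in path".

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-core init-failure registry along the lines proposed
// above: Map<String,Exception> keyed by core name.
class CoreInitFailures {
    private final Map<String, Exception> initFailures = new HashMap<>();

    // Core init threw: remember the exception for its core name.
    void recordFailure(String coreName, Exception e) {
        initFailures.put(coreName, e);
    }

    // Removing a core, or re-creating one with the same name,
    // clears any stale failure entry.
    void coreRegistered(String coreName) {
        initFailures.remove(coreName);
    }

    // What a dispatch filter could consult to build a useful error page
    // for a request path, instead of a generic 404.
    Exception failureFor(String coreName) {
        return initFailures.get(coreName);
    }
}
```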
[jira] Created: (SOLR-1792) Document peculiar behavior of TestHarness.LocalRequestFactory
Document peculiar behavior of TestHarness.LocalRequestFactory

Key: SOLR-1792 URL: https://issues.apache.org/jira/browse/SOLR-1792 Project: Solr Issue Type: Improvement Affects Versions: 1.4, 1.3, 1.2, 1.1.0 Reporter: Hoss Man Assignee: Hoss Man Fix For: 1.5

While working on a test case, i realized that due to method evolution, TestHarness.LocalRequestFactory.makeRequest has some really odd behavior that results in the defaults the factory was configured with being ignored when the method is called with multiple varargs. I spent some time attempting to fix this by adding the defaults to the end of the params, but then discovered that this breaks existing tests because the LRF defaults take precedence over defaults that may be hardcoded into the solrconfig.xml. The internal test might be changed to work around this, but i didn't want to risk breaking tests for users who might be using TestHarness directly. So this bug is just to track improving the documentation of what exactly LRF.makeRequest does with its input.
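The kind of varargs surprise described here can be illustrated generically. This is not Solr's actual TestHarness code; it is an invented `RequestFactory` showing how defaults configured on a factory can be applied by one overload and silently skipped by the varargs one:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Generic illustration (hypothetical, not Solr's TestHarness) of the
// varargs pitfall: configured defaults apply only to the no-arg call.
class RequestFactory {
    private final Map<String, String> defaults = new LinkedHashMap<>();

    RequestFactory(Map<String, String> defaults) {
        this.defaults.putAll(defaults);
    }

    // No-args call: the configured defaults are used.
    Map<String, String> makeRequest() {
        return new LinkedHashMap<>(defaults);
    }

    // Varargs call: defaults are ignored entirely -- the surprising part.
    Map<String, String> makeRequest(String... kv) {
        Map<String, String> params = new LinkedHashMap<>();
        for (int i = 0; i + 1 < kv.length; i += 2) {
            params.put(kv[i], kv[i + 1]);
        }
        return params;
    }
}
```

As the issue notes, "fixing" this by appending defaults to the varargs path changes precedence against config-file defaults, which is why documenting the behavior was chosen over changing it.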
[jira] Updated: (SOLR-1792) Document peculiar behavior of TestHarness.LocalRequestFactory
[ https://issues.apache.org/jira/browse/SOLR-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-1792:

Attachment: SOLR-1792.patch

patch ... i would have already committed this but SVN seems to be down.
[jira] Resolved: (SOLR-1792) Document peculiar behavior of TestHarness.LocalRequestFactory
[ https://issues.apache.org/jira/browse/SOLR-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-1792.

Resolution: Fixed

Committed revision 915637.
[jira] Resolved: (SOLR-1776) dismax should treat schema's default field as a default qf
[ https://issues.apache.org/jira/browse/SOLR-1776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-1776.

Resolution: Fixed Fix Version/s: 1.5 Assignee: Hoss Man

Committed revision 915646.

dismax should treat schema's default field as a default qf

Key: SOLR-1776 URL: https://issues.apache.org/jira/browse/SOLR-1776 Project: Solr Issue Type: Improvement Reporter: Hoss Man Assignee: Hoss Man Fix For: 1.5

the DismaxQParser is completely useless w/o specifying the qf param, but for the life of me i can't think of any good reason why it shouldn't use IndexSchema.getDefaultSearchFieldName() as the default value for the qf param.
[jira] Updated: (SOLR-1776) dismax and edismax should treat schema's default field as a default qf
[ https://issues.apache.org/jira/browse/SOLR-1776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-1776:

Description: the DismaxQParser (and ExtendedDismaxQParser) is completely useless w/o specifying the qf param, but for the life of me i can't think of any good reason why it shouldn't use IndexSchema.getDefaultSearchFieldName() as the default value for the qf param. (was: the DismaxQParser is completely useless w/o specifying the qf param, but for the life of me i can't think of any good reason why it shouldn't use IndexSchema.getDefaultSearchFieldName() as the default value for the qf param.)

Summary: dismax and edismax should treat schema's default field as a default qf (was: dismax should treat schema's default field as a default qf)
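The fallback being implemented for SOLR-1776 amounts to a small resolution rule. This is a hypothetical sketch (invented helper, not the committed parser code): use the qf param when present, otherwise fall back to the schema's default search field instead of failing.

```java
// Hypothetical sketch of the qf fallback described in SOLR-1776:
// an explicit qf param wins; otherwise the schema's
// defaultSearchField is used as the qf value.
class QfResolver {
    static String resolveQf(String qfParam, String schemaDefaultField) {
        if (qfParam != null && !qfParam.trim().isEmpty()) {
            return qfParam;
        }
        return schemaDefaultField;
    }
}
```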
[jira] Commented: (SOLR-1687) add param for limiting start and rows params
[ https://issues.apache.org/jira/browse/SOLR-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12836930#action_12836930 ] Hoss Man commented on SOLR-1687:

bq. Traditionally, this type of stuff is delegated to the front-end clients to restrict.

True, but my suggestion wasn't so much along the lines of end users entering really big numbers as much as that client developers might make mistakes, and this would allow a solr admin to lock things down in a sane way.

bq. Would it make more sense to add an optional component to check restrictions? The restrictions could optionally be in the config for the component and thus wouldn't have to be looked up and parsed for every request.

I like this idea, but given the way local versions of start/rows are treated specially, wouldn't we still need special handling like what i added in the patch to deal with them? (a generic component added to the front of the list could check/validate a list of global params, but it wouldn't have any way of knowing for certain what other params later components might parse with a QParser.)

add param for limiting start and rows params

Key: SOLR-1687 URL: https://issues.apache.org/jira/browse/SOLR-1687 Project: Solr Issue Type: Improvement Reporter: Hoss Man Attachments: SOLR-1687.patch

conventional wisdom is that it doesn't make sense to paginate with huge pages, or to drill down deep into high numbered pages -- features like faceting tend to be a better UI experience, and less intensive on solr. At the moment, Solr administrators can use invariant params to hardcode the rows param to something reasonable, but unless they only want to allow users to look at page one, they can't do much to lock down the start param except enforce these rules in the client code. we should add new params that set an upper bound on both of these, which can then be specified as default/invariant params in solrconfig.xml
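The upper-bound idea in SOLR-1687 is essentially a clamp applied server-side. A minimal sketch, assuming hypothetical limit values configured by an administrator (the class and method names are invented, not the patch's actual params):

```java
// Hypothetical sketch of administrator-configured upper bounds on the
// start and rows paging params, as proposed in SOLR-1687: requested
// values are clamped into [0, max] rather than passed through.
class PagingLimits {
    private final int maxStart;
    private final int maxRows;

    PagingLimits(int maxStart, int maxRows) {
        this.maxStart = maxStart;
        this.maxRows = maxRows;
    }

    int clampStart(int requested) {
        return Math.min(Math.max(requested, 0), maxStart);
    }

    int clampRows(int requested) {
        return Math.min(Math.max(requested, 0), maxRows);
    }
}
```

Clamping (versus rejecting the request) is one design choice; the issue leaves open whether out-of-range values should instead produce an error.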
[jira] Commented: (SOLR-1772) UpdateProcessor to prune empty values
[ https://issues.apache.org/jira/browse/SOLR-1772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12836933#action_12836933 ] Hoss Man commented on SOLR-1772:

Actually my point is that the new FieldTypes make it *more* of an issue (in the eyes of end users) because now Solr errors out on empty (numeric) field values ... having an UpdateProcessor like this would be an easy solution for people who just want a simple way to tell Solr to ignore empty fields (with certain names, or certain types).

UpdateProcessor to prune empty values

Key: SOLR-1772 URL: https://issues.apache.org/jira/browse/SOLR-1772 Project: Solr Issue Type: Wish Reporter: Hoss Man

Users seem to frequently get confused when some FieldTypes (typically the numeric ones) complain about invalid field values when they inadvertently index an empty string. It would be cool to provide an UpdateProcessor that makes it easy to strip out any fields being added as empty values ... it could be configured using field (and/or field type) names or globs to select/ignore certain fields -- i haven't thought it through all that hard.
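The core of the wished-for processor is simple to sketch. This is a hypothetical simplification that treats a document as a flat map of string fields (real Solr update processors work on SolrInputDocument, which is not modeled here): drop any field whose value is empty before it can reach a numeric FieldType that would reject it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the empty-value pruning wished for in
// SOLR-1772, on a simplified map-based document model: fields whose
// value is null or blank are removed before indexing.
class EmptyValuePruner {
    static Map<String, String> prune(Map<String, String> doc) {
        Map<String, String> out = new LinkedHashMap<>(doc);
        out.values().removeIf(v -> v == null || v.trim().isEmpty());
        return out;
    }
}
```

The issue additionally suggests restricting pruning to certain field names, types, or globs; that selection step is omitted here.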
[jira] Updated: (SOLR-1786) Solr (trunk rev. 912116) suffers from PDFBOX-537 [Endless loop in org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionary()] fixed in PDFbox 1.0?
[ https://issues.apache.org/jira/browse/SOLR-1786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-1786:

Fix Version/s: 1.5

marking Fix for 1.5 -- we shouldn't release w/o either moving forward or rolling back the version we use. (FYI: our PDFBox dependency is based on the Tika dependency)

Solr (trunk rev. 912116) suffers from PDFBOX-537 [Endless loop in org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionary()] fixed in PDFbox 1.0?

Key: SOLR-1786 URL: https://issues.apache.org/jira/browse/SOLR-1786 Project: Solr Issue Type: Bug Components: contrib - Solr Cell (Tika extraction) Affects Versions: 1.5 Environment: Ubuntu 9.10, 32bit Reporter: Jan Iwaszkiewicz Priority: Critical Fix For: 1.5

I tried indexing several thousand PDF documents but could not finish as Solr was falling into an endless loop for some of them, for instance: http://cdsweb.cern.ch/record/702585/files/sl-note-2000-019.pdf (the PDF seems OK). Can Solr start using PDFbox 1.0?
[jira] Commented: (SOLR-1695) Misleading error message when adding docs with missing/multiple value(s) for uniqueKey field
[ https://issues.apache.org/jira/browse/SOLR-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12835334#action_12835334 ] Hoss Man commented on SOLR-1695:

Doh! Note to self: don't just run the tests, remember to look at the results as well.

The DocumentBuilderTest failures make sense: they use a schema with a uniqueKey defined, but add docs w/o that field to test other behaviors of toDocument. They passed prior to this change because they only tested the toDocument method in isolation, and the test for a missing uniqueKey was missing from that method. I think it's safe to consider these tests broken as written, since toDocument does do schema validation -- it just wasn't doing the uniqueKey validation before. So i'll modify those tests to include a value for the uniqueKey field.

The ConvertedLegacyTest failure confuses me though ... it also adds docs w/o a uniqueKey field even though the schema requires one, but they do full adds so it's not obvious from the surface why it was ever passing before ... i want to think about that a little more before just 'fixing' the test -- it may be masking another bug.

Misleading error message when adding docs with missing/multiple value(s) for uniqueKey field

Key: SOLR-1695 URL: https://issues.apache.org/jira/browse/SOLR-1695 Project: Solr Issue Type: Improvement Reporter: Hoss Man Assignee: Hoss Man Priority: Minor Fix For: 1.5

Sometimes users don't seem to notice/understand the <uniqueKey/> declaration in the example schema, and the error message they get if their documents don't include that field is confusing...

{code} org.apache.solr.common.SolrException: Document [null] missing required field: id {code}

...because they get an almost identical error even if they remove {{required=true}} from {{<field name=id />}} in their schema.xml file. We should improve the error message so it's clear when a Document is missing the uniqueKeyField (not just a required field) so they know the terminology to look for in diagnosing the problem. http://old.nabble.com/solr-1.4-csv-import-Document-missing-required-field%3A-id-to26990048.html#a26990779
[jira] Commented: (SOLR-1695) Misleading error message when adding docs with missing/multiple value(s) for uniqueKey field
[ https://issues.apache.org/jira/browse/SOLR-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12835364#action_12835364 ] Hoss Man commented on SOLR-1695:

Hmmm, ok, so the reason the legacy test passed prior to this change is that DirectUpdateHandler2 (and DirectUpdateHandler from what i can tell) don't bother checking for a uniqueKey (or for multiple uniqueKeys) if allowDups=true (which it is in the line of ConvertedLegacyTest that's failing).

So the question becomes: is it a bug that DUH(2) allow docs w/o a uniqueKey field just because allowDups=true? If it's not a bug, then this entire patch should probably be rolled back -- but personally it feels like it really is a bug: if a schema declares a uniqueKey field, then just because a particular add command says allowDups=true doesn't mean that docs w/o an id (or with multiple ids) should be allowed into the index -- those docs will need meaningful ids if/when a later commit does want to override them (consider the case of doing an initial build w/ allowDups=true for speed, and then incremental updates w/ allowDups=false ... the index needs to be internally consistent).

Actually: I'm just going to roll this entire patch back either way -- we can improve the error messages generated by DirectUpdateHandler2 and eliminate the redundant uniqueKey check in DocumentBuilder.toDocument. As a separate issue we can consider whether DUH2 is buggy.
[jira] Commented: (SOLR-1695) Misleading error message when adding docs with missing/multiple value(s) for uniqueKey field
[ https://issues.apache.org/jira/browse/SOLR-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12835373#action_12835373 ] Hoss Man commented on SOLR-1695:

bq. schema.xml does not require the id field, and the failing add explicitly says allowDups=false (legacy speak for overwrite=false)

...it doesn't require id but it does declare id as the uniqueKey field ... even if it's allowing dups, shouldn't it ensure that the doc has one and only one value for the uniqueKey field?
[jira] Resolved: (SOLR-1695) Misleading error message when adding docs with missing/multiple value(s) for uniqueKey field
[ https://issues.apache.org/jira/browse/SOLR-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-1695.

Resolution: Fixed

Committed revision 911595. Rolled back the changes to DocumentBuilder and improved the existing error messages in UpdateHandler instead.
[jira] Created: (SOLR-1780) existence of exactly one value for uniqueKey field is not checked when overwrite=false or allowDups=true
existence of exactly one value for uniqueKey field is not checked when overwrite=false or allowDups=true

Key: SOLR-1780 URL: https://issues.apache.org/jira/browse/SOLR-1780 Project: Solr Issue Type: Bug Reporter: Hoss Man

As noted in SOLR-1695, in DirectUpdateHandler(2), when a document is added, the uniqueKey field is only asserted to contain exactly one value if overwrite=true. If overwrite=false (or allowDups=true) then the uniqueKey field is not checked at all.
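The missing check described here is easy to state as code. A minimal sketch on a simplified multi-valued document model (the class name and map-based model are hypothetical, not DirectUpdateHandler2's actual representation): regardless of allowDups/overwrite, a document should carry exactly one value for the uniqueKey field.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the check SOLR-1780 says is skipped when
// overwrite=false: exactly one value for the uniqueKey field, always.
class UniqueKeyCheck {
    static void assertOneUniqueKey(Map<String, List<String>> doc,
                                   String uniqueKeyField) {
        List<String> vals = doc.get(uniqueKeyField);
        int n = (vals == null) ? 0 : vals.size();
        if (n != 1) {
            throw new IllegalArgumentException(
                "Document must have exactly one value for uniqueKey field '"
                + uniqueKeyField + "' but has " + n);
        }
    }
}
```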
[jira] Commented: (SOLR-1777) fields with sortMissingLast don't sort correctly
[ https://issues.apache.org/jira/browse/SOLR-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12835424#action_12835424 ] Hoss Man commented on SOLR-1777:

Yonik: just to verify: was this bug introduced in Solr 1.4? ... presumably because of the changes to per-segment collecting? (that's the way the Affects Version/s is marked, but i want to sanity check in case it was actually a more fundamental problem affecting earlier versions of Solr as well).

fields with sortMissingLast don't sort correctly

Key: SOLR-1777 URL: https://issues.apache.org/jira/browse/SOLR-1777 Project: Solr Issue Type: Bug Components: search Affects Versions: 1.4 Reporter: Yonik Seeley Assignee: Yonik Seeley Priority: Critical Fix For: 1.5 Attachments: SOLR-1777.patch, SOLR-1777.patch

field types with the sortMissingLast=true attribute can have results sorted incorrectly.
[jira] Commented: (SOLR-1365) Add configurable Sweetspot Similarity factory
[ https://issues.apache.org/jira/browse/SOLR-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12834938#action_12834938 ] Hoss Man commented on SOLR-1365:

The constraints on what can be SolrCoreAware exist for two main reasons:

# to ensure some sanity in initialization ... one of the main reasons the SolrCoreAware interface was needed in the first place was because some plugins wanted to use the SolrCore to get access to other plugins during their initialization -- but those other components weren't necessarily initialized yet. with the inform(SolrCore) method, SolrCoreAware plugins know that all other components have been initialized, but they haven't necessarily been informed about the SolrCore, so they might not be ready to deal with other plugins yet ... it's generally just a big initialization-cluster-fuck, so the fewer classes involved the better
# prevent too much pollution of the SolrCore API. having direct access to the SolrCore is a big deal -- once you have a reference to the core, you can get to pretty much anything, which opens us (ie: Solr maintainers) up to a lot of crazy code paths to worry about -- so the fewer plugin types that we need to consider when making changes to SolrCore the better.

In the case of SimilarityFactory, i'm not entirely sure how i feel about making it SolrCoreAware(able) ... we have tried really, REALLY hard to make sure nothing initialized as part of the IndexSchema can be SolrCoreAware because it opens up the possibility of plugin behavior being affected by SolrCore configuration which might be different between master and slave machines -- which could produce disastrous results. a schema.xml needs to be internally consistent regardless of what solrconfig.xml might reference it.

In this case the real issue isn't that we have a use case where SimilarityFactory _needs_ access to SolrCore -- what it wants access to is the IndexSchema, so it might make sense to just provide access to that in some way w/o having to expose the entire SolrCore.

Practically speaking, after re-skimming the patch: I'm not even convinced that would really add anything. refactoring/reusing some of the *code* that IndexSchema uses to manage dynamicFields might be handy for the SweetSpotSimilarityFactory, but i don't actually see how being able to inspect the IndexSchema to get the list of dynamicFields (or find out if a field is dynamic) would make it any better or easier to use. We'd still want people to configure it with field names and field name globs directly, because there won't necessarily be a one-to-one correspondence between what fields are dynamic in the schema and how you want the sweetspots defined ... you might have a generic en_* dynamicField in your schema for english text, and an fr_* dynamicField for french text, but that doesn't mean the sweetspot for all fr_* fields will be the same ... you are just as likely to want some very specific field names to have their own sweetspot, or to have the sweetspot be suffix based (ie: *_title could have one sweetspot even if the resulting field names are fr_title and en_title).

I think the patch could be improved, and i think there is definitely some code reuse possibility for parsing the field name globs, but i don't know that it really needs run time access to the IndexSchema (and it definitely doesn't need access to the SolrCore).

Add configurable Sweetspot Similarity factory

Key: SOLR-1365 URL: https://issues.apache.org/jira/browse/SOLR-1365 Project: Solr Issue Type: New Feature Affects Versions: 1.3 Reporter: Kevin Osborn Priority: Minor Fix For: 1.5 Attachments: SOLR-1365.patch

This is some code that I wrote a while back. Normally, if you use SweetSpotSimilarity, you are going to make it do something useful by extending SweetSpotSimilarity. So, instead, I made a factory class and a configurable SweetSpotSimilarity. There are two classes. SweetSpotSimilarityFactory reads the parameters from schema.xml. It then creates an instance of VariableSweetSpotSimilarity, which is my custom SweetSpotSimilarity class. In addition to the standard functions, it also handles dynamic fields. So, in schema.xml, you could have something like this:

<similarity class="org.apache.solr.schema.SweetSpotSimilarityFactory">
  <bool name="useHyperbolicTf">true</bool>
  <float name="hyperbolicTfFactorsMin">1.0</float>
  <float name="hyperbolicTfFactorsMax">1.5</float>
  <float name="hyperbolicTfFactorsBase">1.3</float>
  <float name="hyperbolicTfFactorsXOffset">2.0</float>
  <int name="lengthNormFactorsMin">1</int>
  <int name="lengthNormFactorsMax">1</int>
  <float name="lengthNormFactorsSteepness">0.5</float>
  <int name="lengthNormFactorsMin_description">2</int>
  <int
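The field-name glob matching discussed above (the en_*, fr_*, *_title examples) can be sketched as follows. This is a hypothetical helper, not the patch's actual code; it supports a single leading or trailing '*' in the style of dynamicField patterns:

```java
// Hypothetical sketch of field-name glob matching for per-field
// sweetspot configuration: one leading or trailing '*' wildcard,
// otherwise an exact name match.
class FieldGlob {
    static boolean matches(String glob, String fieldName) {
        if (glob.equals("*")) {
            return true;
        }
        if (glob.startsWith("*")) {
            return fieldName.endsWith(glob.substring(1));
        }
        if (glob.endsWith("*")) {
            return fieldName.startsWith(glob.substring(0, glob.length() - 1));
        }
        return glob.equals(fieldName);
    }
}
```

Note how a suffix glob like *_title matches both fr_title and en_title, independently of which dynamicField patterns exist in the schema, which is the decoupling the comment argues for.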
[jira] Commented: (SOLR-741) Add support for rounding dates in DateField
[ https://issues.apache.org/jira/browse/SOLR-741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12835008#action_12835008 ] Hoss Man commented on SOLR-741: --- bq. With the introduction of Trie fields is it not irrelevant now? can we close it TrieFields make it more efficient to do range searches on numeric fields indexed at full precision, but it doesn't actually do anything to round the fields for people who genuinely want their stored and index values to only have second/minute/hour/day precision regardless of what the initial raw data looks like. So while TrieFields definitely make this less of a priority from a performance standpoint, it doesn't solve the full problem. (Unless i'm missing something, actually rounding the values prior to indexing will still help improve performance in general because it will reduce the total number of Terms ... with TrieFields isn't the original value always indexed regardless of the precisionStep?) Add support for rounding dates in DateField --- Key: SOLR-741 URL: https://issues.apache.org/jira/browse/SOLR-741 Project: Solr Issue Type: Improvement Components: search Affects Versions: 1.4 Reporter: Shalin Shekhar Mangar Priority: Minor Fix For: 1.5 As discussed at http://www.nabble.com/Rounding-date-fields-td19203108.html Since rounding dates to a coarse value is an often recommended solution to decrease the number of unique terms, we should add support for doing this in DateField itself. A number of syntaxes were proposed, some of them were: # <fieldType name="date" class="solr.DateField" sortMissingLast="true" omitNorms="true" roundTo="-1MINUTE" /> (Shalin) # <fieldType name="date" class="solr.DateField" sortMissingLast="true" omitNorms="true" round="DOWN_MINUTE" /> (Otis) Hoss proposed more general enhancements related to arbitrary pre-processing of values prior to indexing/storing using pre-processing analyzers. 
This issue aims to build a consensus on the solution to pursue and to implement that solution inside Solr. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
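For reference, the rounding itself is simple arithmetic. This sketch is not Solr's DateMathParser and the names are hypothetical; it just rounds an epoch-millis timestamp down to a coarser precision, which is the operation any of the proposed syntaxes would perform before indexing/storing.

```java
import java.util.concurrent.TimeUnit;

public class DateRounding {
    // Round an epoch-millis timestamp down to the given precision,
    // analogous to applying "/MINUTE" style date math before indexing.
    // floorMod keeps the behavior correct for pre-1970 (negative) timestamps too.
    public static long roundDown(long epochMillis, TimeUnit unit) {
        long unitMillis = unit.toMillis(1);
        return epochMillis - Math.floorMod(epochMillis, unitMillis);
    }
}
```

Rounding at index time is what shrinks the term dictionary: every timestamp inside the same minute collapses to a single indexed term.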
[jira] Resolved: (SOLR-270) dismax handler should not log a warning when sort by score desc is specified
[ https://issues.apache.org/jira/browse/SOLR-270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-270. --- Resolution: Fixed This was fixed at some point as a side effect of some other change. dismax handler should not log a warning when sort by score desc is specified Key: SOLR-270 URL: https://issues.apache.org/jira/browse/SOLR-270 Project: Solr Issue Type: Bug Reporter: Hoss Man Priority: Minor http://localhost:8983/solr/select/?indent=on&q=video&sort=score+desc&qt=dismax causes a warning to be logged... WARNING: Invalid sort score desc was specified, ignoring ..because of some eccentricities in how the getSort method works ... this warning is distracting and misleading ... but only in the case where score desc is used ... it should still be generated for a truly invalid sort. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (SOLR-1679) SolrCore.execute should wrap log message construction in if (log.isInfoEnabled())
[ https://issues.apache.org/jira/browse/SOLR-1679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-1679. Resolution: Fixed Fix Version/s: 1.5 Assignee: Hoss Man Committed revision 911216. Thanks for the suggestion Fuad. SolrCore.execute should wrap log message construction in if (log.isInfoEnabled()) --- Key: SOLR-1679 URL: https://issues.apache.org/jira/browse/SOLR-1679 Project: Solr Issue Type: Improvement Reporter: Hoss Man Assignee: Hoss Man Fix For: 1.5 Attachments: SOLR-1679.patch As mentioned by Fuad on solr-user, there is some non-trivial log message construction happening in SolrCore.execute that should be wrapped in if (log.isInfoEnabled()) ... http://old.nabble.com/SOLR-Performance-Tuning%3A-Disable-INFO-Logging.-to26866730.html#a26866943 ...the warn level message in that same method could probably also be wrapped since it does some large string building as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
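The pattern this issue asks for, expressed here with java.util.logging for self-containment (Solr itself uses SLF4J's log.isInfoEnabled(); the method and message format below are illustrative, not SolrCore's actual code): the guard ensures the string concatenation is skipped entirely when INFO is disabled.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class GuardedLogging {
    private static final Logger log = Logger.getLogger(GuardedLogging.class.getName());

    // The potentially expensive part: building a large request-summary string.
    static String buildMessage(String path, String params, int hits, long elapsedMs) {
        return path + " params={" + params + "} hits=" + hits + " QTime=" + elapsedMs;
    }

    // The guard: buildMessage is never called when INFO logging is disabled,
    // so turning logging down really does avoid the string-building cost.
    public static void logRequest(String path, String params, int hits, long elapsedMs) {
        if (log.isLoggable(Level.INFO)) {
            log.info(buildMessage(path, params, hits, elapsedMs));
        }
    }
}
```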
[jira] Resolved: (SOLR-1695) Missleading error message when uniqueKey is field is missing
[ https://issues.apache.org/jira/browse/SOLR-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-1695. Resolution: Fixed Fix Version/s: 1.5 Assignee: Hoss Man Committed revision 911228. Committed revision 911232. I added explicit checks for the number of uniqueKey values being != 1 early on in DocumentBuilder.toDocument. Prior to this, multiple values weren't checked for until the doc made it all the way to the UpdateHandler. Missleading error message when uniqueKey is field is missing - Key: SOLR-1695 URL: https://issues.apache.org/jira/browse/SOLR-1695 Project: Solr Issue Type: Improvement Reporter: Hoss Man Assignee: Hoss Man Priority: Minor Fix For: 1.5 Sometimes users don't seem to notice/understand the <uniqueKey/> declaration in the example schema, and the error message they get if their documents don't include that field is confusing... {code} org.apache.solr.common.SolrException: Document [null] missing required field: id {code} ...because they get an almost identical error even if they remove {{required="true"}} from {{<field name="id" />}} in their schema.xml file. We should improve the error message so it's clear when a Document is missing the uniqueKeyField (not just a required field) so they know the terminology to look for in diagnosing the problem. http://old.nabble.com/solr-1.4-csv-import-Document-missing-required-field%3A-id-to26990048.html#a26990779 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1695) Missleading error message when adding docs with missing/multiple value(s) for uniqueKey field
[ https://issues.apache.org/jira/browse/SOLR-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-1695: --- Summary: Missleading error message when adding docs with missing/multiple value(s) for uniqueKey field (was: Missleading error message when uniqueKey is field is missing) revising summary Missleading error message when adding docs with missing/multiple value(s) for uniqueKey field -- Key: SOLR-1695 URL: https://issues.apache.org/jira/browse/SOLR-1695 Project: Solr Issue Type: Improvement Reporter: Hoss Man Assignee: Hoss Man Priority: Minor Fix For: 1.5 Sometimes users don't seem to notice/understand the <uniqueKey/> declaration in the example schema, and the error message they get if their documents don't include that field is confusing... {code} org.apache.solr.common.SolrException: Document [null] missing required field: id {code} ...because they get an almost identical error even if they remove {{required="true"}} from {{<field name="id" />}} in their schema.xml file. We should improve the error message so it's clear when a Document is missing the uniqueKeyField (not just a required field) so they know the terminology to look for in diagnosing the problem. http://old.nabble.com/solr-1.4-csv-import-Document-missing-required-field%3A-id-to26990048.html#a26990779 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1687) add param for limiting start and rows params
[ https://issues.apache.org/jira/browse/SOLR-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12835075#action_12835075 ] Hoss Man commented on SOLR-1687: Hmmm... QParser.getSort is where the current sort/start/rows param parsing happens right now, but looking at it makes me realize there's some local params semantics to consider with something like this. Currently, QParser.getSort won't consult the global params if any of sort/start/rows are specified as a local param (or if the caller explicitly says useGlobalParams=false, but there doesn't seem to be a code path where that happens) but what should happen in these situations... {code} #1) q={!lucene rows.max=99 rows=}foo&rows.max=100 #2) q={!lucene rows.max=100 v=$qq}&qq=foo&rows=999&rows.max=999 {code} situation #1 could come up if a greedy client attempted to ask for too many rows, and the admin has a configured invariant of rows.max=100 -- in which case we'd want the global rows.max param to supersede the local rows param. But situation #2 is equally possible where the q param is an invariant set by the admin, and the other params come from a greedy client. The best solution i can think of off the top of my head would be to ignore local param values for start.max and rows.max, and look for them as global params even if false==useGlobalParams. That takes care of situation #1, and makes situation #2 easy to deal with by also adding rows.max=100 as an invariant outside of the local params. Anyone see any holes in that? add param for limiting start and rows params Key: SOLR-1687 URL: https://issues.apache.org/jira/browse/SOLR-1687 Project: Solr Issue Type: Improvement Reporter: Hoss Man conventional wisdom is that it doesn't make sense to paginate with huge pages, or to drill down deep into high numbered pages -- features like faceting tend to be a better UI experience, and less intensive on solr. 
At the moment, Solr administrators can use invariant params to hardcode the rows param to something reasonable, but unless they only want to allow users to look at page one, they can't do much to lock down the start param except enforce these rules in the client code. we should add new params that set an upper bound on both of these, which can then be specified as default/invariant params in solrconfig.xml -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1687) add param for limiting start and rows params
[ https://issues.apache.org/jira/browse/SOLR-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-1687: --- Attachment: SOLR-1687.patch patch with the logic i attempted to describe. it doesn't contain any Unit Tests yet, but it seems to be working. the real question is: are there any holes i haven't plugged in the local/global param handling logic that a greedy client could exploit? add param for limiting start and rows params Key: SOLR-1687 URL: https://issues.apache.org/jira/browse/SOLR-1687 Project: Solr Issue Type: Improvement Reporter: Hoss Man Attachments: SOLR-1687.patch conventional wisdom is that it doesn't make sense to paginate with huge pages, or to drill down deep into high numbered pages -- features like faceting tend to be a better UI experience, and less intensive on solr. At the moment, Solr administrators can use invariant params to hardcode the rows param to something reasonable, but unless they only want to allow users to look at page one, they can't do much to lock down the start param except enforce these rules in the client code. we should add new params that set an upper bound on both of these, which can then be specified as default/invariant params in solrconfig.xml -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
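The core of the proposed limit, separate from the local/global param-resolution question the comments debate, is just a clamp. A minimal sketch, assuming a negative max means "no limit configured" (the names here are illustrative, not the patch's):

```java
public class ParamLimits {
    // Returns the effective start/rows value: the requested one,
    // capped at max when a max (e.g. rows.max / start.max) is configured.
    public static int clamp(int requested, int max) {
        if (max < 0) return requested; // no limit configured
        return Math.min(requested, max);
    }
}
```

With rows.max=100 set as an invariant, a request for rows=999 would silently be served 100 rows instead of erroring out.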
[jira] Updated: (SOLR-397) options for dealing with range endpoints in date facets
[ https://issues.apache.org/jira/browse/SOLR-397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-397: -- Description: Date faceting should support configuration for controlling how edge boundaries are dealt with. (was: as discussed in email... http://www.nabble.com/Re%3A-Date-facetting-and-ranges-overlapping-p12928374.html : I'm now using date facetting to browse events. It works really fine : and is really useful. The only problem so far is that if I have an : event which is exactly on the boundary of two ranges, it is referenced : 2 times. yeah, this is one of the big caveats with date faceting right now ... i struggled with this a bit when designing it, and ultimately decided to punt on the issue. the biggest hangup was that even if the facet counting code was smart about making sure the ranges don't overlap, the range query syntax in the QueryParser doesn't support ranges that exclude one input (so there wouldn't be a lot you can do with the ranges once you know the counts in them) one idea i had in SOLR-258 was that we could add an interval option that would define how much to add to the end of one range to get the start of another range (think of the current implementation having interval hardcoded to 0) which would solve the problem and work with range queries that were inclusive of both endpoints, but would require people to use -1MILLI a lot. a better option (assuming a query parser change) would be a new option that says whether each computed range should be inclusive of the low point, the high point, both end points, neither end points, or be smart (where smart is the same as low except for the last range where it includes both) (I think there's already a lucene issue to add the query parser support, i just haven't had time to look at it) The simple workaround: if you know all of your data is indexed with perfect 0.000second precision, then put -1MILLI at the end of your start and end date faceting params. 
) (initial issue description moved to comment) 
options for dealing with range endpoints in date facets --- Key: SOLR-397 URL: https://issues.apache.org/jira/browse/SOLR-397 Project: Solr Issue Type: Improvement Reporter: Hoss Man Date faceting should support configuration for controlling how edge boundaries are dealt with. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-397) options for dealing with range endpoints in date facets
[ https://issues.apache.org/jira/browse/SOLR-397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12834597#action_12834597 ] Hoss Man commented on SOLR-397: --- Additional idea that i like much better than the interval idea i had a while back, transcribed from email so it's not lost to the ages... I think the semantics that might make the most sense is to add a multivalued facet.date.include param that supports the following options: all, lower, upper, edge, outer - all is shorthand for lower,upper,edge,outer and is the default (for back compat) - if lower is specified, then all ranges include their lower bound - if upper is specified, then all ranges include their upper bound - if edge is specified, then the first and last ranges include their edge bounds (ie: lower for the first one, upper for the last one) even if the corresponding upper/lower option is not specified. - the between count is inclusive of each of the start and end bounds iff the first and last range are inclusive of them - the before and after ranges are inclusive of their respective bounds if: -* outer is specified ... OR ... -* the first and last ranges don't already include them so assuming you started with something like (specific dates and durations shortened for readability)... {{facet.date.start=1 facet.date.end=3 facet.date.gap=+1 facet.date.other=all}} ...your ranges would be... {{[1 TO 2], [2 TO 3] and [* TO 1], [1 TO 3], [3 TO *]}} The following params would change the ranges in the following ways... {code} w/ facet.date.include=lower ... [1 TO 2}, [2 TO 3} and [* TO 1}, [1 TO 3}, [3 TO *] w/ facet.date.include=upper ... {1 TO 2], {2 TO 3] and [* TO 1], {1 TO 3], {3 TO *] w/ facet.date.include=lower&facet.date.include=edge ... [1 TO 2}, [2 TO 3] and [* TO 1}, [1 TO 3], {3 TO *] w/ facet.date.include=upper&facet.date.include=edge ... [1 TO 2], {2 TO 3] and [* TO 1}, [1 TO 3], {3 TO *] w/ facet.date.include=upper&facet.date.include=outer ... 
{1 TO 2], {2 TO 3] and [* TO 1], {1 TO 3], [3 TO *] ...etc. {code} initial proposal: http://old.nabble.com/RE%3A-Date-Facet-duplicate-counts-p27331578.html options for dealing with range endpoints in date facets --- Key: SOLR-397 URL: https://issues.apache.org/jira/browse/SOLR-397 Project: Solr Issue Type: Improvement Reporter: Hoss Man Date faceting should support configuration for controlling how edge boundaries are dealt with. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
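The bracket notation in the tables above ('[' / ']' for an included endpoint, '{' / '}' for an excluded one) can be rendered mechanically from the include options. A toy sketch covering only the lower/upper/all cases -- the edge and outer rules, which depend on a range's position in the sequence, are deliberately omitted:

```java
import java.util.Set;

public class FacetRange {
    // Render one "[lo TO hi]" range, using '{' / '}' when the corresponding
    // endpoint is excluded, per the facet.date.include=lower/upper proposal.
    public static String render(int lo, int hi, Set<String> include) {
        boolean all = include.contains("all");
        char open  = all || include.contains("lower") ? '[' : '{';
        char close = all || include.contains("upper") ? ']' : '}';
        return open + (lo + " TO " + hi) + close;
    }
}
```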
[jira] Created: (SOLR-1776) dismax should treate schema's default field as a default qf
dismax should treate schema's default field as a default qf --- Key: SOLR-1776 URL: https://issues.apache.org/jira/browse/SOLR-1776 Project: Solr Issue Type: Improvement Reporter: Hoss Man the DismaxQParser is completely useless w/o specifying the qf param, but for the life of me i can't think of any good reason why it shouldn't use IndexSchema.getDefaultSearchFieldName() as the default value for the qf param. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (SOLR-1579) CLONE -stats.jsp XML escaping
[ https://issues.apache.org/jira/browse/SOLR-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-1579. Resolution: Fixed Assignee: Hoss Man (was: Erik Hatcher) I fully expect stats.jsp will be deprecated in the next release of Solr in favor of the handler in SOLR-1750 -- BUT -- I still can't believe such an annoying and yet trivial to fix bug was around for so long ... especially since the incorrect fix for the XML attribute escaping is only half the problem: escapeCharData is still needed for the XML Element content escaping. David: thanks for your prodding on this ... i committed your patch plus some additional fixes (r909705) CLONE -stats.jsp XML escaping - Key: SOLR-1579 URL: https://issues.apache.org/jira/browse/SOLR-1579 Project: Solr Issue Type: Bug Components: web gui Affects Versions: 1.4 Reporter: David Bowen Assignee: Hoss Man Fix For: 1.5 Attachments: SOLR-1579.patch Original Estimate: 1h Remaining Estimate: 1h The fix to SOLR-1008 was wrong. It used chardata escaping for a value that is an attribute value. I.e. instead of XML.escapeCharData it should call XML.escapeAttributeValue. Otherwise, any query used as a key in the filter cache whose printed representation contains a double-quote character causes invalid XML to be generated. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1008) stats.jsp XML escaping
[ https://issues.apache.org/jira/browse/SOLR-1008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-1008: --- Fix Version/s: (was: 1.4) 1.5 Note: the fix included in Solr 1.4 was not actually correct, revising version info accordingly. see SOLR-1579 for details stats.jsp XML escaping -- Key: SOLR-1008 URL: https://issues.apache.org/jira/browse/SOLR-1008 Project: Solr Issue Type: Bug Components: web gui Reporter: Erik Hatcher Assignee: Erik Hatcher Fix For: 1.5 Attachments: SOLR-1008.patch Original Estimate: 1h Remaining Estimate: 1h stats.jsp gave this error: Line Number 1327, Column 48:stat name=item_attrFacet_Size__Shape stat names are not XML escaped. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
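The distinction at the heart of SOLR-1579 and SOLR-1008 is that attribute values must escape double quotes on top of the usual character-data escaping; using the char-data variant inside an attribute produces invalid XML as soon as a cache key contains a quote. A simplified illustration -- this is not Solr's actual XML utility class, just the minimal escaping rules:

```java
public class XmlEscape {
    // Escaping sufficient for element (character data) content.
    public static String escapeCharData(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Attribute values additionally need the quote character escaped,
    // since they are emitted inside name="...".
    public static String escapeAttributeValue(String s) {
        return escapeCharData(s).replace("\"", "&quot;");
    }
}
```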
[jira] Resolved: (SOLR-1771) StringIndexDocValues should provide a better error message when getStringIndex fails
[ https://issues.apache.org/jira/browse/SOLR-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-1771. Resolution: Fixed Assignee: Hoss Man I'm not convinced that the wording of the new error message is all that great, but it's vastly better than the previous behavior... Committed revision 909746. Note that this affected numerous different classes: OrdFieldSource, all the Sortable*Field classes, DateField, and StrField (anyone instantiating an instance of StringIndexDocValues). StringIndexDocValues should provide a better error message when getStringIndex fails Key: SOLR-1771 URL: https://issues.apache.org/jira/browse/SOLR-1771 Project: Solr Issue Type: Bug Components: search Reporter: Hoss Man Assignee: Hoss Man Fix For: 1.5 if someone attempts to use an OrdFieldSource on a field that is tokenized, FieldCache.getStringIndex throws a confusing RuntimeException that StringIndexDocValues propagates. we should wrap that exception in something more helpful... http://old.nabble.com/sorting-td27544348.html -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (SOLR-1772) UpdateProcessor to prune empty values
UpdateProcessor to prune empty values --- Key: SOLR-1772 URL: https://issues.apache.org/jira/browse/SOLR-1772 Project: Solr Issue Type: Wish Reporter: Hoss Man Users seem to frequently get confused when some FieldTypes (typically the numeric ones) complain about invalid field values when they inadvertently index an empty string. It would be cool to provide an UpdateProcessor that makes it easy to strip out any fields being added as empty values ... it could be configured using field (and/or field type) names or globs to select/ignore certain fields -- i haven't thought it through all that hard -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
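The core logic of such a processor is small. This sketch operates on a plain Map rather than a real SolrInputDocument, and omits the field-name/glob selection the issue wishes for; the class name is hypothetical:

```java
import java.util.Iterator;
import java.util.Map;

public class PruneEmpty {
    // Remove fields whose value is null or an empty/blank string, so that
    // numeric FieldTypes downstream never see "" and fail to parse it.
    public static void prune(Map<String, Object> doc) {
        Iterator<Map.Entry<String, Object>> it = doc.entrySet().iterator();
        while (it.hasNext()) {
            Object v = it.next().getValue();
            if (v == null || (v instanceof String && ((String) v).trim().isEmpty())) {
                it.remove();
            }
        }
    }
}
```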
[jira] Commented: (SOLR-534) Return all query results with parameter rows=-1
[ https://issues.apache.org/jira/browse/SOLR-534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12832333#action_12832333 ] Hoss Man commented on SOLR-534: --- bq. But if you use the REALLY_BIG_NUMBER approach, the same bad programmer who never thought he would get back more than a 1000 records will never check whether the result set contains more than 1000 records either. If we're going to assume the programmer doesn't check the actual number found, then why assume that the programmer pays attention to anything in the response at all? If you think it's likely that programmers will write code that only looks at the docList to iterate over all the docs in a response and doesn't notice that the numFound at the top of the docList is higher than the number asked for, then why do you assume that same programmer would be smart enough to check if an error message is returned when they ask for all rows and Solr can't provide them? Bottom line: we can't protect programmers from all possible forms of stupidity, but we can make them be explicit about exactly what they want -- if they want 100, they ask for 100; if they want 1 they ask for 1, if they want all they have to specify how big they think all is. bq. Solr sure as heck better be checking this already--you never know when you'll run into bizarre low memory conditions; allocations should ALWAYS be checked for. This isn't as easy as it may sound in Java ... the APIs available to test for the amount of memory available are limited, and even if the JVM has the resources to allocate a 10,000,000 item PriorityQueue when computing the results, that doesn't mean doing so won't eat up all the available RAM, causing some later (extremely tiny) allocation to trigger an OOM --- but if you've got a suggestion to help prevent OOM in situations like this, by all means patches welcome. 
Return all query results with parameter rows=-1 --- Key: SOLR-534 URL: https://issues.apache.org/jira/browse/SOLR-534 Project: Solr Issue Type: New Feature Components: search Affects Versions: 1.3 Environment: Tomcat 5.5 Reporter: Lars Kotthoff Priority: Minor Attachments: solr-all-results.patch The searcher should return all results matching a query when the parameter rows=-1 is given. I know that it is a bad idea to do this in general, but as it explicitly requires a special parameter, people using this feature will be aware of what they are doing. The main use case for this feature is probably debugging, but in some cases one might actually need to retrieve all results because they e.g. are to be merged with results from different sources. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
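The check Hoss argues clients should be making -- comparing numFound against the window actually asked for -- is a one-line comparison. A minimal sketch with hypothetical names (not part of SolrJ):

```java
public class PageCheck {
    // True iff matches exist beyond the returned window, i.e. the client
    // asked for fewer rows than matched: numFound > start + returned.
    public static boolean truncated(long numFound, int start, int returned) {
        return numFound > (long) start + returned;
    }
}
```

A client that requests an explicit rows value and then checks truncated(...) knows exactly when "all" was bigger than it assumed.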
[jira] Updated: (SOLR-1765) HTTP Caching related headers are incorrect for distributed searches
[ https://issues.apache.org/jira/browse/SOLR-1765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-1765: --- Component/s: (was: multicore) (was: search) Description: When searching across multiple shards with HTTP caching enabled, the Caching related headers (ETag, Cache-Control, Last-Modified) in the response are based on the index of the coordinating solr core, and are not influenced by the properties of the shards. For example, take the query http://localhost:8983/solr/core1/select/?q=google&shards=localhost:8983/solr/core2,localhost:8983/solr/core3 ETag should be calculated off of core2 and core3, instead it's being calculated from the index of core1. This results in index modifications to core2 or core3 being invisible to clients which query this URL using If-None-Match or If-Modified-Since type requests was: When searching across multiple shards with HTTP caching enabled, the ETag value in the response is only using the searcher in the original request, not the shards. For example, take the query http://localhost:8983/solr/core1/select/?q=google&shards=localhost:8983/solr/core2,localhost:8983/solr/core3 ETag should be calculated off of core2 and core3, instead it's being calculated from core1. Summary: HTTP Caching related headers are incorrect for distributed searches (was: ETag calculation is incorrect for distributed searches) HTTP Caching related headers are incorrect for distributed searches --- Key: SOLR-1765 URL: https://issues.apache.org/jira/browse/SOLR-1765 Project: Solr Issue Type: Bug Affects Versions: 1.4 Reporter: Charlie Jackson Priority: Minor When searching across multiple shards with HTTP caching enabled, the Caching related headers (ETag, Cache-Control, Last-Modified) in the response are based on the index of the coordinating solr core, and are not influenced by the properties of the shards. 
For example, take the query http://localhost:8983/solr/core1/select/?q=google&shards=localhost:8983/solr/core2,localhost:8983/solr/core3 ETag should be calculated off of core2 and core3, instead it's being calculated from the index of core1. This results in index modifications to core2 or core3 being invisible to clients which query this URL using If-None-Match or If-Modified-Since type requests -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
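One way a fix might fold every shard's state into the validator is to hash all the shard index versions together, so that a change in any shard produces a different ETag and defeats a stale If-None-Match. The hashing scheme and names below are purely illustrative, not what Solr does:

```java
public class DistribEtag {
    // Combine the index version of every shard into one opaque ETag value.
    // Any single shard changing its index version changes the result.
    public static String etag(long... shardIndexVersions) {
        long h = 1125899906842597L; // arbitrary non-zero seed
        for (long v : shardIndexVersions) {
            h = 31 * h + v;
        }
        return '"' + Long.toHexString(h) + '"';
    }
}
```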
[jira] Resolved: (SOLR-1742) uniqueKey must be string type otherwise missing core name in path error is rendered
[ https://issues.apache.org/jira/browse/SOLR-1742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-1742. Resolution: Duplicate the string dependency for QueryElevationComponent does seem to be the root of the problem -- i think we should try to get to the bottom of why such a confusing error message is reported, but we've already got SOLR-1743 to track that. uniqueKey must be string type otherwise missing core name in path error is rendered - Key: SOLR-1742 URL: https://issues.apache.org/jira/browse/SOLR-1742 Project: Solr Issue Type: Bug Components: Build Environment: all Reporter: Marcin Fix For: 1.5 How to replicate: - create index with schema where your uniqueKey is integer - set your unique key type to integer - deploy your index under http://host:8080/solr/admin/ - you will get missing core name in path Workaround: - change type of your uniqueKey to string - undeploy and deploy index It's quite confusing as 1.5 is not properly reporting errors and you need to be lucky to find that reason on your own. cheers, /Marcin -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1743) error reporting is rendering 404 missing core name in path for all type of errors
[ https://issues.apache.org/jira/browse/SOLR-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12829747#action_12829747 ] Hoss Man commented on SOLR-1743: We definitely shouldn't be generating a missing core name in path for situations like misconfiguration in a single core setup. In the trunk, things like attempting to load a RequestHandler class that can't be found correctly result in a Severe errors in solr configuration. type message in the browser, which then shows the stack trace of the problem. However: something as simple as a typo like this... {code} Index: example/solr/conf/schema.xml === --- example/solr/conf/schema.xml(revision 906596) +++ example/solr/conf/schema.xml(working copy) @@ -456,7 +456,7 @@ when adding a document. -- - <field name="id" type="string" indexed="true" stored="true" required="true" /> + <field name="id" type="asdfasdf" indexed="true" stored="true" required="true" /> <field name="sku" type="textTight" indexed="true" stored="true" omitNorms="true"/> <field name="name" type="textgen" indexed="true" stored="true"/> <field name="alphaNameSort" type="alphaOnlySort" indexed="true" stored="false"/> {code} ...results in http://localhost:8983/solr/admin/ generating the missing core name in path error described, with no other context. In Solr 1.4, this same type of error would have generated a Severe errors in solr configuration. type message (w/ stack trace) so this definitely seems like a new bug in IndexSchema config error handling introduced in the trunk since Solr 1.4 error reporting is rendering 404 missing core name in path for all type of errors --- Key: SOLR-1743 URL: https://issues.apache.org/jira/browse/SOLR-1743 Project: Solr Issue Type: Bug Components: Build Environment: all Reporter: Marcin Fix For: 1.5 despite the error in schema syntax or any other type of error you will always get the 404 missing core name in path message. cheers, /Marcin -- This message is automatically generated by JIRA. 
- You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1754) Legacy numeric types do not check input for bad syntax
[ https://issues.apache.org/jira/browse/SOLR-1754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12829769#action_12829769 ] Hoss Man commented on SOLR-1754: The reason we never explicitly tested the input value was for speed -- if the user says it's an int we trust them. The only places any FieldTypes explicitly validate the input strings (ie: SortableIntField, DateField, etc..) are when they get it for free as a side effect of conversion (in DateField's case: even though we index the raw string, we have to parse it anyway looking for DateMath). Is there really any memory efficiency from IntField that can't be achieved with an appropriate precisionStep on TrieIntField? Legacy numeric types do not check input for bad syntax -- Key: SOLR-1754 URL: https://issues.apache.org/jira/browse/SOLR-1754 Project: Solr Issue Type: Bug Affects Versions: 1.4 Reporter: Lance Norskog Fix For: 1.5 The legacy numeric types do not check their input values for valid input. A text string is accepted as input for any of these types: IntField, LongField, FloatField, DoubleField. DateField checks its input. In general this is a no-fix, except that IntField is a necessary memory type because it cuts memory use in sorting. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
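Hoss's precisionStep question can be illustrated with a schema fragment (a sketch only; the type and field names here are made up for illustration): a TrieIntField with precisionStep="0" indexes a single term per value, which is the closest analogue to the legacy IntField's footprint while still validating that the input parses as an integer.
{code}
<!-- Sketch (hypothetical names): precisionStep="0" indexes one term per
     value, approximating legacy IntField while rejecting bad input. -->
<fieldType name="tint" class="solr.TrieIntField" precisionStep="0"
           omitNorms="true" positionIncrementGap="0"/>
<field name="popularity" type="tint" indexed="true" stored="true"/>
{code}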
[jira] Commented: (SOLR-1750) SystemStatsRequestHandler - replacement for stats.jsp
[ https://issues.apache.org/jira/browse/SOLR-1750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12829773#action_12829773 ] Hoss Man commented on SOLR-1750: bq. Any thoughts on the naming of this beast? SystemInfoHandler sounds good. This would probably also be a good time to retire registry.jsp ... all we need to do is add a few more pieces of system info to this handler (and add some param options to disable the stats part of the output) bq. Also, food for thought, when (hopefully not if) the VelocityResponseWriter is moved into core, we can deprecate stats.jsp and skin the output of this request handler for a similar pleasant view like stats.jsp+client-side xsl does now. Even if/when VelocityResponseWriter is in the core, i'd still rather just rely on client side XSLT for this to reduce the number of things that could potentially get misconfigured and then confuse people why the page doesn't look right ... the XmlResponseWriter has always supported a stylesheet param that (while not generally useful to most people) lets you easily reference any style sheet that can be served out of the admin directory ... all we really need is an updated .xsl file to translate the standard XML format into the old style stats view. SystemStatsRequestHandler - replacement for stats.jsp - Key: SOLR-1750 URL: https://issues.apache.org/jira/browse/SOLR-1750 Project: Solr Issue Type: Improvement Components: web gui Reporter: Erik Hatcher Assignee: Erik Hatcher Priority: Trivial Fix For: 1.5 Attachments: SystemStatsRequestHandler.java stats.jsp is cool and all, but suffers from escaping issues, and also is not accessible from SolrJ or other standard Solr APIs. Here's a request handler that emits everything stats.jsp does. For now, it needs to be registered in solrconfig.xml like this:
{code}
<requestHandler name="/admin/stats" class="solr.SystemStatsRequestHandler" />
{code}
But will register this in AdminHandlers automatically before committing. 
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1750) SystemStatsRequestHandler - replacement for stats.jsp
[ https://issues.apache.org/jira/browse/SOLR-1750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-1750: --- Attachment: SystemStatsRequestHandler.java Some updates to Erik's previous version... # adds everything from registry.jsp #* lucene/solr version info #* source/docs info for each object # forcibly disable HTTP Caching # adds params to control which objects are listed #* (multivalued) cat param restricts category names (default is all) #* (multivalued) key param restricts object keys (default is all) # adds (boolean) stats param to control if stats are output for each object #* per-field style override can be used to override per object key # refactored the old nested looping that stats.jsp did over every object and every category into a single pass # switch all HashMaps to NamedLists or SimpleOrderedMaps to preserve predictable ordering Examples... * {{?cat=CACHE}} ** return info about caches, but nothing else (stats disabled by default) * {{?stats=true&cat=CACHE}} ** return info and stats about caches, but nothing else * {{?stats=true&f.fieldCache.stats=false}} ** Info about everything, stats for everything except fieldCache * {{?key=fieldCache&stats=true}} ** Return info and stats for fieldCache, but nothing else I left the class name alone, but i vote for SystemInfoRequestHandler with a default registration of /admin/info SystemStatsRequestHandler - replacement for stats.jsp - Key: SOLR-1750 URL: https://issues.apache.org/jira/browse/SOLR-1750 Project: Solr Issue Type: Improvement Components: web gui Reporter: Erik Hatcher Assignee: Erik Hatcher Priority: Trivial Fix For: 1.5 Attachments: SystemStatsRequestHandler.java, SystemStatsRequestHandler.java stats.jsp is cool and all, but suffers from escaping issues, and also is not accessible from SolrJ or other standard Solr APIs. Here's a request handler that emits everything stats.jsp does. 
For now, it needs to be registered in solrconfig.xml like this:
{code}
<requestHandler name="/admin/stats" class="solr.SystemStatsRequestHandler" />
{code}
But will register this in AdminHandlers automatically before committing. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1754) Legacy numeric types do not check input for bad syntax
[ https://issues.apache.org/jira/browse/SOLR-1754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12829902#action_12829902 ] Hoss Man commented on SOLR-1754: The second array you are talking about only exists if you use the StringIndex based FieldCache. The TrieField subclasses all use the raw primitive FieldCache types, they just use a special parser to decode the Trie value into the raw primitive value ... take a look at o.a.s.schema.TrieField.getSortField. If you look at stats.jsp you can see which FieldCaches are loaded for each field, and verify that all the TrieIntFields you sort on are using a primitive int[], and not a StringIndex. Legacy numeric types do not check input for bad syntax -- Key: SOLR-1754 URL: https://issues.apache.org/jira/browse/SOLR-1754 Project: Solr Issue Type: Bug Affects Versions: 1.4 Reporter: Lance Norskog Fix For: 1.5 The legacy numeric types do not check their input values for valid input. A text string is accepted as input for any of these types: IntField, LongField, FloatField, DoubleField. DateField checks its input. In general this is a no-fix, except that IntField is a necessary memory type because it cuts memory use in sorting. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1677) Add support for o.a.lucene.util.Version for BaseTokenizerFactory and BaseTokenFilterFactory
[ https://issues.apache.org/jira/browse/SOLR-1677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12828835#action_12828835 ] Hoss Man commented on SOLR-1677: bq. I guess I could care less what the default is, if you care about such things you shouldn't be using the defaults and instead specifying this yourself in the schema, and Version has no effect. ...which is all well and good, but it just re-iterates the need for really good documentation about what is impacted by changing a global Version setting -- otherwise users might be depending on a default behavior that is going to change when Version is bumped, and they may not even realize it. Bear in mind: these are just the nuances that people need to worry about when considering a switch from 2.4 to 2.9 to 3.0 ... there will likely be a lot more of these over time. And just to be as crystal clear as i possibly can: * my concern is purely about how to document this stuff. * i do in fact agree that a global luceneVersionMatch option is a good idea Add support for o.a.lucene.util.Version for BaseTokenizerFactory and BaseTokenFilterFactory --- Key: SOLR-1677 URL: https://issues.apache.org/jira/browse/SOLR-1677 Project: Solr Issue Type: Sub-task Components: Schema and Analysis Reporter: Uwe Schindler Attachments: SOLR-1677.patch, SOLR-1677.patch, SOLR-1677.patch, SOLR-1677.patch Since Lucene 2.9, a lot of analyzers use a Version constant to keep backwards compatibility with old indexes created using older versions of Lucene. The most important example is StandardTokenizer, which changed its behaviour with posIncr and incorrect host token types in 2.4 and also in 2.9. In Lucene 3.0 this matchVersion ctor parameter is mandatory and in 3.1, with much more Unicode support, almost every Tokenizer/TokenFilter needs this Version parameter. In 2.9, the deprecated old ctors without Version take LUCENE_24 as default to mimic the old behaviour, e.g. in StandardTokenizer. 
This patch adds basic support for the Lucene Version property to the base factories. Subclasses then can use the luceneMatchVersion decoded enum (in 3.0) / Parameter (in 2.9) for constructing Tokenstreams. The code currently contains a helper map to decode the version strings, but in 3.0 it can be replaced by Version.valueOf(String), as the Version is a subclass of Java5 enums. The default value is Version.LUCENE_24 (as this is the default for the no-version ctors in Lucene). This patch also removes unneeded conversions to CharArraySet from StopFilterFactory (now done by Lucene since 2.9). The generics are also fixed to match Lucene 3.0. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
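Hypothetically, with the per-factory Version support this patch describes (the attribute name and value format here are assumptions based on the description above, not the committed syntax), a schema could pin analysis behaviour like so:
{code}
<!-- Sketch: pinning factories to Lucene 2.4 behaviour via the
     luceneMatchVersion property described in this patch. -->
<fieldType name="text" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory" luceneMatchVersion="LUCENE_24"/>
    <filter class="solr.StopFilterFactory" luceneMatchVersion="LUCENE_24"
            words="stopwords.txt" ignoreCase="true"/>
  </analyzer>
</fieldType>
{code}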
[jira] Commented: (SOLR-1718) Carriage return should submit query admin form
[ https://issues.apache.org/jira/browse/SOLR-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12828842#action_12828842 ] Hoss Man commented on SOLR-1718: bq. Consider the JIRA interface we are using to comment on this issue. Sure, but that's an {{<input type="text" />}}, not a {{<textarea />}} ... the expected semantics are completely different. With an {{<input type="text" />}} box the browser already takes care of submitting the form if you hit Enter (and FWIW: most browsers i know of also submit forms if you use Shift-Enter in a {{<textarea />}}) It sounds like what you are really suggesting is that we change the /admin/index.jsp form to use an {{<input type="text" />}} instead of a {{<textarea />}} for the q param, and not that we add special (javascript) logic to the form to submit if someone presses Enter inside the existing {{<textarea />}} ... which i have a lot less objection to than going out of our way to violate standard form convention. Carriage return should submit query admin form -- Key: SOLR-1718 URL: https://issues.apache.org/jira/browse/SOLR-1718 Project: Solr Issue Type: Improvement Components: web gui Affects Versions: 1.4 Reporter: David Smiley Priority: Minor Hitting the carriage return on the keyboard should submit the search query on the admin front screen. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
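The change Hoss describes amounts to something like this (a sketch only; the admin page's real markup and action URL differ):
{code}
<form method="get" action="select">
  <!-- a single-line text input submits on Enter natively, no JavaScript -->
  <input type="text" name="q" value="*:*"/>
  <input type="submit" value="Search"/>
</form>
{code}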
[jira] Commented: (SOLR-1729) Date Facet now override time parameter
[ https://issues.apache.org/jira/browse/SOLR-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12828846#action_12828846 ] Hoss Man commented on SOLR-1729: Peter: I think you may have misconstrued my comments -- they were not criticisms of your patch, they were a clarification of why the functionality you are proposing is important. bq. Can you point me toward the class(es) where filter queries' date math lives it's all handled internally by DateField, at which point it has no notion of the request -- I believe this is why yonik suggested using a ThreadLocal variable to track a consistent NOW that any method anywhere in Solr could use (if set) for the current request ... then we just need something like SolrCore to set it on each request (or accept it as a param if specified) bq. As filter queries are cached separately, can you think of any potential caching issues relating to filter queries? The cache keys for things like that are the Query objects themselves, and at that point the DateMath strings (including NOW) have already been resolved into real time values so that shouldn't be an issue. Date Facet now override time parameter -- Key: SOLR-1729 URL: https://issues.apache.org/jira/browse/SOLR-1729 Project: Solr Issue Type: Improvement Components: search Affects Versions: 1.4 Environment: Solr 1.4 Reporter: Peter Sturge Priority: Minor Attachments: FacetParams.java, SimpleFacets.java This PATCH introduces a new query parameter that tells a (typically, but not necessarily) remote server what time to use as 'NOW' when calculating date facets for a query (and, for the moment, date facets *only*) - overriding the default behaviour of using the local server's current time. This gets 'round a problem whereby an explicit time range is specified in a query (e.g. timestamp:[then0 TO then1]), and date facets are required for the given time range (in fact, any explicit time range). 
Because DateMathParser performs all its calculations from 'NOW', remote callers have to work out how long ago 'then0' and 'then1' are from 'now', and use the relative-to-now values in the facet.date.xxx parameters. If a remote server has a different opinion of NOW compared to the caller, the results will be skewed (e.g. they are in a different time-zone, not time-synced etc.). This becomes particularly salient when performing distributed date faceting (see SOLR-1709), where multiple shards may all be running with different times, and the faceting needs to be aligned. The new parameter is called 'facet.date.now', and takes as a parameter a (stringified) long that is the number of milliseconds from the epoch (1 Jan 1970 00:00) - i.e. the returned value from a System.currentTimeMillis() call. This was chosen over a formatted date to delineate it from a 'searchable' time and to avoid superfluous date parsing. This makes the value generally a programmatically-set value, but as that is where the use-case is for this type of parameter, this should be ok. NOTE: This parameter affects date facet timing only. If there are other areas of a query that rely on 'NOW', these will not interpret this value. This is a broader issue about setting a 'query-global' NOW that all parts of query analysis can share. Source files affected: FacetParams.java (holds the new constant FACET_DATE_NOW) SimpleFacets.java getFacetDateCounts() NOW parameter modified This PATCH is mildly related to SOLR-1709 (Distributed Date Faceting), but as it's a general change for date faceting, it was deemed deserving of its own patch. I will be updating SOLR-1709 in due course to include the use of this new parameter, after some rfc acceptance. A possible enhancement to this is to detect facet.date fields, look for and match these fields in queries (if they exist), and potentially determine automatically the required time skew, if any. 
There are a whole host of reasons why this could be problematic to implement, so an explicit facet.date.now parameter is the safest route. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
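The client side of the facet.date.now parameter described above can be sketched as follows (illustrative only; FacetNowExample and its params map are made up for this example, not Solr or SolrJ API):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch (not a Solr API): a caller pins 'NOW' for date
// faceting by sending the same epoch-millis value to every shard via the
// proposed facet.date.now parameter.
public class FacetNowExample {

    static Map<String, String> buildParams(long nowMillis) {
        Map<String, String> p = new LinkedHashMap<String, String>();
        p.put("q", "timestamp:[NOW-1DAY TO NOW]");
        p.put("facet", "true");
        p.put("facet.date", "timestamp");
        p.put("facet.date.start", "NOW-1DAY");
        p.put("facet.date.end", "NOW");
        p.put("facet.date.gap", "+1HOUR");
        // Stringified long: milliseconds since the epoch, per the patch.
        p.put("facet.date.now", Long.toString(nowMillis));
        return p;
    }

    public static void main(String[] args) {
        // The same value would be sent to every shard in a distributed request.
        Map<String, String> params = buildParams(System.currentTimeMillis());
        System.out.println(params);
    }
}
```

Because every shard resolves NOW-relative date math from the same supplied value, the facet buckets line up regardless of each shard's local clock.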
[jira] Created: (SOLR-1749) debug output should include explanation of what input strings were passed to the analzyers for each field
debug output should include explanation of what input strings were passed to the analzyers for each field - Key: SOLR-1749 URL: https://issues.apache.org/jira/browse/SOLR-1749 Project: Solr Issue Type: Wish Components: search Reporter: Hoss Man Users are frequently confused by the interplay between Query Parsing and Query Time Analysis (ie: markup meta-characters like whitespace and quotes, multi-word synonyms, Shingles, etc...) It would be nice if we had more debugging output available that would help eliminate this confusion. The ideal API that comes to mind would be to include in the debug output of SearchHandler a list of every string that was Analyzed, and what list of field names it was analyzed against. This info would not only make it clear to users what exactly they should cut/paste into the analysis.jsp tool to see how their Analyzer is getting used, but also what exactly is being done to their input strings prior to their Analyzer being used. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1749) debug output should include explanation of what input strings were passed to the analzyers for each field
[ https://issues.apache.org/jira/browse/SOLR-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1282#action_1282 ] Hoss Man commented on SOLR-1749: This is an idea that's been rolling around in my head for a while, and today I thought i'd spend some time experimenting with it. It seemed like the main implementation challenge would be that by the time you are deep enough down in the code to be using an Analyzer, you don't have access to the SolrQueryRequest to record the debugging info. I thought of two potential solutions... * Use ThreadLocal to track the debugging info if needed * Use Proxy Wrapper classes to record the debugging info if needed I initially figured that writing proxy classes for SolrQueryRequest, IndexSchema, and Analyzer would be relatively straightforward, so i started down that path and discovered two annoying problems... # IndexSchema is currently final # not all code paths use IndexSchema.getQueryAnalyzer(), many fetch the FieldTypes and ask them for their Analyzer directly. The second problem isn't insurmountable, but it complicates things in that it would require Proxy wrappers for FieldType as well. The first problem requires a simple change, but carries with it some baggage that i wasn't ready to embrace. In both cases i started to be very bothered by the long term maintenance something like this would introduce. It would be very easy to write these Proxy classes that extend IndexSchema, FieldType, and Analyzer but it would be just as easy to forget to add the appropriate Proxy methods to them down the road when new methods are added to those base classes. 
The issue with the FieldType also exposed a flaw in the idea of using ThreadLocal: if we only had to worry about IndexSearcher.getQueryAnalyzer(), we could modify it to check ThreadLocal easily enough, but at the FieldType level we would only be able to modify FieldTypes that ship with Solr, and we'd be missing any plugin FieldTypes, So i aborted the experiment but i figured i should post the feature idea, and my existing thoughts, here in case anyone had other suggestions on how it could be implemented feasibly. debug output should include explanation of what input strings were passed to the analzyers for each field - Key: SOLR-1749 URL: https://issues.apache.org/jira/browse/SOLR-1749 Project: Solr Issue Type: Wish Components: search Reporter: Hoss Man Users are frequently confused by the interplay between Query Parsing and Query Time Analysis (ie: markup meta-characters like whitespace and quotes, multi-word synonyms, Shingles, etc...) It would be nice if we had more debugging output available that would help eliminate this confusion. The ideal API that comes to mind would be to include in the debug output of SearchHandler a list of every string that was Analyzed, and what list of field names it was analyzed against. This info would not only make it clear to users what exactly they should cut/paste into the analysis.jsp tool to see how their Analyzer is getting used, but also what exactly is being done to their input strings prior to their Analyzer being used. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
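The ThreadLocal half of the experiment might look like this minimal sketch (class and method names are hypothetical, not Solr code); as Hoss notes, the unsolved part is getting every FieldType and plugin Analyzer call site to actually invoke it:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the ThreadLocal idea: code deep in query parsing
// records each string it hands to an Analyzer, and the request handler
// drains the list into the debug section of the response.
public class AnalysisDebugRecorder {

    private static final ThreadLocal<List<String>> ANALYZED =
        ThreadLocal.withInitial(ArrayList::new);

    // Would be called wherever an Analyzer is invoked, if debug is enabled.
    public static void record(String field, String input) {
        ANALYZED.get().add(field + ":" + input);
    }

    // Called once per request by the handler to build the debug output.
    public static List<String> drain() {
        List<String> out = new ArrayList<>(ANALYZED.get());
        ANALYZED.remove(); // avoid leaking state across pooled request threads
        return out;
    }
}
```

The remove() call matters in a servlet container, where worker threads are reused across requests.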
[jira] Resolved: (SOLR-1739) index of facet fields are not same as original string in record
[ https://issues.apache.org/jira/browse/SOLR-1739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-1739. Resolution: Not A Problem w/o knowing more details about your schema, this seems to be working exactly as expected. If you have a question about behavior you are seeing from solr, and are not 100% certain that it is a bug (bug == not functioning as documented) then you should post a question to the solr-user mailing list before opening a Jira issue. In a nutshell: faceting works based on the indexed values; if you need/want different constraints to be displayed for a facet field, then you should use a different analyzer. index of facet fields are not same as original string in record --- Key: SOLR-1739 URL: https://issues.apache.org/jira/browse/SOLR-1739 Project: Solr Issue Type: Bug Affects Versions: 1.4 Environment: Solr search engine is deployed in tomcat and running in windows OS. Reporter: Uma Maheswari Hi, I am new to Solr. I found facet fields do not reflect the original string in the record. For example, the returned xml is:
{code}
<doc>
  <str name="g_number">G-EUPE</str>
</doc>
<lst name="facet_counts">
  <lst name="facet_queries"/>
  <lst name="facet_fields">
    <lst name="g_number">
      <int name="gupe">1</int>
    </lst>
  </lst>
  <lst name="facet_dates"/>
</lst>
{code}
Here, G-EUPE is displayed under facet field as 'gupe' where it is not capital and missing '-' from the original string. Is there any way we could fix this to match the original text in record? Thanks in advance. Regards, uma -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
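One common way to apply Hoss's advice (a sketch only; the facet field name and copyField here are hypothetical additions to the reporter's schema): facet on an unanalyzed string copy of the field, so the displayed constraints match the original text.
{code}
<!-- schema.xml sketch: keep the analyzed field for searching, and facet on
     an unanalyzed copy so constraints show the original "G-EUPE" string. -->
<field name="g_number" type="text" indexed="true" stored="true"/>
<field name="g_number_facet" type="string" indexed="true" stored="false"/>
<copyField source="g_number" dest="g_number_facet"/>
{code}
Facet requests would then use facet.field=g_number_facet instead of the analyzed field.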
[jira] Commented: (SOLR-1603) Perl Response Writer
[ https://issues.apache.org/jira/browse/SOLR-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12806616#action_12806616 ] Hoss Man commented on SOLR-1603: {quote} the output is a complex Perl data structure with search results which would presumably immediately be assigned to a variable - not eval'd. Absolutely agree with Erik and Yonik - I can't think of a realistic case in which this would present a security risk. {quote} The only way (i know of) to utilize a string based representation of a data structure like this in perl is using eval to convert it from a string representation to the intended data structures... bq. I'm aware of the risk of eval'ing untrusted strings, but I'm not sure how this could be a problem with a Solr response. ...The issue is that if you have a network service whose output format is only useful when evaled by the client, then even if that service only ever produces serialized data (and not serialized code) it still opens the client up to man in the middle attacks where a malicious server can generate a response that _does_ include malicious code, and that code is executed by the client ... man in the middle attacks of something like XML that provide tainted data are bad enough, but the possibility of tainted code is really sketchy. As i said before: i'm not making any statements about this patch being more/less safe than any of the other existing response writers that are only useful when evaled in a particular language interpreter -- my point was that while I have never had any clear notion about how/when evaling strings from an external source was considered acceptable in those language communities (the example of python's literal_eval is a good one), I _am_ a heavy perl user, and i do know that the Perl community as a whole actively discourages using eval to deserialize perl from remote services -- this is precisely why things like YAML and the Storable API were created. 
Both have options to control how they should behave if/when code is encountered in the serialized data. I can see value in adding an output format designed to be trivially useful for perl, but i don't feel comfortable advertising something for Perl users that directly violates Perl best practices -- Particularly when we already have two writers that are fairly easy to use from perl anyway (XML and JSON) Perl Response Writer Key: SOLR-1603 URL: https://issues.apache.org/jira/browse/SOLR-1603 Project: Solr Issue Type: New Feature Components: Response Writers Reporter: Claudio Valente Priority: Minor Attachments: SOLR-1603.2.patch, SOLR-1603.patch I've made a patch that implements a Perl response writer for Solr. It's nan/inf and unicode aware. I don't know whether some fields can be binary but if so I can probably extend it to support that. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1677) Add support for o.a.lucene.util.Version for BaseTokenizerFactory and BaseTokenFilterFactory
[ https://issues.apache.org/jira/browse/SOLR-1677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12805167#action_12805167 ] Hoss Man commented on SOLR-1677: bq. And here are the JIRA issues for stemming bugs, since you didnt take my hint to go and actually read them. sigh. I read both those issues when you filed them, and I agreed with your assessment that they are bugs we should fix -- if i had thought you were wrong i would have said so in the issue comments. But that doesn't change the fact that sometimes people depend on buggy behavior -- and sometimes those people depend on the buggy behavior without even realizing it. Bug fixes in a stemmer might make it more correct according to the stemmer algorithm specification, or the language semantics, but in some peculiar use cases an application might find the correct implementation less useful than the previous buggy version. This is one reason why things like CHANGES.txt are important: to draw attention to what has changed between two versions of a piece of software, so people can make informed opinions about what they should test in their own applications when they upgrade things under the covers. luceneMatchVersion should be no different. We should try to find a simple way to inform people when you switch from luceneMatchVersion=X to luceneMatchVersion=Y here are the bug fixes you will get so they know what to test to determine if they are adversely affected by that bug fix in some way (and find their own work around) bq. Perhaps you should come up with a better example than stemming, as you don't know what you are talking about. 1) It's true, I frequently don't know what i'm talking about ... 
this issue was a prime example, and i thank you, Uwe, and Miller for helping me realize that i was completely wrong in my understanding about the intended purpose of o.a.l.Version, and that a global setting for it in Solr makes total sense -- But that doesn't make my concerns about documenting the effects of that global setting any less valid. 2) Perhaps you should read the StopFilter example i already posted in my last comment... {quote} bq. Robert mentioned in an earlier comment that StopFilter's position increment behavior changes depending on the luceneMatchVersion -- what if an existing Solr 1.3 user notices a bug in some Tokenizer, and adds {{<luceneMatchVersion>3.0</luceneMatchVersion>}} to his schema.xml to fix it. Without clear documentation on _everything_ that is affected when doing that, he may not realize that StopFilter changed at all -- and even though the position increment behavior may now be more correct, it might drastically change the results he gets when using dismax with a particular qs or ps value. Hence my point that this becomes a serious documentation concern: finding a way to make it clear to users what they need to consider when modifying luceneMatchVersion. {quote} Add support for o.a.lucene.util.Version for BaseTokenizerFactory and BaseTokenFilterFactory --- Key: SOLR-1677 URL: https://issues.apache.org/jira/browse/SOLR-1677 Project: Solr Issue Type: Sub-task Components: Schema and Analysis Reporter: Uwe Schindler Attachments: SOLR-1677.patch, SOLR-1677.patch, SOLR-1677.patch, SOLR-1677.patch Since Lucene 2.9, a lot of analyzers use a Version constant to keep backwards compatibility with old indexes created using older versions of Lucene. The most important example is StandardTokenizer, which changed its behaviour with posIncr and incorrect host token types in 2.4 and also in 2.9. 
In Lucene 3.0 this matchVersion ctor parameter is mandatory and in 3.1, with much more Unicode support, almost every Tokenizer/TokenFilter needs this Version parameter. In 2.9, the deprecated old ctors without Version take LUCENE_24 as default to mimic the old behaviour, e.g. in StandardTokenizer. This patch adds basic support for the Lucene Version property to the base factories. Subclasses then can use the luceneMatchVersion decoded enum (in 3.0) / Parameter (in 2.9) for constructing Tokenstreams. The code currently contains a helper map to decode the version strings, but in 3.0 it can be replaced by Version.valueOf(String), as the Version is a subclass of Java5 enums. The default value is Version.LUCENE_24 (as this is the default for the no-version ctors in Lucene). This patch also removes unneeded conversions to CharArraySet from StopFilterFactory (now done by Lucene since 2.9). The generics are also fixed to match Lucene 3.0. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1603) Perl Response Writer
[ https://issues.apache.org/jira/browse/SOLR-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12805186#action_12805186 ] Hoss Man commented on SOLR-1603: I realize this is analogous to the python, php, and ruby writers, but while i can't speak much to how those (language) communities feel about evaling code from remote sources to generate data structures, i know that the majority of the Perl community considers that a bad practice ... it's the reason things like YAML were created: to allow simple serialization w/o needing to execute untrusted code. So i'm a little leery about adding this (beyond my general leeryness of adding code w/o tests). Perl Response Writer Key: SOLR-1603 URL: https://issues.apache.org/jira/browse/SOLR-1603 Project: Solr Issue Type: New Feature Components: Response Writers Reporter: Claudio Valente Priority: Minor Attachments: SOLR-1603.patch I've made a patch that implements a Perl response writer for Solr. It's nan/inf and unicode aware. I don't know whether some fields can be binary but if so I can probably extend it to support that. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1718) Carriage return should submit query admin form
[ https://issues.apache.org/jira/browse/SOLR-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12805190#action_12805190 ] Hoss Man commented on SOLR-1718: I don't understand what you mean. both forms use a {{textarea}}, why should the behavior of one textarea be different from the behavior of the other (and every other html textarea on the web) ? Carriage return should submit query admin form -- Key: SOLR-1718 URL: https://issues.apache.org/jira/browse/SOLR-1718 Project: Solr Issue Type: Improvement Components: web gui Affects Versions: 1.4 Reporter: David Smiley Priority: Minor Hitting the carriage return on the keyboard should submit the search query on the admin front screen. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory
[ https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12805247#action_12805247 ] Hoss Man commented on SOLR-1725: Some random comments/questions from the peanut gallery... 1) What is the value added in making ScriptUpdateProcessorFactory support multiple scripts? ... wouldn't it be simpler to require that users declare multiple instances of ScriptUpdateProcessorFactory (which the processor chain already executes in sequence) than to add sequential processing to the ScriptUpdateProcessor? 2) The NamedList init args can be as deep a data structure as you want, so something like this would be totally feasible (if desired) ...
{code}
<processor class="solr.ScriptUpdateProcessorFactory">
  <lst name="scripts">
    <lst name="updateProcessor1.js">
      <bool name="someParamName">true</bool>
      <int name="someOtherParamName">3</int>
    </lst>
    <lst name="updateProcessor2.js">
      <bool name="fooParam">true</bool>
      <str name="barParam">3</str>
    </lst>
  </lst>
  <lst name="otherProcessorOptionsIfNeeded">
    ...
  </lst>
</processor>
{code}
Script based UpdateRequestProcessorFactory -- Key: SOLR-1725 URL: https://issues.apache.org/jira/browse/SOLR-1725 Project: Solr Issue Type: New Feature Components: update Affects Versions: 1.4 Reporter: Uri Boness Attachments: SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch A script based UpdateRequestProcessorFactory (uses JDK6 script engine support). The main goal of this plugin is to make it possible to configure/write update processors without the need to write and package Java code. The update request processor factory enables writing update processors in scripts located in the {{solr.solr.home}} directory. The factory accepts one (mandatory) configuration parameter named {{scripts}}, which accepts a comma-separated list of file names. It will look for these files under the {{conf}} directory in solr home.
When multiple scripts are defined, their execution order is determined by the lexicographic order of the script file names (so {{scriptA.js}} will be executed before {{scriptB.js}}). The script language is resolved based on the script file extension (that is, *.js files will be treated as JavaScript scripts), therefore an extension is mandatory. Each script file is expected to have one or more methods with the same signature as the methods in the {{UpdateRequestProcessor}} interface. It is *not* required to define all methods, only those that are required by the processing logic. The following variables are defined as global variables for each script:
* {{req}} - The SolrQueryRequest
* {{rsp}} - The SolrQueryResponse
* {{logger}} - A logger that can be used for logging purposes in the script
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
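The ordering and language-resolution rules described above can be sketched as follows. This is an illustrative Python sketch of the behavior, not the actual Java factory code from the patch, and the extension-to-language mapping shown is an assumption:

```python
import os

def order_scripts(names):
    # Execution order is the lexicographic order of the file names,
    # so scriptA.js runs before scriptB.js.
    return sorted(names)

def script_language(name):
    # The language is resolved from the file extension, which is why an
    # extension is mandatory; this mapping is illustrative only.
    ext = os.path.splitext(name)[1].lstrip(".")
    if not ext:
        raise ValueError("script file must have an extension: %s" % name)
    return {"js": "JavaScript", "py": "Python", "rb": "Ruby"}.get(ext, ext)
```

In the real factory, extension-based lookup would map onto the JDK6 script engine registry rather than a hard-coded table.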
[jira] Commented: (SOLR-1728) ResponseWriters should support byte[], ByteBuffer
[ https://issues.apache.org/jira/browse/SOLR-1728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12805291#action_12805291 ] Hoss Man commented on SOLR-1728: Noble: your issue description is a bit terse, so I'm a little confused. Are you suggesting an API change such that binary write methods are added to QueryResponseWriter (making it equivalent to BinaryQueryResponseWriter)? Or are you suggesting that the existing classes which implement QueryResponseWriter (JSONResponseWriter, PHPResponseWriter, PythonResponseWriter, XMLResponseWriter, etc.) should start implementing BinaryQueryResponseWriter? In either case: what's the motivation? ResponseWriters should support byte[], ByteBuffer - Key: SOLR-1728 URL: https://issues.apache.org/jira/browse/SOLR-1728 Project: Solr Issue Type: Improvement Reporter: Noble Paul Assignee: Noble Paul Priority: Minor Fix For: 1.5 Only BinaryResponseWriter supports byte[] and ByteBuffer. Other writers should also support these. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1729) Date Facet now override time parameter
[ https://issues.apache.org/jira/browse/SOLR-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12805313#action_12805313 ] Hoss Man commented on SOLR-1729: bq. (e.g. they are in a different time-zone, not time-synced etc.). Time-zones should be irrelevant since all calculations are done in UTC ... lack of time-sync is a legitimate concern, but the more serious problem is distributed requests and network lag. Even if all of the boxes have synchronized clocks, they might not all get queried at the exact same time, and multiple requests might be made to a single server for different phases of the distributed request that expect to get the same answers. It should be noted that while adding support to date faceting for this type of "when is NOW?" parameter is certainly _necessary_ to make distributed date faceting work sanely, it is not _sufficient_ ... unless filter queries that use date math also respect it, the counts returned from date faceting will still potentially be nonsensical. Date Facet now override time parameter -- Key: SOLR-1729 URL: https://issues.apache.org/jira/browse/SOLR-1729 Project: Solr Issue Type: Improvement Components: search Affects Versions: 1.4 Environment: Solr 1.4 Reporter: Peter Sturge Priority: Minor Attachments: FacetParams.java, SimpleFacets.java This PATCH introduces a new query parameter that tells a (typically, but not necessarily) remote server what time to use as 'NOW' when calculating date facets for a query (and, for the moment, date facets *only*), overriding the default behaviour of using the local server's current time. This gets around a problem whereby an explicit time range is specified in a query (e.g. timestamp:[then0 TO then1]), and date facets are required for the given time range (in fact, any explicit time range).
Because DateMathParser performs all its calculations from 'NOW', remote callers have to work out how long ago 'then0' and 'then1' are from 'now', and use the relative-to-now values in the facet.date.xxx parameters. If a remote server has a different opinion of NOW compared to the caller, the results will be skewed (e.g. they are in a different time-zone, not time-synced etc.). This becomes particularly salient when performing distributed date faceting (see SOLR-1709), where multiple shards may all be running with different times, and the faceting needs to be aligned. The new parameter is called 'facet.date.now', and takes as its value a (stringified) long that is the number of milliseconds from the epoch (1 Jan 1970 00:00 UTC) - i.e. the value returned from a System.currentTimeMillis() call. This was chosen over a formatted date to delineate it from a 'searchable' time and to avoid superfluous date parsing. This makes the value generally a programmatically-set one, but as that is where the use case is for this type of parameter, this should be OK. NOTE: This parameter affects date facet timing only. If there are other areas of a query that rely on 'NOW', these will not interpret this value. This is a broader issue about setting a 'query-global' NOW that all parts of query analysis can share. Source files affected: FacetParams.java (holds the new constant FACET_DATE_NOW), SimpleFacets.java (getFacetDateCounts() NOW parameter modified). This PATCH is mildly related to SOLR-1709 (Distributed Date Faceting), but as it's a general change for date faceting, it was deemed deserving of its own patch. I will be updating SOLR-1709 in due course to include the use of this new parameter, after some RFC acceptance. A possible enhancement to this is to detect facet.date fields, look for and match these fields in queries (if they exist), and potentially determine automatically the required time skew, if any.
There are a whole host of reasons why this could be problematic to implement, so an explicit facet.date.now parameter is the safest route. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
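For reference, the facet.date.now value described above is plain epoch milliseconds. A minimal sketch of how a client might compute it, assuming a Python caller (the parameter name comes from the patch; the helper function is hypothetical):

```python
import datetime

EPOCH = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)

def facet_date_now(dt):
    # Milliseconds since 1 Jan 1970 00:00 UTC -- the same value that
    # System.currentTimeMillis() returns on the Java side.
    return int((dt - EPOCH).total_seconds() * 1000)

# e.g. params = {"facet.date.now": str(facet_date_now(coordinator_now))}
# where coordinator_now is the single timestamp shared across all shards.
```

Passing one such value to every shard is what keeps the distributed date facets aligned even when shard clocks disagree.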