[jira] Commented: (SOLR-1163) Solr Explorer - A generic GWT client for Solr

2010-04-08 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855079#action_12855079
 ] 

Hoss Man commented on SOLR-1163:


I vote for a separate war: primarily because i think it would be a good way to 
encourage/force us to make sure that all the functionality needed to power a 
good GUI is exposed via RequestHandlers (which will help make it easy for 
other people to write their own custom tools for controlling solr in 
non-standard ways).  

If it lives in the same war, it's too easy to just directly access public 
Java level APIs that don't have an HTTP corollary.

That said: i don't see any downside to it being a contrib living right in the 
solr code base.

 Solr Explorer - A generic GWT client for Solr
 -

 Key: SOLR-1163
 URL: https://issues.apache.org/jira/browse/SOLR-1163
 Project: Solr
  Issue Type: New Feature
  Components: web gui
Affects Versions: 1.3
Reporter: Uri Boness
 Attachments: graphics.zip, solr-explorer.patch, solr-explorer.patch


 The attached patch is a GWT generic client for solr. It is currently 
 standalone, meaning that once built, one can open the generated HTML file in 
 a browser and communicate with any deployed solr. It is configured with its 
 own configuration file, where one can configure the solr instance/core to 
 connect to. Since it's currently standalone and completely client side based, 
 it uses JSON with padding (cross-site scripting) to connect to remote solr 
 servers. Some of the supported features:
 - Simple query search
 - Sorting - one can dynamically define new sort criteria
 - Search results are rendered very much like Google search results are 
 rendered. It is also possible to view all stored field values for every hit. 
 - Custom hit rendering - It is possible to show thumbnails (images) per hit 
 and also customize a view for a hit based on html templates
 - Faceting - one can dynamically define field and query facets via the UI. It 
 is also possible to pre-configure these facets in the configuration file.
 - Highlighting - you can dynamically configure highlighting. It can also be 
 pre-configured in the configuration file
 - Spellchecking - you can dynamically configure spell checking. Can also be 
 done in the configuration file. Supports collation. It is also possible to 
 send build and reload commands.
 - Data import handler - if used, it is possible to send a full-import and 
 status command (delta-import is not implemented yet, but it's easy to add)
 - Console - For development time, there's a small console which can help to 
 better understand what's going on behind the scenes. One can use it to:
 ** view the client logs
 ** browse the solr scheme
 ** View a break down of the current search context
 ** View a break down of the query URL that is sent to solr
 ** View the raw JSON response returning from Solr
 This client is actually a platform that can be greatly extended for more 
 things. The goal is to have a client where the explorer part is just one view 
 of it. Other future views include: Monitoring, Administration, Query Builder, 
 DataImportHandler configuration, and more...
 To get a better view of what's currently possible. We've set up a public 
 version of this client at: http://search.jteam.nl/explorer. This client is 
 configured with one solr instance where crawled YouTube movies were indexed. 
 You can also check out a screencast for this deployed client: 
 http://search.jteam.nl/help
 The patch created a new folder in the contrib directory. Since the patch 
 doesn't contain binaries, an additional zip file is provided that needs to be 
 extracted to add all the required graphics. This module is maven2 based and is 
 configured in such a way that all GWT related tools/libraries are 
 automatically downloaded when the module is compiled. One of the artifacts 
 of the build is a war file which can be deployed in any servlet container.
 NOTE: this client works best on WebKit based browsers (for performance 
 reason) but also works on firefox and ie 7+. That said, it should be taken 
 into account that it is still under development.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1801) Delete record by id which is inserted using dedupe processor

2010-04-05 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853445#action_12853445
 ] 

Hoss Man commented on SOLR-1801:


I'm really not sure what the 'bug' here is.  There are lots of use cases where 
delete by id isn't practical -- this is just one more of those cases.

Perhaps you should start a thread on solr-user explaining your full use case a 
little better, and clarifying what it is you want/need/expect to do when using 
deduplication that you don't feel you can do right now.
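One hedged illustration of why delete-by-id can still be workable in this scenario: since the dedupe processor derives the id deterministically from field values, a client that knows those exact values can recompute the id and then issue an ordinary delete-by-id. The sketch below uses MD5 over the concatenated values purely as a stand-in (Solr's default Lookup3Signature is a different hash, and the class/method names here are hypothetical, not Solr APIs):

```java
import java.security.MessageDigest;

// Hypothetical sketch (not Solr code): recompute a dedupe-style signature
// from known field values so the generated id can be used in a normal
// delete-by-id request. MD5 stands in for the real signature function.
public class RecomputedSignature {

    // Hash the concatenated field values and return lowercase hex,
    // mimicking the shape of a hash-generated document id.
    static String signature(String... fieldValues) throws Exception {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        for (String v : fieldValues) {
            md5.update(v.getBytes("UTF-8"));
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md5.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        // The same field values always produce the same id, so the id is
        // recoverable client-side without querying Solr first.
        System.out.println(signature("foo"));
    }
}
```

Alternatively, a delete-by-query against one of the original stored fields sidesteps recomputing the hash entirely.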

 Delete record by id which is inserted using dedupe processor
 

 Key: SOLR-1801
 URL: https://issues.apache.org/jira/browse/SOLR-1801
 Project: Solr
  Issue Type: Bug
  Components: update
Affects Versions: 1.4
Reporter: Subroto Sanyal
 Fix For: 1.5


 A record added with unique key generated by dedupe processor can't be deleted 
 using delete by id as the id is generated by hashing and is unknown to the 
 user.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1553) extended dismax query parser

2010-04-05 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853447#action_12853447
 ] 

Hoss Man commented on SOLR-1553:


Jonathan: looking at the code, it seems completely plausible, and that's the 
direction i was going down with that previous patch -- but i got hung up on 
the fact that, for reasons i couldn't identify, clauses referring to fieldnames 
that don't exist in the schema are getting dropped -- need to track down where 
that is happening and stop it, so the new code can look at those field names 
and treat them as aliases (to resolve to other fields)

I just need to find time to dig into it more -- but if you want to take a swing 
at fixing edismax.unescapedcolon.bug.test.patch and then improving on 
edismax.userFields.patch, by all means be my guest.

 extended dismax query parser
 

 Key: SOLR-1553
 URL: https://issues.apache.org/jira/browse/SOLR-1553
 Project: Solr
  Issue Type: New Feature
Reporter: Yonik Seeley
 Fix For: 1.5

 Attachments: edismax.unescapedcolon.bug.test.patch, 
 edismax.userFields.patch, SOLR-1553.patch, SOLR-1553.pf-refactor.patch


 An improved user-facing query parser based on dismax

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1860) improve stopwords list handling

2010-04-05 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853679#action_12853679
 ] 

Hoss Man commented on SOLR-1860:


bq. I like the idea of an export - it's transparent and neatly handles back 
compat concerns.

that's the same conclusion robert and i came to on IRC ... being able to load 
directly sounds less redundant, but as soon as a user wants to customize (and 
let's face it: stop words can easily be domain specific) we need a way of 
exporting that's convenient even for novice users who don't know anything about 
jars and wars.

bq. Not sure at this point if it makes more sense trying to put a text_fr, etc, 
in the normal schema.xml or in a separate schema_intl.xml.

The idea robert pitched on IRC was to create a new example solr-instance 
directory with a barebones solrconfig.xml file, and a schema.xml file that 
*only* demonstrated fields using various tricks for various languages.  All the 
language specific stopword files would then live in this new instancedir.  The 
idea being that people interested in non-english fields could find a 
recommended fieldtype declaration in this schema.xml file, and cut/paste it 
to their schema.xml (probably copied from the main example)

The key here being that we don't want an entire clone of the example (all the 
numeric fields, multiple request handler declarations, etc...)  this will 
just show the syntax for declaring all the various languages that we can 
provide suggestions for.

bq. As far as file format: I think we should also support the snowball stopword 
format.

Agreed, but it's a trivially minor chicken/egg choice.  Either we can set up a 
simple export and conversion to the format Solr currently supports now, and 
if/when someone updates StopFilterFactory to support the new format, then we 
can stop converting when we export; or we can modify StopFilter to support both 
formats first, and then set up the simple export w/o worrying about conversion.
 

Frankly: If Robert's planning on doing the work either way, I'm happy to let 
him decide which approach makes the most sense.
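To make the chicken/egg choice concrete: the conversion step from the snowball format to Solr's one-word-per-line stopwords.txt is a small job. A rough, independent sketch (class and method names are hypothetical, not Solr code; the format rules assumed here -- whitespace-separated words, with "|" starting a comment that runs to end of line -- follow snowball's conventions):

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Sketch of a snowball-format stopword parser: strip "|" comments,
// then split remaining text on whitespace, one word per token.
public class SnowballStopwords {
    static List<String> parse(String snowball) throws Exception {
        List<String> words = new ArrayList<String>();
        BufferedReader reader = new BufferedReader(new StringReader(snowball));
        String line;
        while ((line = reader.readLine()) != null) {
            int comment = line.indexOf('|');
            if (comment >= 0) line = line.substring(0, comment);
            for (String word : line.trim().split("\\s+")) {
                if (word.length() > 0) words.add(word);
            }
        }
        return words;
    }

    public static void main(String[] args) throws Exception {
        String sample = "au aux avec  | common French words\n" +
                        "ce ces\n" +
                        "| a full-line comment\n";
        // Emits one word per line: the stopwords.txt shape Solr expects.
        for (String w : parse(sample)) System.out.println(w);
    }
}
```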


 improve stopwords list handling
 ---

 Key: SOLR-1860
 URL: https://issues.apache.org/jira/browse/SOLR-1860
 Project: Solr
  Issue Type: Improvement
  Components: Schema and Analysis
Affects Versions: 3.1
Reporter: Robert Muir
Assignee: Robert Muir
Priority: Minor

 Currently Solr makes it easy to use english stopwords for StopFilter or 
 CommonGramsFilter.
 Recently in lucene, we added stopwords lists (mostly, but not all from 
 snowball) to all the language analyzers.
 So it would be nice if a user can easily specify that they want to use a 
 french stopword list, and use it for StopFilter or CommonGrams.
 The ones from snowball are, however, formatted in a different manner than the 
 others (although in Lucene we have parsers to deal with this).
 Additionally, we abstract this from Lucene users by adding a static 
 getDefaultStopSet to all analyzers.
 There are two approaches, the first one I think I prefer the most, but I'm 
 not sure it matters as long as we have good examples (maybe a foreign 
 language example schema?)
 1. The user would specify something like:
  <filter class="solr.StopFilterFactory" 
 fromAnalyzer="org.apache.lucene.analysis.FrenchAnalyzer" .../>
  This would just grab the CharArraySet from the FrenchAnalyzer's 
 getDefaultStopSet method; who cares where it comes from or how it's loaded.
 2. We add support for snowball-formatted stopword lists, and the user could 
 specify something like:
 <filter class="solr.StopFilterFactory" 
 words="org/apache/lucene/analysis/snowball/french_stop.txt" format="snowball" 
 ... />
 The disadvantage to this is they have to know where the list is, what format 
 it's in, etc. For example: snowball doesn't provide Romanian or Turkish
 stopword lists to go along with their stemmers, so we had to add our own.
 Let me know what you guys think, and I will create a patch.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1842) DataImportHandler ODBC keeps lock on the source table while optimisatising is being run...

2010-03-31 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851945#action_12851945
 ] 

Hoss Man commented on SOLR-1842:


Yeah, i don't know anything about ODBC, but it seems odd that DIH wouldn't 
commit any transactions it opens to release the table locks.  (unless this is 
something to do with auto generated transactions in the ODBC Connector)

Marcin: it would be helpful if you could provide a specific example of a DIH 
config in which you see this problem (the simpler the better) ... perhaps you 
are using some feature of DIH in a way that is unexpected and that's why the 
table locks are living longer than they should.
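For anyone following along, the kind of minimal DIH config being asked for might look like this (driver, connection details, and field names here are hypothetical placeholders, not taken from the report):

```xml
<!-- A bare-bones data-config.xml: one JDBC data source, one entity,
     no delta queries or other DIH features. -->
<dataConfig>
  <dataSource driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/db" user="user" password="pass"/>
  <document>
    <entity name="doc" query="SELECT id, title FROM docs">
      <field column="id" name="id"/>
      <field column="title" name="title"/>
    </entity>
  </document>
</dataConfig>
```

Stripping a report down to something this small usually makes it obvious whether the locking comes from DIH itself or from the surrounding setup.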

 DataImportHandler ODBC keeps lock on the source table while optimisatising is 
 being run...
 --

 Key: SOLR-1842
 URL: https://issues.apache.org/jira/browse/SOLR-1842
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 1.5
Reporter: Marcin

 Hi Guys,
 I don't know if it's really a bug, but I think this is quite a good place for 
 it.
 The problem is with DataImportHandler and DB queries.
 For example:
 Let's say we have a big table which holds the docs to be indexed. We run a 
 query against it from DataImportHandler, and the query locks the table -- 
 which is quite obvious and desired behaviour from the SQL point of view -- 
 but while an optimize is running, Solr should not issue the query, because in 
 that case the table stays locked until the optimize process finishes, which 
 can take time...
 As a workaround you can use a SELECT SQL_BUFFER_RESULT ... statement, which 
 will move everything into a temp table and release all locks, but 
 DataImportHandler will still be waiting for the optimize to finish. At least 
 you will then be able to insert new docs into the main table.
 cheers

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1848) Add example Query page to the example

2010-03-29 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851092#action_12851092
 ] 

Hoss Man commented on SOLR-1848:


I'm confused by yonik's comment...

bq. What's the motivation for including them in the solr webapp?

I agree, adding things to solr.war just for the purpose of the example/tutorial 
is a bad idea, but from what i can tell Grant's commit didn't do that -- it 
just added configuration so that people running java -jar start.jar had both 
the solr webapp and a static webapp containing a form running.  If 
they copied the solr.war file, or the example/solr home, they wouldn't be 
affected at all.

I suppose for people who copy the *entire* example directory there might be 
some unnecessary stuff -- but that's going to always be true (unless we get rid 
of exampledocs)

Frankly though: the queries.html is so simple, i really don't understand why we 
wouldn't just expand the tutorial to include those links.

 Add example Query page to the example
 -

 Key: SOLR-1848
 URL: https://issues.apache.org/jira/browse/SOLR-1848
 Project: Solr
  Issue Type: Improvement
Reporter: Grant Ingersoll
Priority: Trivial

 I've wired up a static jetty context and hooked in a simple HTML page that 
 shows off a bunch of the different types of queries people can do w/ the 
 Example data.  Browse to it at http://localhost:8983/example/queries.html
 Will commit shortly.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1848) Add example Query page to the example

2010-03-29 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851131#action_12851131
 ] 

Hoss Man commented on SOLR-1848:


Agreed: as we start adding sections, it will make a lot of sense to split the 
tutorial out into multiple pages: the (existing) intro page showing how easy 
it is to load data and do basic queries w/faceting and highlighting, a second 
page showing off spatial queries, a third page showing spell check (and maybe 
more like this), DIH should have a page, etc...

With the possible exception of distributed search (where multiple ports need to 
be up and running) there's no reason all of these things can't be demoed from a 
single example.

 Add example Query page to the example
 -

 Key: SOLR-1848
 URL: https://issues.apache.org/jira/browse/SOLR-1848
 Project: Solr
  Issue Type: Improvement
Reporter: Grant Ingersoll
Priority: Trivial

 I've wired up a static jetty context and hooked in a simple HTML page that 
 shows off a bunch of the different types of queries people can do w/ the 
 Example data.  Browse to it at http://localhost:8983/example/queries.html
 Will commit shortly.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Reopened: (SOLR-1672) RFE: facet reverse sort count

2010-03-25 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man reopened SOLR-1672:



reopening ... not sure why it was marked resolved

 RFE: facet reverse sort count
 -

 Key: SOLR-1672
 URL: https://issues.apache.org/jira/browse/SOLR-1672
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 1.4
 Environment: Java, Solrj, http
Reporter: Peter Sturge
Priority: Minor
 Attachments: SOLR-1672.patch

   Original Estimate: 0h
  Remaining Estimate: 0h

 As suggested by Chris Hostetter, I have added an optional Comparator to the 
 BoundedTreeSet<Long> in the UnInvertedField class.
 This optional comparator is used when a new (and also optional) field facet 
 parameter called 'facet.sortorder' is set to the string 'dsc' 
 (e.g. f.facetname.facet.sortorder=dsc for per field, or 
 facet.sortorder=dsc for all facets).
 Note that this parameter has no effect if facet.method=enum.
 Any value other than 'dsc' (including no value) reverts the BoundedTreeSet to 
 its default behaviour.
  
 This change affects 2 source files:
  UnInvertedField.java
 [line 438] The getCounts() method signature is modified to add the 
 'facetSortOrder' parameter value to the end of the argument list.
  
 DIFF UnInvertedField.java:
 - public NamedList getCounts(SolrIndexSearcher searcher, DocSet baseDocs, int 
 offset, int limit, Integer mincount, boolean missing, String sort, String 
 prefix) throws IOException {
 + public NamedList getCounts(SolrIndexSearcher searcher, DocSet baseDocs, int 
 offset, int limit, Integer mincount, boolean missing, String sort, String 
 prefix, String facetSortOrder) throws IOException {
 [line 556] The getCounts() method is modified to create an overridden 
 BoundedTreeSet<Long>(int, Comparator) if the 'facetSortOrder' parameter 
 equals 'dsc'.
 DIFF UnInvertedField.java:
 - final BoundedTreeSet<Long> queue = new BoundedTreeSet<Long>(maxsize);
 + final BoundedTreeSet<Long> queue = (sort.equals("count") || 
 sort.equals("true")) ? (facetSortOrder.equals("dsc") ? new 
 BoundedTreeSet<Long>(maxsize, new Comparator()
 { @Override
 public int compare(Object o1, Object o2)
 {
   if (o1 == null || o2 == null)
 return 0;
   int result = ((Long) o1).compareTo((Long) o2);
   return (result != 0 ? result < 0 ? -1 : 1 : 0); //lowest number first sort
 }}) : new BoundedTreeSet<Long>(maxsize)) : null;
  SimpleFacets.java
 [line 221] A getFieldParam(field, "facet.sortorder", "asc") call is added to 
 retrieve the new parameter, if present. "asc" is used as the default value.
 DIFF SimpleFacets.java:
 + String facetSortOrder = params.getFieldParam(field, "facet.sortorder", 
 "asc");
  
 [line 253] The call to uif.getCounts() in the getTermCounts() method is 
 modified to pass the 'facetSortOrder' value string.
 DIFF SimpleFacets.java:
 - counts = uif.getCounts(searcher, base, offset, limit, 
 mincount,missing,sort,prefix);
 + counts = uif.getCounts(searcher, base, offset, limit, 
 mincount,missing,sort,prefix, facetSortOrder);
 Implementation Notes:
 I have noted in testing that I was not able to retrieve any '0' counts as I 
 had expected.
 I believe this could be because there appear to be some optimizations in 
 SimpleFacets/count caching such that zero counts are not iterated (at least 
 not by default)
 as a performance enhancement.
 I could be wrong about this, and zero counts may appear under some other as 
 yet untested circumstances. Perhaps an expert familiar with this part of the 
 code can clarify.
 In fact, this is not such a bad thing (at least for my requirements), as a 
 whole bunch of zero counts is not necessarily useful (for my requirements, 
 starting at '1' is just right).
  
 There may, however, be instances where someone *will* want zero counts - e.g. 
 searching for zero product stock counts (e.g. 'what have we run out of'). I 
 was envisioning the facet.mincount field
 being the preferred place to set where the 'lowest value' begins (e.g. 0 or 1 
 or possibly higher), but because of the caching/optimization, the behaviour 
 is somewhat different than expected.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1672) RFE: facet reverse sort count

2010-03-25 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12849755#action_12849755
 ] 

Hoss Man commented on SOLR-1672:


Some old notes on this patch that i just found on my laptop (presumably from 
the last time i was on a plane) ...

* The existing patch is in a weird format that i couldn't apply
* re-reading the patch, and comparing to the SimpleFacets and UnInvertedField 
source, i'm noticing that several code paths for facet counts aren't being 
accounted for
* I think what we should do conceptually is refactor all of the code that looks 
at the existing FacetParams.FACET_SORT param (or any of the constant values for 
it) into a helper function that parses the new legal values we want to support 
and returns a Comparator, and then start passing that comparator around to the 
various strategies (termenum, fieldcache, uninverted) for collecting facet 
constraints, instead of just passing around the sort string value...
** "true", "count", "count desc" = a comparator that does descending count sort
** "count asc" = a comparator that does ascending count sort
** "false", "index", "index asc" = null (by returning a null comparator we would 
be signalling that no sorting or bounded collection is needed, terms can be 
processed in order)
** "index desc" = a comparator that does descending term sort (not requested 
in this Jira, but recently asked about on the mailing list)
* The problem with that conceptual solution is that UnInvertedField doesn't 
maintain a BoundedTreeSet of CountPairs like all of the other code paths; it 
uses a single Long to encode both the count and the index of the term, so it 
would need some special logic.
** Side question: I wonder if that Long encoded format would work for the field 
cache based faceting as well?
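A rough sketch of that helper-function idea, under the value-to-comparator mapping listed above (the class name and the CountPair stand-in are illustrative only, not Solr's real internals):

```java
import java.util.Comparator;

// Sketch: map the legal facet.sort values onto a Comparator over
// (term-index, count) pairs, or null when terms can simply be streamed
// in index order with no bounded collection at all.
public class FacetSortHelper {
    static class CountPair {
        final int index, count;
        CountPair(int index, int count) { this.index = index; this.count = count; }
    }

    // Counts and indexes are assumed non-negative, so plain int
    // subtraction is a safe comparison in this sketch.
    static Comparator<CountPair> comparatorFor(String sort) {
        if ("true".equals(sort) || "count".equals(sort) || "count desc".equals(sort)) {
            return new Comparator<CountPair>() {   // descending count, ties by index
                public int compare(CountPair a, CountPair b) {
                    return a.count != b.count ? b.count - a.count : a.index - b.index;
                }
            };
        }
        if ("count asc".equals(sort)) {
            return new Comparator<CountPair>() {   // ascending count, ties by index
                public int compare(CountPair a, CountPair b) {
                    return a.count != b.count ? a.count - b.count : a.index - b.index;
                }
            };
        }
        if ("index desc".equals(sort)) {
            return new Comparator<CountPair>() {   // descending term sort
                public int compare(CountPair a, CountPair b) {
                    return b.index - a.index;
                }
            };
        }
        // "false", "index", "index asc": no sorting or bounded collection needed
        return null;
    }

    public static void main(String[] args) {
        System.out.println(comparatorFor("index") == null);   // true
        CountPair a = new CountPair(0, 5), b = new CountPair(1, 3);
        System.out.println(comparatorFor("count").compare(a, b));   // -2 (a first)
    }
}
```

Each collection strategy would then only need to ask for a comparator once, instead of re-interpreting the sort string itself.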


 RFE: facet reverse sort count
 -

 Key: SOLR-1672
 URL: https://issues.apache.org/jira/browse/SOLR-1672
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 1.4
 Environment: Java, Solrj, http
Reporter: Peter Sturge
Priority: Minor
 Attachments: SOLR-1672.patch

   Original Estimate: 0h
  Remaining Estimate: 0h

 As suggested by Chris Hostetter, I have added an optional Comparator to the 
 BoundedTreeSet<Long> in the UnInvertedField class.
 This optional comparator is used when a new (and also optional) field facet 
 parameter called 'facet.sortorder' is set to the string 'dsc' 
 (e.g. f.facetname.facet.sortorder=dsc for per field, or 
 facet.sortorder=dsc for all facets).
 Note that this parameter has no effect if facet.method=enum.
 Any value other than 'dsc' (including no value) reverts the BoundedTreeSet to 
 its default behaviour.
  
 This change affects 2 source files:
  UnInvertedField.java
 [line 438] The getCounts() method signature is modified to add the 
 'facetSortOrder' parameter value to the end of the argument list.
  
 DIFF UnInvertedField.java:
 - public NamedList getCounts(SolrIndexSearcher searcher, DocSet baseDocs, int 
 offset, int limit, Integer mincount, boolean missing, String sort, String 
 prefix) throws IOException {
 + public NamedList getCounts(SolrIndexSearcher searcher, DocSet baseDocs, int 
 offset, int limit, Integer mincount, boolean missing, String sort, String 
 prefix, String facetSortOrder) throws IOException {
 [line 556] The getCounts() method is modified to create an overridden 
 BoundedTreeSet<Long>(int, Comparator) if the 'facetSortOrder' parameter 
 equals 'dsc'.
 DIFF UnInvertedField.java:
 - final BoundedTreeSet<Long> queue = new BoundedTreeSet<Long>(maxsize);
 + final BoundedTreeSet<Long> queue = (sort.equals("count") || 
 sort.equals("true")) ? (facetSortOrder.equals("dsc") ? new 
 BoundedTreeSet<Long>(maxsize, new Comparator()
 { @Override
 public int compare(Object o1, Object o2)
 {
   if (o1 == null || o2 == null)
 return 0;
   int result = ((Long) o1).compareTo((Long) o2);
   return (result != 0 ? result < 0 ? -1 : 1 : 0); //lowest number first sort
 }}) : new BoundedTreeSet<Long>(maxsize)) : null;
  SimpleFacets.java
 [line 221] A getFieldParam(field, "facet.sortorder", "asc") call is added to 
 retrieve the new parameter, if present. "asc" is used as the default value.
 DIFF SimpleFacets.java:
 + String facetSortOrder = params.getFieldParam(field, "facet.sortorder", 
 "asc");
  
 [line 253] The call to uif.getCounts() in the getTermCounts() method is 
 modified to pass the 'facetSortOrder' value string.
 DIFF SimpleFacets.java:
 - counts = uif.getCounts(searcher, base, offset, limit, 
 mincount,missing,sort,prefix);
 + counts = uif.getCounts(searcher, base, offset, limit, 
 mincount,missing,sort,prefix, facetSortOrder);
 Implementation Notes:
 I have noted in testing that I was not able to retrieve any '0' counts as I 
 had expected.
 I believe this could be because there 

[jira] Created: (SOLR-1846) Remove support for (broken) abortOnConfigurationError

2010-03-25 Thread Hoss Man (JIRA)
Remove support for (broken) abortOnConfigurationError
-

 Key: SOLR-1846
 URL: https://issues.apache.org/jira/browse/SOLR-1846
 Project: Solr
  Issue Type: Improvement
Reporter: Hoss Man


Setting abortOnConfigurationError==false has not worked for some time, and 
based on a POLL of existing users, no one seems to need/want it, so we should 
remove support for it completely to make error handling and reporting work more 
cleanly.


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-1846) Remove support for (broken) abortOnConfigurationError

2010-03-25 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1846:
---

Attachment: SOLR-1846.patch


Attached patch should get us to a good point to tackle some of the related 
issues.  It updates all code paths (unless i missed one) that put something 
into SolrConfig.severeErrors so that that code path also explicitly throws the 
corresponding exception.

This seems to be working well and is a good base for building up better 
per-core error reporting in SolrDispatchFilter (because now all the exceptions 
can be propagated up to CoreContainer and tracked per core)

As is, this patch breaks BadIndexSchemaTest ... and i'm not really sure what 
the 'right' fix is ... the test explicitly expects a bad schema.xml to be 
loaded properly, and then looks for 3 errors in SolrConfig.severeErrors -- 
errors are still added to severeErrors before getting thrown, but the test 
still errors out during setUp because the SolrCore can't be inited (because 
the IndexSchema doesn't finish initing)

my best suggestion: split the test into three tests, each using a different 
config (one per type of error tested) and assert that we get an exception 
during setUp.

 Remove support for (broken) abortOnConfigurationError
 -

 Key: SOLR-1846
 URL: https://issues.apache.org/jira/browse/SOLR-1846
 Project: Solr
  Issue Type: Improvement
Reporter: Hoss Man
 Attachments: SOLR-1846.patch


 Setting abortOnConfigurationError==false has not worked for some time, and 
 based on a POLL of existing users, no one seems to need/want it, so we should 
 remove support for it completely to make error handling and reporting work 
 more cleanly.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-1846) Remove support for (broken) abortOnConfigurationError

2010-03-25 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1846:
---

Description: 
Setting abortOnConfigurationError==false has not worked for some time, and 
based on a POLL of existing users, no one seems to need/want it, so we should 
remove support for it completely to make error handling and reporting work more 
cleanly.

http://n3.nabble.com/POLL-Users-of-abortOnConfigurationError-tt484030.html#a484030

  was:
Setting abortOnConfigurationError==false has not worked for some time, and 
based on a POLL of existing users, no one seems to need/want it, so we should 
remove support for it completely to make error handling and reporting work more 
cleanly.



 Remove support for (broken) abortOnConfigurationError
 -

 Key: SOLR-1846
 URL: https://issues.apache.org/jira/browse/SOLR-1846
 Project: Solr
  Issue Type: Improvement
Reporter: Hoss Man
 Attachments: SOLR-1846.patch


 Setting abortOnConfigurationError==false has not worked for some time, and 
 based on a POLL of existing users, no one seems to need/want it, so we should 
 remove support for it completely to make error handling and reporting work 
 more cleanly.
 http://n3.nabble.com/POLL-Users-of-abortOnConfigurationError-tt484030.html#a484030

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (SOLR-1832) abortOnConfigurationError=false no longer works for most plugin types

2010-03-25 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-1832.


Resolution: Won't Fix

 abortOnConfigurationError=false no longer works for most plugin types
 -

 Key: SOLR-1832
 URL: https://issues.apache.org/jira/browse/SOLR-1832
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.4
Reporter: Hoss Man

 In 1.4 setting the abortOnConfigurationError config option to false only 
 affects RequestHandlers and schema related classes (ie: FieldType and 
 Token*Factories).
 ValueSourceParsers, QParserPlugins, and ResponseWriters which fail to 
 initialize properly in Solr 1.4 will cause the entire SolrCore to fail to 
 initialize.  This changed from previous versions: in Solr 1.3 a failure to 
 init any of these types of plugins when abortOnConfigurationError=false would 
 result in errors being logged on init, but the SolrCore itself would still 
 work and only attempts to use those plugins would result in an error.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1834) Document level security

2010-03-23 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12848946#action_12848946
 ] 

Hoss Man commented on SOLR-1834:


Anders: I only had a few moments to skim your patch, but it seems like a very 
cool feature, thank you (and Findwise) for contributing this.

One thing i noticed was that there didn't seem to be a lot of documentation 
(javadoc or otherwise) ... i see that the demo application you cited seems to 
have some good overview documentation on how all the pieces fit together, and 
what configuration should look like -- if your intention is that this 
documentation can also be used by Solr, would you mind attaching it to the Jira 
issue (as HTML, or in javadoc comments on the java files themselves) with the 
Grant ... Apache License ... box checked off so there's a clear audit log 
that the documentation can be reproduced within Solr?

 Document level security
 ---

 Key: SOLR-1834
 URL: https://issues.apache.org/jira/browse/SOLR-1834
 Project: Solr
  Issue Type: New Feature
  Components: SearchComponents - other
Affects Versions: 1.4
Reporter: Anders Rask
 Attachments: SOLR-1834.patch


 Attached to this issue is a patch that includes a framework for enabling 
 document level security in Solr as a search component. I did this as a Master 
 thesis project at Findwise in Stockholm and Findwise has now decided to 
 contribute it back to the community. The component was developed in spring 
 2009 and has been in use at a customer since autumn the same year.
 There is a simple demo application up at 
 http://demo.findwise.se:8880/SolrSecurity/ which also explains more about the 
 component and how to set it up.




[jira] Resolved: (SOLR-1823) XMLWriter throws ClassCastException on writing maps other than String,?

2010-03-18 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-1823.


   Resolution: Fixed
Fix Version/s: 1.5
 Assignee: Hoss Man

Nice catch Frank.

FWIW: the original intent was that any of those types of objects could be used 
as the *value* of a Map, not the key -- but that's still no excuse to just cast 
the key instead of using stringification (i could have sworn it was already 
doing that)

The one subtlety your patch broke, however, is that if someone uses null as 
a key in the Map, that has always been written out as an entry w/o a key -- but 
by using String.valueOf your patch always produces a non-null string value 
(ie: the 4-character string "null"), so i modified your patch to just use 
toString() with an explicit null check.

Committed revision 925031.
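For illustration, a minimal sketch of the difference described above (the class and method names here are hypothetical, not Solr's actual XMLWriter code):

```java
public class KeyStringification {
    // Hypothetical helpers illustrating the two approaches discussed above;
    // this is not the actual XMLWriter code.

    // String.valueOf(null) yields the 4-character string "null" ...
    static String withValueOf(Object key) {
        return String.valueOf(key);
    }

    // ... while toString() behind an explicit null check keeps null as null,
    // so the writer can still emit an entry without a name attribute.
    static String withNullCheck(Object key) {
        return (key == null) ? null : key.toString();
    }
}
```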

 XMLWriter throws ClassCastException on writing maps other than String,?
 -

 Key: SOLR-1823
 URL: https://issues.apache.org/jira/browse/SOLR-1823
 Project: Solr
  Issue Type: Improvement
  Components: documentation, Response Writers
Reporter: Frank Wesemann
Assignee: Hoss Man
 Fix For: 1.5

 Attachments: SOLR-1823.patch


 http://lucene.apache.org/solr/api/org/apache/solr/response/SolrQueryResponse.html#returnable_data
  says that a Map containing any of the items in this list may be contained 
 in a SolrQueryResponse and will be handled by QueryResponseWriters.
 This is not true for (at least) Keys in Maps.
 XMLWriter tries to cast keys to Strings. 




[jira] Commented: (SOLR-1824) partial field types created on error

2010-03-18 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12847166#action_12847166
 ] 

Hoss Man commented on SOLR-1824:


Scratch that -- i get it now:

* IndexSchema uses anonymous subclasses of AbstractPluginLoader to instantiate 
a variety of different things
* AbstractPluginLoader processes things in a loop, recording errors in 
SolrConfig.severeErrors when a particular instance can't be inited, but 
creating the rest of the objects just fine.
* when abortOnConfigurationError=false this results in solr using a schema with 
missing filters (or missing fields, etc...) ... the only thing that protects 
people when abortOnConfigurationError=true is that SolrDispatchFilter pays 
attention to both abortOnConfigurationError and SolrConfig.severeErrors 
(someone using embedded Solr might never notice the error at all, even if the 
config did say abortOnConfigurationError=true)
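The loop-and-record pattern described above can be sketched roughly as follows (class and field names are illustrative, not the actual AbstractPluginLoader API):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class LenientLoader {
    // Illustrative sketch, not Solr's AbstractPluginLoader: init each
    // declared plugin in a loop; record failures and keep going, so one bad
    // declaration leaves a partially populated registry instead of aborting.
    final List<Object> plugins = new ArrayList<>();
    final List<Exception> severeErrors = new ArrayList<>();

    void loadAll(List<String> declarations, Function<String, Object> factory) {
        for (String decl : declarations) {
            try {
                plugins.add(factory.apply(decl));
            } catch (Exception e) {
                severeErrors.add(e); // recorded, not rethrown
            }
        }
    }
}
```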

 partial field types created on error
 

 Key: SOLR-1824
 URL: https://issues.apache.org/jira/browse/SOLR-1824
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.1.0
Reporter: Yonik Seeley
Priority: Minor

 When abortOnConfigurationError=false, and there is a typo in one of the 
 filters in a chain, the field type is still created by omitting that 
 particular filter.  This is particularly dangerous since it will result in 
 incorrect indexing.




[jira] Created: (SOLR-1832) abortOnConfigurationError=false no longer works for most plugin types

2010-03-18 Thread Hoss Man (JIRA)
abortOnConfigurationError=false no longer works for most plugin types
-

 Key: SOLR-1832
 URL: https://issues.apache.org/jira/browse/SOLR-1832
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.4
Reporter: Hoss Man


In 1.4 setting the abortOnConfigurationError config option to false only 
affects RequestHandlers and schema related classes (ie: FieldType and 
Token*Factories).

ValueSourceParsers, QParserPlugins, and ResponseWriters which fail to initialize 
properly in Solr 1.4 will cause the entire SolrCore to fail to initialize.  
This is a change from previous versions: in Solr 1.3 a failure to init any of 
these types of plugins when abortOnConfigurationError=false would result in 
errors being logged on init, but the SolrCore itself would still work, and only 
attempts to use those plugins would result in an error.





[jira] Commented: (SOLR-1832) abortOnConfigurationError=false no longer works for most plugin types

2010-03-18 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12847170#action_12847170
 ] 

Hoss Man commented on SOLR-1832:


This seems to be a result of switching away from the 
(Map|NamedList)PluginLoader classes when the PluginInfo API was added.  The 
PluginLoaders would loop over multiple plugins, recording any errors in 
SolrConfig.severeErrors but then proceeding -- SolrCore.initPlugins on the 
other hand fails fast.





[jira] Commented: (SOLR-1817) Fix Solr error reporting to work correctly with multicore

2010-03-18 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12847174#action_12847174
 ] 

Hoss Man commented on SOLR-1817:


I started looking a little more closely at the singleton 
SolrConfig.severeErrors, since eliminating usage of it is really the key to 
being able to support abortOnConfigurationError=true/false for multiple cores 
independently.  The more I looked at it, the more this whole thing seems futile.

For starters: I discovered SOLR-1832 ... in a nutshell, a failure to init most 
types of plugins in 1.4 causes SolrCore to fail to init, regardless of whether 
abortOnConfigurationError=false.  The only types of plugins where 
initialization failures are logged but the remaining instances are loaded 
anyway are RequestHandlers and schema related classes (ie: FieldType and 
Token*Factories) ... but as noted in SOLR-1824 it's actually a really, really, 
REALLY bad thing for IndexSchema to ignore when a FieldType or analysis 
factory can't be initialized, because it could result in incorrect values 
getting indexed.

So we could:
* fix SOLR-1824 so that any init error in IndexSchema causes a hard fail.
* fix SOLR-1832 so that SolrCore.initPlugins skips any instances that 
failed to init and records the exceptions directly with the SolrCore.
* officially deprecate the *PluginLoader classes, and remove the spots where 
they add to SolrConfig.severeErrors

...so then we wouldn't have anyone writing to SolrConfig.severeErrors anymore.

But should we even bother?

I'm starting to think the whole idea of abortOnConfigurationError is a bad idea 
... especially if no one noticed SOLR-1832 before now.  Maybe we should just 
kill the whole concept, and have SolrCore initialization fail fast and 
propagate any type of Exception up to CoreContainer?



 Fix Solr error reporting to work correctly with multicore
 -

 Key: SOLR-1817
 URL: https://issues.apache.org/jira/browse/SOLR-1817
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.4
Reporter: Mark Miller
Priority: Minor
 Fix For: 1.5

 Attachments: SOLR-1817.patch, SOLR-1817.patch, SOLR-1817.patch


 Here is a rough patch that attempts to fix how error reporting works with 
 multi-core (not in terms of logs, but what you see on an http request).
 The patch is not done - more to consider and havn't worked with how this 
 changes solrconfigs abortOnConfigurationError, but the basics are here.
 If you attempt to access the path of a core that could not load, you are 
 shown the errors that kept the core from properly loading.




[jira] Commented: (SOLR-1817) Fix Solr error reporting to work correctly with multicore

2010-03-18 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12847177#action_12847177
 ] 

Hoss Man commented on SOLR-1817:



Tangential Comment...

If we *do* decide that it's worth keeping abortOnConfigurationError, then my 
earlier suggestion of how it should work was overly complicated...

{quote}

a) SolrCore should itself maintain a list of Severe Initialization Exceptions 
that it was able to get past when initializing itself -- specifically: when a 
plugin could not be initialized, and it therefore is ignoring that plugin 
declaration.

b) SolrCore should expose an easy way of asking it for its list of 
initialization exceptions

c) SolrCore should pay attention to whether its solrconfig.xml file indicates 
if the core should be usable if there were severe initialization exceptions.

d) SolrCore should refuse to execute any requests if (a) contains Exceptions 
and (c) is true

{quote}

There's really no reason for SolrCore to maintain/expose a special list of 
Exceptions and fail to execute if solrconfig says it should.

Instead: SolrCore can maintain a list of Exceptions during its initialization, 
and then if solrconfig.xml says abortOnConfigurationError=true, the last line 
of the SolrCore constructor can check if the list is empty, and throw a 
nice fat SolrException wrapping that List if it's not, which CoreContainer can 
keep track of (just like any solrconfig.xml or schema.xml parse exceptions that 
it might encounter before that.)

(this would change the behavior of new SolrCore when 
abortOnConfigurationError=true for embedded users constructing it themselves -- 
but frankly i think it changes it in a good way)
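That constructor-tail check might look roughly like this (an illustrative sketch with made-up names, not the real SolrCore):

```java
import java.util.ArrayList;
import java.util.List;

public class CoreSketch {
    // Illustrative sketch of the idea above, not the real SolrCore: collect
    // tolerable init exceptions while constructing, then throw one wrapping
    // exception at the very end if abortOnConfigurationError is true.
    final List<Exception> initErrors = new ArrayList<>();

    CoreSketch(boolean abortOnConfigurationError, List<Exception> pluginErrors) {
        initErrors.addAll(pluginErrors);
        // last line of the constructor: fail fast only if configured to
        if (abortOnConfigurationError && !initErrors.isEmpty()) {
            throw new RuntimeException(
                initErrors.size() + " severe configuration error(s) during init");
        }
    }
}
```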





[jira] Resolved: (SOLR-1743) error reporting is rendering 404 missing core name in path for all type of errors

2010-03-16 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-1743.


Resolution: Fixed

Committed revision 923909.

NOTE: since this bug was introduced after 1.4, and since i expect it to get 
superseded by SOLR-1817 prior to the next release, I didn't bother with a 
CHANGES.txt entry

 error reporting is rendering 404 missing core name in path for all type of 
 errors
 ---

 Key: SOLR-1743
 URL: https://issues.apache.org/jira/browse/SOLR-1743
 Project: Solr
  Issue Type: Bug
  Components: Build
 Environment: all
Reporter: Marcin
Assignee: Mark Miller
 Fix For: 1.5

 Attachments: SOLR-1743.patch, SOLR-1743.patch, SOLR-1743.patch, 
 SOLR-1743.patch, SOLR-1743.patch, SOLR-1743.restore14behavior.patch


 despite the error in schema syntax or any other type of error you will always 
 get:
 404 missing core name in path communicate.
 cheers,
 /Marcin




[jira] Commented: (SOLR-1817) Fix Solr error reporting to work correctly with multicore

2010-03-16 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12846061#action_12846061
 ] 

Hoss Man commented on SOLR-1817:


Hey Mark: i've only had a chance to skim your patch so far, and i'm still not 
sure if i have jury duty today, so i don't know if i'll have any time to really 
test it out this afternoon, but here are my quick impressions (mixed with my 
thoughts on how to do this before i saw your patch):

1) fundamentally we have two different kinds of initialization exceptions -- 
the ones SolrCore can deal with and keep going, and the ones that are complete 
show stoppers. Regardless of what the abortOnServerConfError configuration 
looks like, it seems like these exceptions should be tracked separately.  We 
should let SolrCore catch and keep track of any exceptions that it can ignore 
while still providing functionality; but if anything it can't deal with occurs 
it should just throw it and then let the caller (ie: CoreContainer) keep track 
of it.  

That way SolrCore (and the errors it's tracking) are still usable by embedded 
users who may not even be using CoreContainer (i think there's an NPE 
possibility there in your current patch ... if people construct a SolrCore 
without a CoreDescriptor)

2) It looks like you still have SolrDispatchFilter looking at 
SolrConfig.severeErrors.  It seems like the logic there should be something 
like...

{code}
SolrCore core = coreContainer.getCoreByName(corepath);
if (null == core) {
  Throwable err = coreContainer.getCoreInitError(corepath);
  if (null != err) {
    write_init_errors_for_core(corepath, err);
  }
}
if (core.abortOnConfigError() && 0 < core.getSevereErrors().size()) {
  write_init_errors_for_core(corepath, core.getSevereErrors());
}
{code}

3) we should think about how the no-arg behavior of the CoreAdminHandler should 
deal with reporting about cores that couldn't be initialized




[jira] Commented: (SOLR-1817) Fix Solr error reporting to work correctly with multicore

2010-03-16 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12846217#action_12846217
 ] 

Hoss Man commented on SOLR-1817:


Some more comments now that i've read things a little more in depth...

* I should have read your comments more carefully, you already noted the 
remaining usages of SolrConfig.severeErrors
* two of the places you removed adds to SolrConfig.severeErrors are in 
IndexSchema, where exceptions are logged, but not thrown (so your new code 
never sees them).
** Personally i think this is fine, because I don't think those two situations 
really fit the definition of a severe init error as it was designed (ie: a 
plugin like a request handler which might not be used in many situations can't 
be initialized).
** I think errors in IndexSchema init should either be fatal (ie: thrown by the 
constructor and prevent the core from ever working) or just logged as being 
bad news
** FWIW, the two code places i'm talking about are...
*** when a field is declared twice
*** when a dynamicField is declared twice
* why is "admin" suddenly a magic alias for "" in SolrDispatchFilter? (line 196)
* the big comment about servlet container behavior if you throw an error during 
init doesn't make sense where you copied it (in doFilter, line 292)




[jira] Commented: (SOLR-1817) Fix Solr error reporting to work correctly with multicore

2010-03-16 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12846227#action_12846227
 ] 

Hoss Man commented on SOLR-1817:


bq. this is part of the big open issue I think is left here - how to properly 
deal with abortOnServerConfError.

Here's what i think makes the most sense in a multi-core world, and is the most 
in the spirit of what that option was meant to do when it was added for 
single cores.

a) SolrCore should itself maintain a list of Severe Initialization Exceptions 
that it was able to get past when initializing itself -- specifically: when a 
plugin could not be initialized, and it therefore is ignoring that plugin 
declaration.

b) SolrCore should expose an easy way of asking it for its list of 
initialization exceptions

c) SolrCore should pay attention to whether its solrconfig.xml file indicates 
if the core should be usable if there were severe initialization exceptions.

d) SolrCore should refuse to execute any requests if (a) contains Exceptions 
and (c) is true

e) SolrCore should throw any exceptions it can't get past

f) CoreContainer should keep track of which core names completely failed to 
initialize, and what exception was encountered while trying (ie: a 
Map<SolrCore,Throwable> ... no List needed).  This should be the first 
exception involved -- even if it came from trying to instantiate the 
IndexSchema, or parse the solrconfig.xml file before it ever got to the 
SolrCore.  CoreContainer shouldn't know/care about (a) or (c)

g) CoreContainer should provide an easy way to query for (f) by core name

h) If SolrDispatchFilter asks CoreContainer for a corename, and no SolrCore is 
found with that name, it should then use (g) to generate an error message

i) SolrDispatchFilter shouldn't know/care about (a) or (c) ... it should just 
ask SolrCore to execute a request, and SolrCore should fail as needed based on 
its settings (this will potentially allow things like SOLR-141 to work even 
with init errors, as long as the ResponseWriter was initialized successfully)

j) SolrConfig.severeErrors should be deprecated, but for back-compat SolrCore 
and CoreContainer can add to it whenever they add an exception to their own 
internal state.

k) CoreContainer.Initializer.*AbortOnConfigurationError should be deprecated, 
but can still continue to provide the same behavior they do on the trunk (ie: 
influence the default value for each core prior to init, and return true if 
any of the cores have a value of true for that property after init)

l) we could conceivably make solr.xml have its own abortOnConfigError type 
property, but frankly i think if there are *any* errors in solr.xml, that 
should just be a stop-the-world type situation, where 
CoreContainer.Initializer.initialize() just throws a big fat error and 
CoreContainer deals with it ... i can't think of any good that could possibly 
come from letting solr proceed when it encounters an error in solr.xml
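Points (f)-(h) above could be sketched along these lines (illustrative names, keyed here by core name for simplicity rather than by SolrCore instance):

```java
import java.util.HashMap;
import java.util.Map;

public class ContainerSketch {
    // Illustrative sketch of points (f)-(h), not the real CoreContainer:
    // remember the *first* exception per failed core, and expose a lookup so
    // the dispatch filter can render an error page for that core's path.
    private final Map<String, Throwable> initFailures = new HashMap<>();

    void recordInitFailure(String coreName, Throwable t) {
        initFailures.putIfAbsent(coreName, t); // keep the first exception only
    }

    Throwable getCoreInitError(String coreName) {
        return initFailures.get(coreName);
    }
}
```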






[jira] Commented: (SOLR-1817) Fix Solr error reporting to work correctly with multicore

2010-03-16 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12846230#action_12846230
 ] 

Hoss Man commented on SOLR-1817:


Ugh... lots of cross talk, sorry, still processing some of your earlier 
comments...

{quote}
That check is actually just there so that if you ask for solr/admin, you will 
end up getting the "" core - so it makes sense to only allow it in if the 
corename is "admin" anyway.

Though I have never really liked that logic where it looks for the admin core 
and when it can't find it it drops to the "" core.
{quote}

Hmmm... so then this is special behavior needed to make the admin/*.jsp type 
URLs work with the default core?

then why was this check only added as part of your patch for this issue?  how 
do the admin JSPs work on the current trunk?

In either case: why do we need to specifically test for "admin"?  shouldn't 
that code path just fall through regardless of the patch? ie: 
{code}
if (core == null && errors == null) { 
  corename = ""; 
  core = cores.getCore(corename); 
}
{code}

? 





[jira] Commented: (SOLR-1824) partial field types created on error

2010-03-16 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12846235#action_12846235
 ] 

Hoss Man commented on SOLR-1824:


Can someone point me to a method name and/or line number? ... i'm not following 
what exactly the current bug is. 

(particularly with regards to abortOnConfigurationError=false ... nothing in 
IndexSchema has ever looked at that config option, so if it has any problem 
initing a field/fieldtype it should be throwing an exception and completely 
failing to initialize -- so i don't see how the problem could be any 
better/worse depending on the value of that option)




[jira] Commented: (SOLR-1817) Fix Solr error reporting to work correctly with multicore

2010-03-16 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12846237#action_12846237
 ] 

Hoss Man commented on SOLR-1817:


bq. That is the issue where that possible NPE is - getting access to the core 
name.

Just ot be clear: it's not just the core name -- you've got code that assumes a 
SolrCore.getCoreDescriptor() will allways be non null, but that's not allways 
going to be true.




[jira] Commented: (SOLR-1803) ExtractingRequestHandler does not propagate multiple values to a multi-valued field

2010-03-15 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12845691#action_12845691
 ] 

Hoss Man commented on SOLR-1803:


Lance: i agree that the current semantics are either poorly defined or not 
very useful, but your suggestion seems to overlook what are probably the 
two most common cases:
 * to have literal values that overwrite/replace extracted values
 * to have literal values that act as defaults unless extracted values are 
found
...those seem like they should both be possible for single and multivalued 
fields

 ExtractingRequestHandler does not propagate multiple values to a multi-valued 
 field
 ---

 Key: SOLR-1803
 URL: https://issues.apache.org/jira/browse/SOLR-1803
 Project: Solr
  Issue Type: Bug
  Components: contrib - Solr Cell (Tika extraction)
Reporter: Lance Norskog
Priority: Minor
 Attachments: display-extracting-bug.patch


 When multiple values for one field are extracted from a document, only the 
 last value is stored in the document. If one or more values are given as 
 parameters, those values are all stored.




[jira] Updated: (SOLR-1743) error reporting is rendering 404 missing core name in path for all type of errors

2010-03-11 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1743:
---

Attachment: SOLR-1743.restore14behavior.patch

Ok, I've been doing some more testing...

First off: a lot of my early comments on this issue were inaccurate -- in some 
cases I was trying to test the behavior of trunk using a single core example 
with some errors in the solrconfig.xml, but i was using the example/solr dir 
on the trunk, and i completely forgot that it has a solr.xml file in it now.

From what i can tell, the only real difference between the behavior of trunk 
and the behavior of Solr 1.4 is that: in 1.4, when using legacy single core 
mode (ie: no solr.xml), you would get good error messages if a low-level error 
happened that completely prevented the core from loading (ie: schema init 
problem, or xml parsing problem with solrconfig.xml).  This is because the 
default behavior of abortOnConfigurationError was true for legacy single 
core mode, and that boolean drives SolrDispatchFilter's decision about what 
type of error message to display.

The latest attached patch (SOLR-1743.restore14behavior.patch) should get us 
back to the error reporting behavior of Solr 1.4 -- i think we should go ahead 
and commit this to the trunk as a temporary fix for the current bug, while we 
flesh out improvements to the entire concept of abortOnConfigurationError in 
another issue.




[jira] Updated: (SOLR-1815) SolrJ doesn't preserve the order of facet queries returned from solr

2010-03-10 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1815:
---

Description: 
Using Solrj, I wanted to sort the response of a range query based on some 
specific labels. For instance, using the query:

{noformat}
facet=true
facet.query={!key="Less than 100"}[* TO 99]
facet.query={!key="100 - 200"}[100 TO 200]
facet.query={!key="200 +"}[201 TO *]
{noformat}

I wanted to display the response in the following order:

{noformat}
Less than 100 (x)
100 - 200 (y)
201 + (z)
{noformat}

independently of the values of x, y, z, which are the numbers of retrieved 
documents for each range.

While Solr itself correctly produces the desired order (as specified in my 
query), SolrJ doesn't preserve it. 

RE: Yonik, a solution could be just to change
{code}
_facetQuery = new HashMap<String, Integer>();
...to...
_facetQuery = new LinkedHashMap<String, Integer>();
{code}
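The one-line fix works because the two map classes differ only in iteration order. A minimal standalone sketch (not SolrJ code; the class and field names here are illustrative):

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class FacetOrderDemo {
    // Fill a map with facet-query labels in request order; the caller picks
    // the map implementation, which determines iteration order.
    public static Map<String, Integer> facetCounts(Map<String, Integer> map) {
        map.put("Less than 100", 5);
        map.put("100 - 200", 12);
        map.put("201 +", 3);
        return map;
    }

    public static void main(String[] args) {
        // A LinkedHashMap yields keys in insertion order, matching the order
        // of facet.query parameters in the request.
        Map<String, Integer> ordered = facetCounts(new LinkedHashMap<String, Integer>());
        System.out.println(ordered.keySet());
        // A plain HashMap makes no ordering guarantee at all.
        Map<String, Integer> unordered = facetCounts(new HashMap<String, Integer>());
        System.out.println(unordered.keySet());
    }
}
```

Since LinkedHashMap is a drop-in subtype of HashMap, swapping the construction should change no other SolrJ code paths.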
 


  was:
Using Solrj, I wanted to sort the response of a range query based on some 
specific labels. For instance, using the query:

facet=true
facet.query={!key="Less than 100"}[* TO 99]
facet.query={!key="100 - 200"}[100 TO 200]
facet.query={!key="200 +"}[201 TO *]

I wanted to display the response in the following order:

Less than 100 (x)
100 - 200 (y)
201 + (z)

independently of the values of x, y, z, which are the numbers of retrieved 
documents for each range.

While Solr itself correctly produces the desired order (as specified in my 
query), SolrJ doesn't preserve it. 

RE: Yonik, a solution could be just to change

_facetQuery = new HashMap<String, Integer>();
to
_facetQuery = new LinkedHashMap<String, Integer>();

 


 Issue Type: Bug  (was: Improvement)
Summary: SolrJ doesn't preserve the order of facet queries returned 
from solr  (was: Sorting range queries: SolrJ doesn't preserve the order 
produced by Solr)

revising summary to clarify the problem, reclassifying as a bug, reformatting 
description to include noformat & code tags so it doesn't try to render 
emoticons.

 SolrJ doesn't preserve the order of facet queries returned from solr
 

 Key: SOLR-1815
 URL: https://issues.apache.org/jira/browse/SOLR-1815
 Project: Solr
  Issue Type: Bug
  Components: clients - java
Affects Versions: 1.4
Reporter: Steve Radhouani
   Original Estimate: 24h
  Remaining Estimate: 24h

 Using Solrj, I wanted to sort the response of a range query based on some 
 specific labels. For instance, using the query:
 {noformat}
 facet=true
 facet.query={!key="Less than 100"}[* TO 99]
 facet.query={!key="100 - 200"}[100 TO 200]
 facet.query={!key="200 +"}[201 TO *]
 {noformat}
 I wanted to display the response in the following order:
 {noformat}
 Less than 100 (x)
 100 - 200 (y)
 201 + (z)
 {noformat}
 independently of the values of x, y, z, which are the numbers of retrieved 
 documents for each range.
 While Solr itself correctly produces the desired order (as specified in my 
 query), SolrJ doesn't preserve it. 
 RE: Yonik, a solution could be just to change
 {code}
 _facetQuery = new HashMap<String, Integer>();
 ...to...
 _facetQuery = new LinkedHashMap<String, Integer>();
 {code}
  

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (SOLR-1808) When IndexReader.reopen is called, old reader is not properly closed

2010-03-09 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-1808.


Resolution: Not A Problem

As Mark said: IndexReaders are refcounted (regardless of whether they come from 
open or reopen) so that they aren't closed until they are no longer in use.

I'm not seeing any evidence of a bug here; please reopen if you can point to a 
concrete example of where an IndexReader is being leaked.
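The refcounting contract can be illustrated with a tiny standalone sketch (a hypothetical class, not Solr's or Lucene's actual code): the underlying resource is only closed when the last holder releases its reference, so a reopen never yanks a reader out from under a request that is still using it.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RefCountedResource {
    // The creator starts out holding one reference.
    private final AtomicInteger refCount = new AtomicInteger(1);
    private volatile boolean closed = false;

    // Each additional holder takes a reference before using the resource.
    public void incRef() {
        refCount.incrementAndGet();
    }

    // Releasing the last reference actually closes the resource; earlier
    // releases are no-ops apart from the bookkeeping.
    public void decRef() {
        if (refCount.decrementAndGet() == 0) {
            closed = true; // real code would release file handles etc. here
        }
    }

    public boolean isClosed() {
        return closed;
    }
}
```

Under this scheme a searcher wrapping the reader calls incRef() before use and decRef() when done, so "the old reader is not closed" after reopen is by design, not a leak.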

 When IndexReader.reopen is called, old reader is not properly closed
 

 Key: SOLR-1808
 URL: https://issues.apache.org/jira/browse/SOLR-1808
 Project: Solr
  Issue Type: Bug
  Components: search
Affects Versions: 1.4
Reporter: John Wang

 According to Lucene documentation:
 "If the index has not changed since this instance was (re)opened, then this 
 call is a NOOP and returns this instance. Otherwise, a new instance is 
 returned. The old instance is not closed and remains usable."
 In SolrCore.java:
 if (newestSearcher != null && solrConfig.reopenReaders
     && indexDirFile.equals(newIndexDirFile)) {
   IndexReader currentReader = newestSearcher.get().getReader();
   IndexReader newReader = currentReader.reopen();
   if (newReader == currentReader) {
     currentReader.incRef();
   }
   tmp = new SolrIndexSearcher(this, schema, "main", newReader, true, true);
 }
 When currentReader!=newReader, currentReader seems to be leaking.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1807) UpdateHandler plugin is not fully supported

2010-03-09 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12843356#action_12843356
 ] 

Hoss Man commented on SOLR-1807:


Not all plugin APIs are created equal ... some, like TokenizerFactories, are 
designed to be extended by lots of people; others weren't particularly well 
thought out abstractions in the first place, and your mileage may vary when 
implementing them -- feel free to post doc patches/suggestions to help make 
this more clear.

As to the specific problem...

Even if UpdateHandler had been an abstract class, at best we could have added a 
version of {{forceOpenWriter()}} that just threw an 
UnsupportedOperationException -- there's no default impl we could have provided 
that would have worked for any possible UpdateHandler subclass people might 
have written.

The best conceivable solution we probably could have come up with at the time 
would be to introduce a marker interface that UpdateHandlers could optionally 
implement containing the APIs needed to support replication, and make the 
ReplicationHandler test the registered UpdateHandler on startup to see if it 
implements that API, and if not then throw an error.

This type of solution could still be implemented today, in place of the 
instanceof DirectUpdateHandler2 ... particularly now that the code has been 
vetted a little bit by users and we have a pretty good idea of what type of 
functionality an UpdateHandler needs to support in order to play nice with 
ReplicationHandler.
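A minimal sketch of that marker-interface approach (all names here are hypothetical, not actual Solr APIs): the capability is declared by implementing an interface, and the replication side tests for the capability at startup instead of hard-coding a concrete class.

```java
// Hypothetical marker interface carrying the APIs replication needs.
interface ReplicationCapableUpdateHandler {
    void forceOpenWriter();
}

// Stand-in for the abstract UpdateHandler contract.
abstract class UpdateHandlerBase { }

// A handler that opts in to replication support.
class MyUpdateHandler extends UpdateHandlerBase implements ReplicationCapableUpdateHandler {
    public void forceOpenWriter() { /* open the IndexWriter */ }
}

// A handler that never implemented the capability.
class LegacyUpdateHandler extends UpdateHandlerBase { }

public class ReplicationStartupCheck {
    // At startup, the replication component tests the registered handler for
    // the capability rather than "instanceof DirectUpdateHandler2".
    public static boolean supportsReplication(UpdateHandlerBase handler) {
        return handler instanceof ReplicationCapableUpdateHandler;
    }
}
```

If the check fails, the replication component can throw a clear error at startup instead of logging a warning and silently not working.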



 UpdateHandler plugin is not fully supported
 ---

 Key: SOLR-1807
 URL: https://issues.apache.org/jira/browse/SOLR-1807
 Project: Solr
  Issue Type: Bug
  Components: update
Affects Versions: 1.4
Reporter: John Wang

 UpdateHandler is published as a supported Plugin, but code such as the 
 following:
 if (core.getUpdateHandler() instanceof DirectUpdateHandler2) {
   ((DirectUpdateHandler2) core.getUpdateHandler()).forceOpenWriter();
 } else {
   LOG.warn("The update handler being used is not an instance or " +
            "sub-class of DirectUpdateHandler2. " +
            "Replicate on Startup cannot work.");
 }
 suggests that it is really not fully supported.
 Must all implementations of UpdateHandler be subclasses of 
 DirectUpdateHandler2 for it to work with replication?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1743) error reporting is rendering 404 missing core name in path for all type of errors

2010-03-09 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12843373#action_12843373
 ] 

Hoss Man commented on SOLR-1743:


bq. Okay, I sat down and thought about what we should do before really reading 
through your suggestion - and I came up with practically the exact same thing - 
so I think this is what we should attempt.

I know I brought it up here in the issue comments, but I think we should 
probably track this type of change in a separate issue as an Improvement.

For the scope of this issue, let's start by getting a simpler patch committed 
that at least restores the behavior from 1.4 -- with solr.xml you always get 
"missing core name" on config error; without solr.xml you get good error 
messages even if solrconfig.xml can't be parsed.  It won't help new users who 
start with the current example from the trunk (since it has a solr.xml), but 
it will get things back to where they were for existing users who try upgrading.

As I recall, one of the patches already posted does this just fine (I just 
can't remember which one) so that part should be fairly straightforward.

 error reporting is rendering 404 missing core name in path for all type of 
 errors
 ---

 Key: SOLR-1743
 URL: https://issues.apache.org/jira/browse/SOLR-1743
 Project: Solr
  Issue Type: Bug
  Components: Build
 Environment: all
Reporter: Marcin
Assignee: Mark Miller
 Fix For: 1.5

 Attachments: SOLR-1743.patch, SOLR-1743.patch, SOLR-1743.patch, 
 SOLR-1743.patch, SOLR-1743.patch


 despite the error in schema syntax or any other type of error you will always 
 get the "404 missing core name in path" message.
 cheers,
 /Marcin

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (SOLR-1805) Possible perf improvement in UnInvertedField by using synchronizing on CreationPlaceholder

2010-03-04 Thread Hoss Man (JIRA)
Possible perf improvement in UnInvertedField by using synchronizing on 
CreationPlaceholder
-

 Key: SOLR-1805
 URL: https://issues.apache.org/jira/browse/SOLR-1805
 Project: Solr
  Issue Type: Improvement
  Components: search
Reporter: Hoss Man


UnInvertedField.getUnInvertedField could probably see some performance 
improvement in the creation of new UnInvertedField instances if it started 
synchronizing on a CreationPlaceholder object, akin to what FieldCacheImpl 
does...

http://old.nabble.com/Why-synchronized-access-to-FieldValueCache-in-getUninvertedField.java-to27672399.html
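The placeholder pattern being suggested can be sketched in isolation (illustrative names, not FieldCacheImpl's actual code): callers race to install a per-key placeholder and then block only on that entry, so two expensive creations for different keys can proceed concurrently instead of serializing behind one global lock.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.FutureTask;

public class PlaceholderCache<K, V> {
    private final Map<K, FutureTask<V>> cache = new ConcurrentHashMap<>();

    public V get(K key, Callable<V> creator) {
        FutureTask<V> task = cache.get(key);
        if (task == null) {
            FutureTask<V> created = new FutureTask<>(creator);
            // Atomic: exactly one thread installs the placeholder for a key.
            task = cache.putIfAbsent(key, created);
            if (task == null) {
                task = created;
                task.run(); // the winner performs the expensive creation
            }
        }
        try {
            // Losers (and later callers) block only on this key's entry.
            return task.get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

The creator runs at most once per key, and contention is scoped to the key being built rather than the whole cache.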

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1772) UpdateProcessor to prune empty values

2010-03-01 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12839812#action_12839812
 ] 

Hoss Man commented on SOLR-1772:


bq. I'd almost rather see the default behavior changed rather than to put 
another configurable component in the chain that would slow things down 
(slightly) for everyone.

That seems backwards -- if FieldType(s) start checking for the empty string, 
that's a few extra cycles of cost that everyone spends, even if their indexing 
clients are already well behaved and only send real values.

Adding it as an optional UpdateProcessor makes it something that only people 
who need hand-holding have to spend cycles on.

bq. ... confused that the empty string was being indexed at all, for fields 
that aren't even numbers. They thought this was equivalent to not sending it 
any value. I haven't verified this first hand but I believe it.

Nope: there are many use cases for both strings and numbers where you may need 
to skip a value in a multiValued field -- parallel arrays and such ... it's 
actually one of the main situations where IntField still comes in handy 
(besides just supporting completely legacy Lucene indexes).
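The opt-in pruning idea can be sketched standalone (this models a document as a simple map of field name to values; it is not the real UpdateRequestProcessor API):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class EmptyValuePruner {
    // Drop empty-string values from a document before field analysis, so
    // numeric FieldTypes never see "" from a sloppy indexing client.
    public static Map<String, List<String>> prune(Map<String, List<String>> doc) {
        Map<String, List<String>> out = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> field : doc.entrySet()) {
            List<String> kept = new ArrayList<>();
            for (String v : field.getValue()) {
                if (v != null && !v.isEmpty()) {
                    kept.add(v); // keep only real values
                }
            }
            if (!kept.isEmpty()) {
                out.put(field.getKey(), kept); // drop fields with no values left
            }
        }
        return out;
    }
}
```

As the comment above notes, a real processor would need to be opt-in per field (names or globs), since multiValued parallel-array use cases may rely on placeholder values being preserved.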

 UpdateProcessor to prune empty values
 ---

 Key: SOLR-1772
 URL: https://issues.apache.org/jira/browse/SOLR-1772
 Project: Solr
  Issue Type: Wish
Reporter: Hoss Man

 Users frequently seem to get confused when some FieldTypes (typically the 
 numeric ones) complain about invalid field values when they inadvertently 
 index an empty string.
 It would be cool to provide an UpdateProcessor that makes it easy to strip 
 out any fields being added as empty values ... it could be configured using 
 field (and/or field type) names or globs to select/ignore certain fields -- I 
 haven't thought it through all that hard.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-1553) extended dismax query parser

2010-03-01 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1553:
---

Attachment: edismax.unescapedcolon.bug.test.patch

On the train this past weekend i started trying to tackle the issue of making 
support for field based queries (ie: fieldA:valueB) configurable so that it 
could be turned on/off for certain fields (or left off completely for 
back-compat with dismax)

Based on yonik's description of edismax, and my initial reading of the code 
(particularly the use of clause.field and getFieldName in 
ExtendedDismaxQParser) i was under the impression that if a clause consisting 
of FOO:BAR was encountered, and FOO was not a known field, that the clause 
would be treated as a literal, and the colon would be escaped before passing it 
on to ExtendedSolrQueryParser ... essentially that FOO:BAR and FOO\:BAR would 
be equivalent if FOO is not the name of a real field according to the 
IndexSchema.

For reasons I don't fully understand yet, this isn't the case -- as the 
attached test shows, the queries are parsed differently, and (evidently) 
FOO:BAR is parsed as an empty query if FOO is not a real field.

Before I try digging into this too much, I wanted to sanity check:
* is this expected? ... was this done intentionally?
* is this desired? ... is this a logical default behavior to have if the field 
isn't defined? Should we have tests to assert this before I start adding more 
config options to change the behavior?

 extended dismax query parser
 

 Key: SOLR-1553
 URL: https://issues.apache.org/jira/browse/SOLR-1553
 Project: Solr
  Issue Type: New Feature
Reporter: Yonik Seeley
 Fix For: 1.5

 Attachments: edismax.unescapedcolon.bug.test.patch, SOLR-1553.patch, 
 SOLR-1553.pf-refactor.patch


 An improved user-facing query parser based on dismax

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-1553) extended dismax query parser

2010-03-01 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1553:
---

Attachment: edismax.userFields.patch

FWIW: initial steps towards adding a uf param to let users specify what field 
names can be specified explicitly in the query string, with optional default 
boosts to apply to those clauses ... not finished.

 extended dismax query parser
 

 Key: SOLR-1553
 URL: https://issues.apache.org/jira/browse/SOLR-1553
 Project: Solr
  Issue Type: New Feature
Reporter: Yonik Seeley
 Fix For: 1.5

 Attachments: edismax.unescapedcolon.bug.test.patch, 
 edismax.userFields.patch, SOLR-1553.patch, SOLR-1553.pf-refactor.patch


 An improved user-facing query parser based on dismax

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (SOLR-1796) Lucene -dev versions should be in the SNAPSHOT apache maven repo.

2010-03-01 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-1796.


Resolution: Not A Problem

Lucene 2.9.1-dev jars were used by Solr trunk temporarily during the vote on 
Lucene 2.9.2, which is now official, and the jars have been switched.

Any similar future instances would probably need to be dealt with by filing a 
Lucene issue to get the release candidate jars published to the maven 
repository.

 Lucene -dev versions should be in the SNAPSHOT apache maven repo.
 ---

 Key: SOLR-1796
 URL: https://issues.apache.org/jira/browse/SOLR-1796
 Project: Solr
  Issue Type: Task
  Components: Build
Reporter: David Smiley

 Lucene 2.9.1 is out of course and in maven repos but the 2.9.1-dev as found 
 in Solr's source control right now is not.  This is pretty frustrating and I 
 can only expect it will be a recurring problem.  If Solr is going to use 
 lucene -dev versions then I think Solr needs to put them in a repo somewhere. 
  Apache's snapshot repo would make the most sense.
 FYI the repo manager is now managed by Nexus at this URL: 
 https://repository.apache.org/index.html#nexus-search;quick~lucene

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1752) SolrJ fails with exception when passing document ADD and DELETEs in the same request using XML request writer (but not binary request writer)

2010-03-01 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12839909#action_12839909
 ] 

Hoss Man commented on SOLR-1752:


Long term, we could evolve the Solr XML update format to allow both adds and 
deletes (and we probably should), but that seems like a separate issue.

Given the current state of the XML syntax allowed, it does seem like there is a 
bug here, in that SolrJ will attempt to send illegal XML when it gets an 
UpdateRequest that contains both adds and deletes.

At a minimum SolrJ should notice when it's configured to use XML and the 
UpdateRequest contains mixed commands and generate a more specific error 
message before ever attempting to format the commands as XML and send them to a 
server.

It might conceivably make sense to convert the UpdateRequest into multiple 
server calls -- but I haven't thought that through very far, and I'm not sure 
what that would entail (the error handling would probably be a bit tricky).
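The multiple-server-calls idea could be sketched client-side like this (hypothetical types, not SolrJ's API): partition a mixed batch into homogeneous groups so an XML transport that allows only one root command type per request never has to serialize adds and deletes together.

```java
import java.util.ArrayList;
import java.util.List;

public class MixedBatchSplitter {
    // Minimal stand-in for an update command: either an add or a delete.
    public static class Command {
        public final boolean isDelete;
        public final String payload;
        public Command(boolean isDelete, String payload) {
            this.isDelete = isDelete;
            this.payload = payload;
        }
    }

    // Returns [adds, deletes]; each non-empty sub-list would become its own
    // HTTP request to the server.
    public static List<List<Command>> split(List<Command> batch) {
        List<Command> adds = new ArrayList<>();
        List<Command> deletes = new ArrayList<>();
        for (Command c : batch) {
            (c.isDelete ? deletes : adds).add(c);
        }
        List<List<Command>> out = new ArrayList<>();
        out.add(adds);
        out.add(deletes);
        return out;
    }
}
```

Note the caveat from the comment above: splitting loses the original interleaving of commands and complicates error handling, since a failure in the second request leaves the first already applied.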

 SolrJ fails with exception when passing document ADD and DELETEs in the same 
 request using XML request writer (but not binary request writer)
 -

 Key: SOLR-1752
 URL: https://issues.apache.org/jira/browse/SOLR-1752
 Project: Solr
  Issue Type: Bug
  Components: clients - java, update
Affects Versions: 1.4
Reporter: Jayson Minard
Assignee: Shalin Shekhar Mangar
Priority: Blocker

 Add this test to SolrExampleTests.java and it will fail when using the XML 
 Request Writer (now default), but not if you change the SolrExampleJettyTest 
 to use the BinaryRequestWriter.
 {code}
  public void testAddDeleteInSameRequest() throws Exception {
    SolrServer server = getSolrServer();
    SolrInputDocument doc3 = new SolrInputDocument();
    doc3.addField( "id", "id3", 1.0f );
    doc3.addField( "name", "doc3", 1.0f );
    doc3.addField( "price", 10 );
    UpdateRequest up = new UpdateRequest();
    up.add( doc3 );
    up.deleteById("id001");
    up.setWaitFlush(false);
    up.setWaitSearcher(false);
    up.process( server );
  }
 {code}
 terminates with exception:
 {code}
 Feb 3, 2010 8:55:34 AM org.apache.solr.common.SolrException log
 SEVERE: org.apache.solr.common.SolrException: Illegal to have multiple roots 
 (start tag in epilog?).
  at [row,col {unknown-source}]: [1,125]
   at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:72)
   at 
 org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
   at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
   at 
 org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
   at 
 org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
   at 
 org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
   at 
 org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
   at 
 org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
   at org.mortbay.jetty.Server.handle(Server.java:285)
   at 
 org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
   at 
 org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
   at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:723)
   at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
   at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
   at 
 org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
   at 
 org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
 Caused by: com.ctc.wstx.exc.WstxParsingException: Illegal to have multiple 
 roots (start tag in epilog?).
  at [row,col {unknown-source}]: [1,125]
   at 
 com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:630)
   at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:461)
   at 
 com.ctc.wstx.sr.BasicStreamReader.handleExtraRoot(BasicStreamReader.java:2155)
   at 
 com.ctc.wstx.sr.BasicStreamReader.nextFromProlog(BasicStreamReader.java:2070)
   at 
 com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2647)
   at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1019)
   at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:90)
   at 

[jira] Commented: (SOLR-1553) extended dismax query parser

2010-03-01 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12839914#action_12839914
 ] 

Hoss Man commented on SOLR-1553:


bq. What does u in uf stand for? 

"user fields" ... as in field names a user may refer to ... but it's not 
something I thought through too hard; as I said: work in progress.

 extended dismax query parser
 

 Key: SOLR-1553
 URL: https://issues.apache.org/jira/browse/SOLR-1553
 Project: Solr
  Issue Type: New Feature
Reporter: Yonik Seeley
 Fix For: 1.5

 Attachments: edismax.unescapedcolon.bug.test.patch, 
 edismax.userFields.patch, SOLR-1553.patch, SOLR-1553.pf-refactor.patch


 An improved user-facing query parser based on dismax

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1750) SystemStatsRequestHandler - replacement for stats.jsp

2010-03-01 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12839928#action_12839928
 ] 

Hoss Man commented on SOLR-1750:



Committed revision 917812.

I went ahead and committed the most recent attachment under the name 
SystemInfoRequestHandler with slightly generalized javadocs.

Leaving the issue open so we make sure to settle the remaining issues before we 
release...

 * decide if we want to change the name
 * add default registration as part of the AdminRequestHandler (ie: 
/admin/info ?)
 * add some docs (didn't want to make a wiki page until we're certain of the 
name)
 * decide if we want to modify the response structure (should all of the top 
level info be encapsulated in a container?)


 SystemStatsRequestHandler - replacement for stats.jsp
 -

 Key: SOLR-1750
 URL: https://issues.apache.org/jira/browse/SOLR-1750
 Project: Solr
  Issue Type: Improvement
  Components: web gui
Reporter: Erik Hatcher
Assignee: Erik Hatcher
Priority: Trivial
 Fix For: 1.5

 Attachments: SystemStatsRequestHandler.java, 
 SystemStatsRequestHandler.java, SystemStatsRequestHandler.java


 stats.jsp is cool and all, but suffers from escaping issues, and also is not 
 accessible from SolrJ or other standard Solr APIs.
 Here's a request handler that emits everything stats.jsp does.
 For now, it needs to be registered in solrconfig.xml like this:
 {code}
 <requestHandler name="/admin/stats" 
                 class="solr.SystemStatsRequestHandler" />
 {code}
 But will register this in AdminHandlers automatically before committing.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-1750) SystemInfoRequestHandler - replacement for stats.jsp and registry.jsp

2010-03-01 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1750:
---

Summary: SystemInfoRequestHandler - replacement for stats.jsp and 
registry.jsp  (was: SystemStatsRequestHandler - replacement for stats.jsp)

 SystemInfoRequestHandler - replacement for stats.jsp and registry.jsp
 -

 Key: SOLR-1750
 URL: https://issues.apache.org/jira/browse/SOLR-1750
 Project: Solr
  Issue Type: Improvement
  Components: web gui
Reporter: Erik Hatcher
Assignee: Erik Hatcher
Priority: Trivial
 Fix For: 1.5

 Attachments: SystemStatsRequestHandler.java, 
 SystemStatsRequestHandler.java, SystemStatsRequestHandler.java


 stats.jsp is cool and all, but suffers from escaping issues, and also is not 
 accessible from SolrJ or other standard Solr APIs.
 Here's a request handler that emits everything stats.jsp does.
 For now, it needs to be registered in solrconfig.xml like this:
 {code}
 <requestHandler name="/admin/stats" 
                 class="solr.SystemStatsRequestHandler" />
 {code}
 But will register this in AdminHandlers automatically before committing.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1797) ConcurrentModificationException

2010-03-01 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12839943#action_12839943
 ] 

Hoss Man commented on SOLR-1797:


NOTE: Initial thread where Yonik had some more comments about how/where the 
concurrent modification can come from...

http://old.nabble.com/ConcurrentModificationException-to27722422.html


 ConcurrentModificationException
 ---

 Key: SOLR-1797
 URL: https://issues.apache.org/jira/browse/SOLR-1797
 Project: Solr
  Issue Type: Bug
  Components: Build
Affects Versions: 1.4, 1.5
 Environment: Centos 5, Tomcat 6
Reporter: Dan Hertz
Priority: Blocker

 SOLR 1.4 (final) and 1.5 nightly work fine on a Windows box, but on our 
 Centos 5 box, we're getting a ConcurrentModificationException when starting 
 Tomcat 6.
 Yonik Seeley asked me to start a JIRA bug report, mentioning that "It looks 
 like resourceLoader.newInstance() is fundamentally not thread safe." 
 (SOLR-USER)
 = = =  Log Below: = = =
 INFO   | jvm 1| 2010/02/24 21:27:04 | SEVERE: 
 java.util.ConcurrentModificationException
 INFO   | jvm 1| 2010/02/24 21:27:04 | at 
 java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372)
 INFO   | jvm 1| 2010/02/24 21:27:04 | at 
 java.util.AbstractList$Itr.next(AbstractList.java:343)
 INFO   | jvm 1| 2010/02/24 21:27:04 | at 
 org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:507)
 INFO   | jvm 1| 2010/02/24 21:27:04 | at 
 org.apache.solr.core.SolrCore.init(SolrCore.java:606)
 INFO   | jvm 1| 2010/02/24 21:27:04 | at 
 org.apache.solr.core.CoreContainer.create(CoreContainer.java:429)
 INFO   | jvm 1| 2010/02/24 21:27:04 | at 
 org.apache.solr.core.CoreContainer.load(CoreContainer.java:285)
 INFO   | jvm 1| 2010/02/24 21:27:04 | at 
 org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:117)
 INFO   | jvm 1| 2010/02/24 21:27:04 | at 
 org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:86)
 INFO   | jvm 1| 2010/02/24 21:27:04 | at 
 org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:275)
 INFO   | jvm 1| 2010/02/24 21:27:04 | at 
 org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:397)
 INFO   | jvm 1| 2010/02/24 21:27:04 | at 
 org.apache.catalina.core.ApplicationFilterConfig.init(ApplicationFilterConfig.java:108)
 INFO   | jvm 1| 2010/02/24 21:27:04 | at 
 org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3800)
 INFO   | jvm 1| 2010/02/24 21:27:04 | at 
 org.apache.catalina.core.StandardContext.start(StandardContext.java:4450)
 INFO   | jvm 1| 2010/02/24 21:27:04 | at 
 org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:791)
 INFO   | jvm 1| 2010/02/24 21:27:04 | at 
 org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:771)
 INFO   | jvm 1| 2010/02/24 21:27:04 | at 
 org.apache.catalina.core.StandardHost.addChild(StandardHost.java:526)
 INFO   | jvm 1| 2010/02/24 21:27:04 | at 
 org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:630)
 INFO   | jvm 1| 2010/02/24 21:27:04 | at 
 org.apache.catalina.startup.HostConfig.deployDescriptors(HostConfig.java:556)
 INFO   | jvm 1| 2010/02/24 21:27:04 | at 
 org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:491)
 INFO   | jvm 1| 2010/02/24 21:27:04 | at 
 org.apache.catalina.startup.HostConfig.start(HostConfig.java:1206)
 INFO   | jvm 1| 2010/02/24 21:27:04 | at 
 org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:314)
 INFO   | jvm 1| 2010/02/24 21:27:04 | at 
 org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:119)
 INFO   | jvm 1| 2010/02/24 21:27:04 | at 
 org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1053)
 INFO   | jvm 1| 2010/02/24 21:27:04 | at 
 org.apache.catalina.core.StandardHost.start(StandardHost.java:722)
 INFO   | jvm 1| 2010/02/24 21:27:04 | at 
 org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1045)
 INFO   | jvm 1| 2010/02/24 21:27:04 | at 
 org.apache.catalina.core.StandardEngine.start(StandardEngine.java:443)
 INFO   | jvm 1| 2010/02/24 21:27:04 | at 
 org.apache.catalina.core.StandardService.start(StandardService.java:516)
 INFO   | jvm 1| 2010/02/24 21:27:04 | at 
 org.apache.catalina.core.StandardServer.start(StandardServer.java:710)
 INFO   | jvm 1| 2010/02/24 21:27:04 | at 
 org.apache.catalina.startup.Catalina.start(Catalina.java:583)
 INFO   | jvm 1| 2010/02/24 21:27:04 | at 
 

[jira] Commented: (SOLR-1743) error reporting is rendering 404 missing core name in path for all type of errors

2010-03-01 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12839948#action_12839948
 ] 

Hoss Man commented on SOLR-1743:


Mark: I'm confused by your comments/patch.

I applied your patch along with the schema.xml typo patch I posted above to 
Solr trunk (r917814) and still got "missing core name in path" when hitting 
http://localhost:8983/solr/admin/

I thought that since example/solr/conf/solrconfig.xml uses 
{{<abortOnConfigurationError>${solr.abortOnConfigurationError:true}</abortOnConfigurationError>}}
 it would fall into the situation you described as being fixed?

(Did you maybe attach a different version of the patch than you meant to?)


 error reporting is rendering 404 missing core name in path for all type of 
 errors
 ---

 Key: SOLR-1743
 URL: https://issues.apache.org/jira/browse/SOLR-1743
 Project: Solr
  Issue Type: Bug
  Components: Build
 Environment: all
Reporter: Marcin
Assignee: Mark Miller
 Fix For: 1.5

 Attachments: SOLR-1743.patch


 despite the error in schema syntax or any other type of error you will always 
 get the "404 missing core name in path" message.
 cheers,
 /Marcin

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-1802) Make Solr work with IndexReaderFactory implementations that return MultiReader

2010-03-01 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1802:
---

Description: 
When an IndexReaderFactory returns an instance of MultiReader, various places 
in Solr try to call reader.directory() and reader.getVersion(), which results 
in an UnsupportedOperationException.



  was:
When an IndexReaderFactory returns an instance of MultiReader, Solr tries to 
call reader.directory() and reader.getVersion(), which results in an 
UnsupportedOperationException.

Custom IndexReaderFactory implementations that return MultiReader instances are 
common, and I don't think there is any documentation that discourages this.


 Issue Type: Improvement  (was: Bug)
Summary: Make Solr work with IndexReaderFactory implementations that 
return MultiReader  (was: Solr is not friend to IndexReaderFactory 
implementations that return MultiReader)

editing issue summary to reflect that this is an improvement, not a bug.

It was noted when IndexReaderFactory was added that using custom factories was 
incompatible with a lot of Solr features precisely because of the assumption 
about reader.directory()...

CHANGES.txt when the API was introduced...
{noformat}
59. SOLR-243: Add configurable IndexReaderFactory so that alternate IndexReader 
implementations 
can be specified via solrconfig.xml. Note that using a custom IndexReader 
may be incompatible
with ReplicationHandler (see comments in SOLR-1366). This should be treated 
as an experimental feature.
(Andrzej Bialecki, hossman, Mark Miller, John Wang)

{noformat}


example solrconfig.xml (the only place the feature is advertised)...
{code}
  <!-- Use the following format to specify a custom IndexReaderFactory - allows 
   for alternate IndexReader implementations.

   ** Experimental Feature **
   Please note - Using a custom IndexReaderFactory may prevent certain other 
   features from working. The API to IndexReaderFactory may change without 
   warning or may even be removed from future releases if the problems cannot 
   be resolved.

   ** Features that may not work with custom IndexReaderFactory **
   The ReplicationHandler assumes a disk-resident index. Using a custom
   IndexReader implementation may cause incompatibility with ReplicationHandler 
   and may cause replication to not work correctly. See SOLR-1366 for details.
  -->
  <indexReaderFactory name="IndexReaderFactory" class="package.class">
    Parameters as required by the implementation
  </indexReaderFactory>
{code}
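To illustrate the incompatibility being discussed, here is a minimal stand-in model (plain Java; these are deliberately not the real Lucene IndexReader/MultiReader classes): a composite reader spanning several sub-readers has no single directory() to report, so it throws, mirroring the behavior described above.

```java
// Minimal model of the incompatibility, using stand-in classes rather
// than the real Lucene IndexReader/MultiReader.
public class ReaderModel {
    interface Reader {
        String directory(); // single-segment readers can answer this
    }

    static class DirectoryReaderImpl implements Reader {
        public String directory() { return "/path/to/index"; }
    }

    // A composite over several sub-readers has no single directory,
    // so it throws -- mirroring MultiReader.directory() in Lucene.
    static class MultiReaderImpl implements Reader {
        public String directory() {
            throw new UnsupportedOperationException(
                "composite reader spans multiple directories");
        }
    }
}
```

Any call site that assumes every Reader can answer directory() (as ReplicationHandler does) will blow up on the composite variant.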




 Make Solr work with IndexReaderFactory implementations that return MultiReader
 --

 Key: SOLR-1802
 URL: https://issues.apache.org/jira/browse/SOLR-1802
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 1.4
Reporter: John Wang

 When an IndexReaderFactory returns an instance of MultiReader, various places 
 in Solr try to call reader.directory() and reader.getVersion(), which results 
 in an UnsupportedOperationException.




[jira] Commented: (SOLR-1743) error reporting is rendering 404 missing core name in path for all types of errors

2010-03-01 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12840014#action_12840014
 ] 

Hoss Man commented on SOLR-1743:


No worries dude .. i don't really even understand how it worked before, let 
alone with your patch.

Your latest version deals with my typo in the schema.xml example, but when 
testing out some other use cases it looks like the default assumption was that 
abortOnConfigurationError=true unless the solrconfig.xml can be parsed cleanly 
and sets it to false ... which means that in 1.4 a single-core malformed 
solrconfig.xml (ie: garbage in the prolog) would generate a good error message 
-- and with your latest patch it still generates the "missing core name" error.

It seems like in order to preserve that behavior we need to use a three-valued 
state for CoreContainer.abortOnConfigurationError ... null assumes true until 
at least one solrconfig.xml is parsed cleanly, then false unless at least one 
config sets it to true.

I'm also wondering if your patch breaks the purpose of 
CoreContainer.Initializer.setAbortOnConfigurationError ... i think the idea 
there was that prior to initializing the CoreContainer, Embedded Solr users 
could call that method to force abortOnConfigurationError even if it wasn't set 
in any of the solrconfig.xml files.
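The three-valued state described above could be modeled roughly like this (a hypothetical helper; the class and method names here are made up, not actual Solr code): null until the first cleanly parsed solrconfig.xml, and thereafter true only if some config asked for it.

```java
// Hypothetical sketch of the three-valued abortOnConfigurationError
// state described above -- not actual Solr code.
public class AbortState {
    // null  = no solrconfig.xml parsed cleanly yet (assume abort=true),
    // FALSE = at least one config parsed, none requested abort,
    // TRUE  = at least one config explicitly requested abort.
    private Boolean abortOnConfigError = null;

    /** Called after a solrconfig.xml parses cleanly. */
    public void configParsed(boolean abortRequested) {
        if (abortRequested) {
            abortOnConfigError = Boolean.TRUE;
        } else if (abortOnConfigError == null) {
            abortOnConfigError = Boolean.FALSE;
        }
    }

    /** Effective value: assume true until a config parses cleanly. */
    public boolean shouldAbort() {
        return abortOnConfigError == null || abortOnConfigError;
    }
}
```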


 error reporting is rendering 404 missing core name in path for all types of 
 errors
 ---

 Key: SOLR-1743
 URL: https://issues.apache.org/jira/browse/SOLR-1743
 Project: Solr
  Issue Type: Bug
  Components: Build
 Environment: all
Reporter: Marcin
Assignee: Mark Miller
 Fix For: 1.5

 Attachments: SOLR-1743.patch, SOLR-1743.patch, SOLR-1743.patch


 despite the error in schema syntax or any other type of error you will always 
 get:
 404 "missing core name in path" message.
 cheers,
 /Marcin




[jira] Updated: (SOLR-1743) error reporting is rendering 404 missing core name in path for all types of errors

2010-03-01 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1743:
---

Attachment: SOLR-1743.patch

Ok, here's my attempt at making sense of this.

As far as i can tell this restores all of the useful behavior that Solr 1.4 had 
with abortOnConfigurationError in single core mode ... some quick multicore 
testing makes me think it's improved the error reporting in some situations 
there as well, but i'm sure i haven't tried all of the edge cases -- it may 
have broken something.


 error reporting is rendering 404 missing core name in path for all types of 
 errors
 ---

 Key: SOLR-1743
 URL: https://issues.apache.org/jira/browse/SOLR-1743
 Project: Solr
  Issue Type: Bug
  Components: Build
 Environment: all
Reporter: Marcin
Assignee: Mark Miller
 Fix For: 1.5

 Attachments: SOLR-1743.patch, SOLR-1743.patch, SOLR-1743.patch, 
 SOLR-1743.patch


 despite the error in schema syntax or any other type of error you will always 
 get:
 404 "missing core name in path" message.
 cheers,
 /Marcin




[jira] Commented: (SOLR-1743) error reporting is rendering 404 missing core name in path for all types of errors

2010-03-01 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12840021#action_12840021
 ] 

Hoss Man commented on SOLR-1743:


G... ignore that last patch, it changes the default behavior to be like 
abortOnConfigurationError=true for multicores even if no core ever asked for it 
... which would be bad (in 1.4 those cores will all still load, but with this 
patch they won't)

Still thinking about it

 error reporting is rendering 404 missing core name in path for all types of 
 errors
 ---

 Key: SOLR-1743
 URL: https://issues.apache.org/jira/browse/SOLR-1743
 Project: Solr
  Issue Type: Bug
  Components: Build
 Environment: all
Reporter: Marcin
Assignee: Mark Miller
 Fix For: 1.5

 Attachments: SOLR-1743.patch, SOLR-1743.patch, SOLR-1743.patch, 
 SOLR-1743.patch


 despite the error in schema syntax or any other type of error you will always 
 get:
 404 "missing core name in path" message.
 cheers,
 /Marcin




[jira] Updated: (SOLR-1743) error reporting is rendering 404 missing core name in path for all types of errors

2010-03-01 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1743:
---

Attachment: SOLR-1743.patch

I think i give up.

First off: sorry mark, this comment was way off base...

bq. I'm also wondering if your patch breaks the purpose of 
CoreContainer.Initializer.setAbortOnConfigurationError 

...digging through the history i realized that this is how Initializer has 
always worked: you can set the default behavior for legacy single core mode, 
but whenever it sees a solr.xml file it overwrites that default value with 
false

This is fundamentally what's bitch slapping me at the moment ... the attached 
patch tries to mimic the historical behavior, and i think i saw it work (but 
i'm kinda cross-eyed right now so i can honestly say you shouldn't take my 
word for it -- i wouldn't) but it doesn't really address the fact that since 
the example now contains a solr.xml, anybody who starts with the Solr 1.5 
example and makes a typo in their solrconfig.xml so that it's not well formed 
won't get a useful error message in the browser like they would in Solr 1.4



 error reporting is rendering 404 missing core name in path for all types of 
 errors
 ---

 Key: SOLR-1743
 URL: https://issues.apache.org/jira/browse/SOLR-1743
 Project: Solr
  Issue Type: Bug
  Components: Build
 Environment: all
Reporter: Marcin
Assignee: Mark Miller
 Fix For: 1.5

 Attachments: SOLR-1743.patch, SOLR-1743.patch, SOLR-1743.patch, 
 SOLR-1743.patch, SOLR-1743.patch


 despite the error in schema syntax or any other type of error you will always 
 get:
 404 "missing core name in path" message.
 cheers,
 /Marcin




[jira] Commented: (SOLR-1743) error reporting is rendering 404 missing core name in path for all types of errors

2010-03-01 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12840045#action_12840045
 ] 

Hoss Man commented on SOLR-1743:


Okay now i'm just going to rant...

abortOnConfigurationError feels like it's just devolved into nonsense at this 
point .. the original purpose was to let people configure whether they wanted 
solr to try to keep running even if something like a request handler couldn't 
be loaded -- set it to true and solr wouldn't start up and the admin screen 
would tell you why, set it to false and solr would work, but requests for that 
request handler would fail.

once we added multicore support, the usage of abortOnConfigurationError just 
stopped making sense ... if your solr.xml refers to just core1, and core1's 
solrconfig.xml sets it to false and has a request handler that can't be 
loaded, things keep working -- but if you also have a core2 whose 
solrconfig.xml sets it to true then the whole server won't start up ... 
that's just silly.

Maybe it's just time to rethink the whole damn thing...
* deprecate the SolrConfig.SEVERE_ERRORS singleton - make SolrCore start 
keeping a personal list of exceptions it was able to get past (ie: a plugin it 
couldn't load)
* Eliminate Initializer.isAbortOnConfigurationError - instead make each 
SolrCore keep track of that itself
* if initializing a core throws an exception (either from parsing the config, 
or from instantiating the SolrCore or IndexSchema) CoreContainer should keep 
track of that exception as being specific to that core name 
(Map<String,Exception>)
** removing a core, or recreating a core with the same name, should clear any 
corresponding entry from this map
* when SolrDispatchFilter processes a path, it should generate a useful error 
message in either case:
** CoreContainer says it has an init exception for the core name that 
corresponds to that path
** the SolrCore exists; has isAbortOnConfigurationError()=true; and has a 
non-empty list of exceptions


...thoughts?
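The bookkeeping proposed in the bullets above might look roughly like this (a sketch only; names such as registerInitFailure are made up here, not part of any real patch):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of per-core init-failure tracking, following the
// proposal above; method names here are invented for illustration.
public class CoreInitTracker {
    private final Map<String, Exception> initFailures = new HashMap<>();

    /** Record an exception thrown while initializing a named core. */
    public void registerInitFailure(String coreName, Exception e) {
        initFailures.put(coreName, e);
    }

    /** Removing or recreating a core clears any recorded failure. */
    public void coreRegistered(String coreName) {
        initFailures.remove(coreName);
    }

    /** The dispatch filter would consult this to build an error page. */
    public Exception initFailureFor(String coreName) {
        return initFailures.get(coreName);
    }
}
```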

 error reporting is rendering 404 missing core name in path for all types of 
 errors
 ---

 Key: SOLR-1743
 URL: https://issues.apache.org/jira/browse/SOLR-1743
 Project: Solr
  Issue Type: Bug
  Components: Build
 Environment: all
Reporter: Marcin
Assignee: Mark Miller
 Fix For: 1.5

 Attachments: SOLR-1743.patch, SOLR-1743.patch, SOLR-1743.patch, 
 SOLR-1743.patch, SOLR-1743.patch


 despite the error in schema syntax or any other type of error you will always 
 get:
 404 "missing core name in path" message.
 cheers,
 /Marcin




[jira] Created: (SOLR-1792) Document peculiar behavior of TestHarness.LocalRequestFactory

2010-02-23 Thread Hoss Man (JIRA)
Document peculiar behavior of TestHarness.LocalRequestFactory
-

 Key: SOLR-1792
 URL: https://issues.apache.org/jira/browse/SOLR-1792
 Project: Solr
  Issue Type: Improvement
Affects Versions: 1.4, 1.3, 1.2, 1.1.0
Reporter: Hoss Man
Assignee: Hoss Man
 Fix For: 1.5


While working on a test case, i realized that due to method evolution, 
TestHarness.LocalRequestFactory.makeRequest has some really odd behavior that 
results in the defaults the factory was configured with being ignored when 
the method is called with multiple varargs.

I spent some time attempting to fix this by adding the defaults to the end of 
the params, but then discovered that this breaks existing tests because the LRF 
defaults take precedence over defaults that may be hardcoded into the 
solrconfig.xml.  The internal test might be changed to work around this, but i 
didn't want to risk breaking tests for users who might be using TestHarness 
directly.

So this bug is just to track improving the documentation of what exactly 
LRF.makeRequest does with its input.
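The precedence trap can be illustrated with a toy merge (hypothetical, not the actual TestHarness code): if key/value params are merged last-value-wins, appending the factory defaults after caller-supplied args silently overrides what the caller asked for.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy illustration of the precedence trap described above (hypothetical,
// not the actual TestHarness code): merging by "last value wins" means
// defaults appended after caller args override the caller's values.
public class ParamMerge {
    /** Merge key/value pairs; a later pair overrides an earlier one. */
    public static Map<String, String> merge(String... kv) {
        Map<String, String> out = new LinkedHashMap<>();
        for (int i = 0; i + 1 < kv.length; i += 2) {
            out.put(kv[i], kv[i + 1]);
        }
        return out;
    }
}
```

This is why simply appending the LRF defaults to the end of the params broke existing tests.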




[jira] Updated: (SOLR-1792) Document peculiar behavior of TestHarness.LocalRequestFactory

2010-02-23 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1792:
---

Attachment: SOLR-1792.patch

patch ... i would have already committed this but SVN seems to be down.

 Document peculiar behavior of TestHarness.LocalRequestFactory
 -

 Key: SOLR-1792
 URL: https://issues.apache.org/jira/browse/SOLR-1792
 Project: Solr
  Issue Type: Improvement
Affects Versions: 1.1.0, 1.2, 1.3, 1.4
Reporter: Hoss Man
Assignee: Hoss Man
 Fix For: 1.5

 Attachments: SOLR-1792.patch


 While working on a test case, i realized that due to method evolution, 
 TestHarness.LocalRequestFactory.makeRequest has some really odd behavior that 
 results in the defaults the factory was configured with being ignored when 
 the method is called with multiple varargs.
 I spent some time attempting to fix this by adding the defaults to the end 
 of the params, but then discovered that this breaks existing tests because 
 the LRF defaults take precedence over defaults that may be hardcoded into the 
 solrconfig.xml.  The internal test might be changed to work around this, but 
 i didn't want to risk breaking tests for users who might be using TestHarness 
 directly.
 So this bug is just to track improving the documentation of what exactly 
 LRF.makeRequest does with its input




[jira] Resolved: (SOLR-1792) Document peculiar behavior of TestHarness.LocalRequestFactory

2010-02-23 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-1792.


Resolution: Fixed

Committed revision 915637.

 Document peculiar behavior of TestHarness.LocalRequestFactory
 -

 Key: SOLR-1792
 URL: https://issues.apache.org/jira/browse/SOLR-1792
 Project: Solr
  Issue Type: Improvement
Affects Versions: 1.1.0, 1.2, 1.3, 1.4
Reporter: Hoss Man
Assignee: Hoss Man
 Fix For: 1.5

 Attachments: SOLR-1792.patch


 While working on a test case, i realized that due to method evolution, 
 TestHarness.LocalRequestFactory.makeRequest has some really odd behavior that 
 results in the defaults the factory was configured with being ignored when 
 the method is called with multiple varargs.
 I spent some time attempting to fix this by adding the defaults to the end 
 of the params, but then discovered that this breaks existing tests because 
 the LRF defaults take precedence over defaults that may be hardcoded into the 
 solrconfig.xml.  The internal test might be changed to work around this, but 
 i didn't want to risk breaking tests for users who might be using TestHarness 
 directly.
 So this bug is just to track improving the documentation of what exactly 
 LRF.makeRequest does with its input




[jira] Resolved: (SOLR-1776) dismax should treat schema's default field as a default qf

2010-02-23 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-1776.


   Resolution: Fixed
Fix Version/s: 1.5
 Assignee: Hoss Man

Committed revision 915646.

 dismax should treat schema's default field as a default qf
 ---

 Key: SOLR-1776
 URL: https://issues.apache.org/jira/browse/SOLR-1776
 Project: Solr
  Issue Type: Improvement
Reporter: Hoss Man
Assignee: Hoss Man
 Fix For: 1.5


 the DismaxQParser is completely useless w/o specifying the qf param, but 
 for the life of me i can't think of any good reason why it shouldn't use 
 IndexSchema.getDefaultSearchFieldName() as the default value for the qf param.




[jira] Updated: (SOLR-1776) dismax and edismax should treat schema's default field as a default qf

2010-02-23 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1776:
---

Description: the DismaxQParser (and ExtendedDismaxQParser) is completely 
useless w/o specifying the qf param, but for the life of me i can't think of 
any good reason why it shouldn't use IndexSchema.getDefaultSearchFieldName() as 
the default value for the qf param.  (was: the DismaxQParser is completely 
useless w/o specifying the qf param, but for the life of me i can't think of 
any good reason why it shouldn't use IndexSchema.getDefaultSearchFieldName() as 
the default value for the qf param.)
Summary: dismax and edismax should treat schema's default field as a 
default qf  (was: dismax should treat schema's default field as a default qf)
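The proposed fallback amounts to a one-line default (a sketch only; resolveQf is a made-up name, not the DismaxQParser API):

```java
// Sketch of the fallback being proposed above: use the schema's default
// search field when no qf param is given. Hypothetical helper, not the
// actual DismaxQParser code.
public class QfFallback {
    public static String resolveQf(String qfParam, String schemaDefaultField) {
        if (qfParam != null && !qfParam.trim().isEmpty()) {
            return qfParam; // explicit qf always wins
        }
        return schemaDefaultField; // may still be null if schema has none
    }
}
```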

 dismax and edismax should treat schema's default field as a default qf
 ---

 Key: SOLR-1776
 URL: https://issues.apache.org/jira/browse/SOLR-1776
 Project: Solr
  Issue Type: Improvement
Reporter: Hoss Man
Assignee: Hoss Man
 Fix For: 1.5


 the DismaxQParser (and ExtendedDismaxQParser) is completely useless w/o 
 specifying the qf param, but for the life of me i can't think of any good 
 reason why it shouldn't use IndexSchema.getDefaultSearchFieldName() as the 
 default value for the qf param.




[jira] Commented: (SOLR-1687) add param for limiting start and rows params

2010-02-22 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12836930#action_12836930
 ] 

Hoss Man commented on SOLR-1687:


bq. Traditionally, this type of stuff is delegated to the front-end clients to 
restrict.

True, but my suggestion wasn't so much along the lines of end users entering 
really big numbers as that client developers might make mistakes, and this 
would allow a solr admin to lock things down in a sane way.

bq. Would it make more sense to add an optional component to check 
restrictions? The restrictions could optionally be in the config for the 
component and thus wouldn't have to be looked up and parsed for every request.

I like this idea, but given the way local versions of start/rows are treated 
specially, wouldn't we still need special handling like what i added in the 
patch to deal with them?  (A generic component added to the front of the list 
could validate a list of global params, but it wouldn't have any way of knowing 
for certain what other params later components might parse with a QParser.)
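Whatever form the component takes, the core of the lock-down is a simple clamp against admin-configured upper bounds (a sketch; the "-1 means no limit" convention is an assumption here, not from the patch):

```java
// Sketch of clamping start/rows against admin-configured upper bounds,
// in the spirit of the limits discussed above. The -1 = "no limit"
// convention is an assumption for this illustration.
public class PageLimits {
    public static int clamp(int requested, int upperBound) {
        if (upperBound < 0) {
            return requested; // no limit configured
        }
        return Math.min(requested, upperBound);
    }
}
```

An admin could then set the bounds as invariant params so client mistakes can't request absurd pages.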

 add param for limiting start and rows params
 

 Key: SOLR-1687
 URL: https://issues.apache.org/jira/browse/SOLR-1687
 Project: Solr
  Issue Type: Improvement
Reporter: Hoss Man
 Attachments: SOLR-1687.patch


 conventional wisdom is that it doesn't make sense to paginate with huge 
 pages, or to drill down deep into high numbered pages -- features like 
 faceting tend to be a better UI experience, and less intensive on solr.
 At the moment, Solr administrators can use invariant params to hardcode the 
 rows param to something reasonable, but unless they only want to allow 
 users to look at page one, they can't do much to lock down the start param 
 except to enforce these rules in the client code.
 we should add new params that set an upper bound on both of these, which can 
 then be specified as default/invariant params in solrconfig.xml




[jira] Commented: (SOLR-1772) UpdateProcessor to prune empty values

2010-02-22 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12836933#action_12836933
 ] 

Hoss Man commented on SOLR-1772:


Actually my point is that the new FieldTypes make it  *more* of an issue (in 
the eyes of end users) because now Solr errors out on empty (numeric) field 
values ... having an UpdateProcessor like this would be an easy solution for 
people who just want a simple way to tell Solr to ignore empty fields (with 
certain names, or certain types)
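The pruning itself could be as simple as this sketch (plain Java over a field-to-values map, not wired into the actual UpdateRequestProcessor API; field/type selection rules are left out):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch of the pruning idea above: drop empty string values from a
// field -> values map before indexing. Not the actual
// UpdateRequestProcessor API.
public class EmptyValuePruner {
    public static Map<String, List<String>> prune(Map<String, List<String>> doc) {
        Map<String, List<String>> out = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : doc.entrySet()) {
            List<String> kept = new ArrayList<>();
            for (String v : e.getValue()) {
                if (v != null && !v.trim().isEmpty()) {
                    kept.add(v);
                }
            }
            if (!kept.isEmpty()) {
                out.put(e.getKey(), kept); // drop fields left empty
            }
        }
        return out;
    }
}
```

With something like this in the update chain, an empty string sent to a numeric field would simply vanish instead of triggering an error.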

 UpdateProcessor to prune empty values
 ---

 Key: SOLR-1772
 URL: https://issues.apache.org/jira/browse/SOLR-1772
 Project: Solr
  Issue Type: Wish
Reporter: Hoss Man

 Users seem to frequently get confused when some FieldTypes (typically the 
 numeric ones) complain about invalid field values when they inadvertently 
 index an empty string.
 It would be cool to provide an UpdateProcessor that makes it easy to strip 
 out any fields being added as empty values ... it could be configured using 
 field (and/or field type) names or globs to select/ignore certain fields -- i 
 haven't thought it through all that hard




[jira] Updated: (SOLR-1786) Solr (trunk rev. 912116) suffers from PDFBOX-537 [Endless loop in org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionary()] fixed in PDFbox 1.0?

2010-02-22 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1786:
---

Fix Version/s: 1.5

marking Fix for 1.5 -- we shouldn't release w/o either moving forward or 
rolling back the version we use.

(FYI: our PDFBox dependency is based on the tika dependency)

 Solr (trunk rev. 912116) suffers from PDFBOX-537 [Endless loop in 
 org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionary()]  fixed in PDFbox 
 1.0?
 

 Key: SOLR-1786
 URL: https://issues.apache.org/jira/browse/SOLR-1786
 Project: Solr
  Issue Type: Bug
  Components: contrib - Solr Cell (Tika extraction)
Affects Versions: 1.5
 Environment: Ubuntu 9.10, 32bit
Reporter: Jan Iwaszkiewicz
Priority: Critical
 Fix For: 1.5


 I tried indexing several thousand PDF documents but could not finish as Solr 
 was falling into an endless loop for some of them, for instance: 
 http://cdsweb.cern.ch/record/702585/files/sl-note-2000-019.pdf (the PDF seems 
 OK).
 Can Solr start using PDFbox 1.0?




[jira] Commented: (SOLR-1695) Misleading error message when adding docs with missing/multiple value(s) for uniqueKey field

2010-02-18 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12835334#action_12835334
 ] 

Hoss Man commented on SOLR-1695:


Doh!

Note to self: don't just run the tests, remember to look at the results as well.

The DocumentBuilderTest failures make sense: they use a schema with uniqueKey 
defined, but add docs w/o that field to test other behaviors of toDocument.  
They passed prior to this change because they only tested the toDocument method 
in isolation, and the test for a missing uniqueKey was missing from that 
method.  I think it's safe to consider these tests broken as written, since 
toDocument does do schema validation -- it just wasn't doing the uniqueKey 
validation before.  So i'll modify those tests to include a value for the 
uniqueKey field.

the ConvertedLegacyTest failure confuses me though ... it also adds docs w/o a 
uniqueKey field even though the schema requires one, but they do full adds so 
it's not obvious from the surface why it was ever passing before ... i want to 
think about that a little more before just "fixing" the test -- it may be 
masking another bug.

  Misleading error message when adding docs with missing/multiple value(s) 
 for uniqueKey field
 --

 Key: SOLR-1695
 URL: https://issues.apache.org/jira/browse/SOLR-1695
 Project: Solr
  Issue Type: Improvement
Reporter: Hoss Man
Assignee: Hoss Man
Priority: Minor
 Fix For: 1.5


 Sometimes users don't seem to notice/understand the {{<uniqueKey/>}} declaration 
 in the example schema, and the error message they get if their documents 
 don't include that field is confusing...
 {code}
 org.apache.solr.common.SolrException: Document [null] missing required field: 
 id
 {code}
 ...because they get an almost identical error even if they remove 
 {{required="true"}} from {{<field name="id" />}} in their schema.xml file.
 We should improve the error message so it's clear when a Document is missing 
 the uniqueKeyField (not just a required field) so they know the 
 terminology to look for in diagnosing the problem.
 http://old.nabble.com/solr-1.4-csv-import-Document-missing-required-field%3A-id-to26990048.html#a26990779




[jira] Commented: (SOLR-1695) Misleading error message when adding docs with missing/multiple value(s) for uniqueKey field

2010-02-18 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12835364#action_12835364
 ] 

Hoss Man commented on SOLR-1695:


Hmmm ok so the reason the legacy test passed prior to this change is that 
DirectUpdateHandler2 (and DirectUpdateHandler from what i can tell) don't 
bother checking for a uniqueKey (or for multiple uniqueKeys) if 
allowDups=true (which it is in the line of ConvertedLegacyTest that's 
failing).

So the question becomes: Is it a bug that DUH(2) allow docs w/o a uniqueKey 
field just because allowDups=true?

If it's not a bug, then this entire patch should probably be rolled back -- but 
personally it feels like it really is a bug: if a schema declares a uniqueKey 
field, then just because a particular add command says allowDups=true doesn't 
mean that docs w/o an id (or with multiple ids) should be allowed in to the 
index -- those docs will need meaningful ids if/when a later commit does want 
to override them (consider the case of doing an initial build w/ allowDups=true 
for speed, and then incremental updates w/ allowDups=false ... the index needs 
to be internally consistent).

Actually: I'm just going to roll this entire patch back either way -- we can 
improve the error messages generated by DirectUpdateHandler2 and eliminate the 
redundant uniqueKey check in DocumentBuilder.toDocument.  As a separate issue 
we can consider whether DUH2 is buggy.
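The check being argued for, independent of allowDups, is essentially this (a sketch only, not the actual DirectUpdateHandler2 code):

```java
import java.util.List;

// Sketch of the validation argued for above: if the schema declares a
// uniqueKey field, every added doc must carry exactly one value for it,
// regardless of allowDups. Not the actual DirectUpdateHandler2 code.
public class UniqueKeyCheck {
    public static void validate(String uniqueKeyField, List<String> values) {
        if (uniqueKeyField == null) {
            return; // schema declares no uniqueKey -- nothing to enforce
        }
        if (values == null || values.size() != 1) {
            throw new IllegalArgumentException(
                "Document must contain exactly one value for uniqueKey field "
                + uniqueKeyField + " but has "
                + (values == null ? 0 : values.size()));
        }
    }
}
```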

  Misleading error message when adding docs with missing/multiple value(s) 
 for uniqueKey field
 --

 Key: SOLR-1695
 URL: https://issues.apache.org/jira/browse/SOLR-1695
 Project: Solr
  Issue Type: Improvement
Reporter: Hoss Man
Assignee: Hoss Man
Priority: Minor
 Fix For: 1.5


 Sometimes users don't seem to notice/understand the {{<uniqueKey/>}} declaration 
 in the example schema, and the error message they get if their documents 
 don't include that field is confusing...
 {code}
 org.apache.solr.common.SolrException: Document [null] missing required field: 
 id
 {code}
 ...because they get an almost identical error even if they remove 
 {{required="true"}} from {{<field name="id" />}} in their schema.xml file.
 We should improve the error message so it's clear when a Document is missing 
 the uniqueKeyField (not just a required field) so they know the 
 terminology to look for in diagnosing the problem.
 http://old.nabble.com/solr-1.4-csv-import-Document-missing-required-field%3A-id-to26990048.html#a26990779




[jira] Commented: (SOLR-1695) Misleading error message when adding docs with missing/multiple value(s) for uniqueKey field

2010-02-18 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12835373#action_12835373
 ] 

Hoss Man commented on SOLR-1695:


bq. schema.xml does not require the id field, and the failing add explicitly 
says allowDups=true (legacy speak for overwrite=false)

...it doesn't require id but it does declare id as the uniqueKey field ... 
even if it's allowing dups, shouldn't it ensure that each doc has one and only 
one value for the uniqueKey field?

  Misleading error message when adding docs with missing/multiple value(s) 
 for uniqueKey field
 --

 Key: SOLR-1695
 URL: https://issues.apache.org/jira/browse/SOLR-1695
 Project: Solr
  Issue Type: Improvement
Reporter: Hoss Man
Assignee: Hoss Man
Priority: Minor
 Fix For: 1.5


 Sometimes users don't seem to notice/understand the {{<uniqueKey/>}} declaration 
 in the example schema, and the error message they get if their documents 
 don't include that field is confusing...
 {code}
 org.apache.solr.common.SolrException: Document [null] missing required field: 
 id
 {code}
 ...because they get an almost identical error even if they remove 
 {{required="true"}} from {{<field name="id" />}} in their schema.xml file.
 We should improve the error message so it's clear when a Document is missing 
 the uniqueKeyField (not just a required field) so they know the 
 terminology to look for in diagnosing the problem.
 http://old.nabble.com/solr-1.4-csv-import-Document-missing-required-field%3A-id-to26990048.html#a26990779




[jira] Resolved: (SOLR-1695) Misleading error message when adding docs with missing/multiple value(s) for uniqueKey field

2010-02-18 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-1695.


Resolution: Fixed

Committed revision 911595.

Rolled back the changes to DocumentBuilder and improved the existing error 
messages in UpdateHandler instead.

  Missleading error message when adding docs with missing/multiple value(s) 
 for uniqueKey field
 --

 Key: SOLR-1695
 URL: https://issues.apache.org/jira/browse/SOLR-1695
 Project: Solr
  Issue Type: Improvement
Reporter: Hoss Man
Assignee: Hoss Man
Priority: Minor
 Fix For: 1.5


 Sometimes users don't seem to notice/understand the <uniqueKey/> declaration 
 in the example schema, and the error message they get if their documents 
 don't include that field is confusing...
 {code}
 org.apache.solr.common.SolrException: Document [null] missing required field: 
 id
 {code}
 ...because they get an almost identical error even if they remove 
 {{required=true}} from {{<field name=id />}} in their schema.xml file.
 We should improve the error message so it's clear when a Document is missing 
 the uniqueKeyField (not just a required field) so they know the 
 terminology to look for in diagnosing the problem.
 http://old.nabble.com/solr-1.4-csv-import-Document-missing-required-field%3A-id-to26990048.html#a26990779

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (SOLR-1780) existence of exactly one value for uniqueKey field is not checked when overwrite=false or allowDups=true

2010-02-18 Thread Hoss Man (JIRA)
existence of exactly one value for uniqueKey field is not checked when 
overwrite=false or allowDups=true


 Key: SOLR-1780
 URL: https://issues.apache.org/jira/browse/SOLR-1780
 Project: Solr
  Issue Type: Bug
Reporter: Hoss Man


As noted in SOLR-1695, in DirectUpdateHandler(2), when a document is added the 
uniqueKey field is only asserted to contain exactly one value if 
overwrite=true.  If overwrite=false (or allowDups=true) then the uniqueKey 
field is not checked at all.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1777) fields with sortMissingLast don't sort correctly

2010-02-18 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12835424#action_12835424
 ] 

Hoss Man commented on SOLR-1777:


Yonik: just to verify: was this bug introduced in Solr 1.4? ... 
presumably because of the changes to per-segment collecting?

(that's the way the Affects Version/s is marked, but i want to sanity check 
in case it was actually a more fundamental problem affecting earlier versions 
of Solr as well).

 fields with sortMissingLast don't sort correctly
 

 Key: SOLR-1777
 URL: https://issues.apache.org/jira/browse/SOLR-1777
 Project: Solr
  Issue Type: Bug
  Components: search
Affects Versions: 1.4
Reporter: Yonik Seeley
Assignee: Yonik Seeley
Priority: Critical
 Fix For: 1.5

 Attachments: SOLR-1777.patch, SOLR-1777.patch


 field types with the sortMissingLast=true attribute can have results sorted 
 incorrectly.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1365) Add configurable Sweetspot Similarity factory

2010-02-17 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12834938#action_12834938
 ] 

Hoss Man commented on SOLR-1365:


The constraints on what can be SolrCoreAware exist for two main reasons:

 # to ensure some sanity in initialization ... one of the main reasons the 
SolrCoreAware interface was needed in the first place was because some plugins 
wanted to use the SolrCore to get access to other plugins during their 
initialization -- but those other components weren't necessarily initialized 
yet.  with the inform(SolrCore) method SolrCoreAware plugins know that all 
other components have been initialized, but they haven't necessarily been 
informed about the SolrCore, so they might not be ready to deal with other 
plugins yet ... it's generally just a big initialization-cluster-fuck, so the 
fewer classes involved the better
 # to prevent too much pollution of the SolrCore API.  having direct access to the 
SolrCore is a big deal -- once you have a reference to the core, you can get 
to pretty much anything, which opens us (ie: Solr maintainers) up to a lot of 
crazy code paths to worry about -- so the fewer plugin types that we need to 
consider when making changes to SolrCore the better.

In the case of SimilarityFactory, i'm not entirely sure how i feel about making 
it SolrCoreAware(able) ... we have tried really, REALLY hard to make sure 
nothing initialized as part of the IndexSchema can be SolrCore aware, because it 
opens up the possibility of plugin behavior being affected by SolrCore 
configuration which might be different between master and slave machines -- 
which could produce disastrous results.  a schema.xml needs to be internally 
consistent regardless of what solrconfig.xml might reference it.

In this case the real issue isn't that we have a use case where 
SimilarityFactory _needs_ access to SolrCore -- what it wants access to is the 
IndexSchema, so it might make sense to just provide access to that in some way 
w/o having to expose the entire SolrCore.

Practically speaking, after re-skimming the patch: I'm not even convinced that 
would really add anything.  refactoring/reusing some of the *code* that 
IndexSchema uses to manage dynamicFields might be handy for the 
SweetSpotSimilarityFactory, but i don't actually see how being able to inspect 
the IndexSchema to get the list of dynamicFields (or find out if a field is 
dynamic) would make it any better or easier to use.  We'd still want people to 
configure it with field names and field name globs directly because there won't 
necessarily be a one to one correspondence between what fields are dynamic in 
the schema and how you want the sweetspots defined ... you might have a generic 
en_* dynamicField in your schema for english text, and an fr_* dynamicField 
for french text, but that doesn't mean the sweetspot for all fr_* fields will 
be the same ... you are just as likely to want some very specific field names 
to have their own sweetspot, or to have the sweetspot be suffix based (ie: 
*_title could have one sweetspot even if the resulting field names are fr_title 
and en_title).

I think the patch could be improved, and i think there is definitely some code 
reuse possibility for parsing the field name globs, but i don't know that it 
really needs run time access to the IndexSchema (and it definitely doesn't need 
access to the SolrCore)

 Add configurable Sweetspot Similarity factory
 -

 Key: SOLR-1365
 URL: https://issues.apache.org/jira/browse/SOLR-1365
 Project: Solr
  Issue Type: New Feature
Affects Versions: 1.3
Reporter: Kevin Osborn
Priority: Minor
 Fix For: 1.5

 Attachments: SOLR-1365.patch


 This is some code that I wrote a while back.
 Normally, if you use SweetSpotSimilarity, you are going to make it do 
 something useful by extending SweetSpotSimilarity. So, instead, I made a 
 factory class and a configurable SweetSpotSimilarity. There are two classes. 
 SweetSpotSimilarityFactory reads the parameters from schema.xml. It then 
 creates an instance of VariableSweetSpotSimilarity, which is my custom 
 SweetSpotSimilarity class. In addition to the standard functions, it also 
 handles dynamic fields.
 So, in schema.xml, you could have something like this:
 <similarity class="org.apache.solr.schema.SweetSpotSimilarityFactory">
   <bool name="useHyperbolicTf">true</bool>
   <float name="hyperbolicTfFactorsMin">1.0</float>
   <float name="hyperbolicTfFactorsMax">1.5</float>
   <float name="hyperbolicTfFactorsBase">1.3</float>
   <float name="hyperbolicTfFactorsXOffset">2.0</float>
   <int name="lengthNormFactorsMin">1</int>
   <int name="lengthNormFactorsMax">1</int>
   <float name="lengthNormFactorsSteepness">0.5</float>
   <int name="lengthNormFactorsMin_description">2</int>
   <int 

[jira] Commented: (SOLR-741) Add support for rounding dates in DateField

2010-02-17 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12835008#action_12835008
 ] 

Hoss Man commented on SOLR-741:
---

bq. With the introduction of Trie fields is it not irrelevant now? can we close 
it 

TrieFields make it more efficient to do range searches on numeric fields 
indexed at full precision, but they don't actually do anything to round the 
fields for people who genuinely want their stored and indexed values to only have 
second/minute/hour/day precision regardless of what the initial raw data looks 
like.

So while TrieFields definitely make this less of a priority from a performance 
standpoint, they don't solve the full problem.

(Unless i'm missing something, actually rounding the values prior to indexing 
will still help improve performance in general because it will reduce the total 
number of Terms ... with TrieFields, isn't the original value always indexed 
regardless of the precisionStep?)
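As a concrete illustration of the rounding being discussed (plain Java, independent of Solr's DateField internals), truncating timestamps to minute precision collapses many distinct raw values into one indexed term:

```java
import java.util.concurrent.TimeUnit;

// Illustrative only: round an epoch-millis timestamp down to minute precision
// before indexing, so all values within the same minute become one term.
public class DateRounding {
    public static long roundToMinute(long epochMillis) {
        long minuteMillis = TimeUnit.MINUTES.toMillis(1);
        return (epochMillis / minuteMillis) * minuteMillis;
    }
}
```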

 Add support for rounding dates in DateField
 ---

 Key: SOLR-741
 URL: https://issues.apache.org/jira/browse/SOLR-741
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 1.4
Reporter: Shalin Shekhar Mangar
Priority: Minor
 Fix For: 1.5


 As discussed at http://www.nabble.com/Rounding-date-fields-td19203108.html
 Since rounding dates to a coarse value is an often recommended solution to 
 decrease the number of unique terms, we should add support for doing this in 
 DateField itself. A number of syntaxes were proposed, some of them were:
 # {{<fieldType name="date" class="solr.DateField" sortMissingLast="true" omitNorms="true" roundTo="-1MINUTE"/>}} (Shalin)
 # {{<fieldType name="date" class="solr.DateField" sortMissingLast="true" omitNorms="true" round="DOWN_MINUTE"/>}} (Otis)
 Hoss proposed more general enhancements related to arbitrary pre-processing of 
 values prior to indexing/storing using pre-processing analyzers.
 This issue aims to build a consensus on the solution to pursue and to 
 implement that solution inside Solr.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (SOLR-270) dismax handler should not log a warning when sort by score desc is specified

2010-02-17 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-270.
---

Resolution: Fixed

This was fixed at some point as a side effect of some other change.

 dismax handler should not log a warning when sort by score desc is specified
 

 Key: SOLR-270
 URL: https://issues.apache.org/jira/browse/SOLR-270
 Project: Solr
  Issue Type: Bug
Reporter: Hoss Man
Priority: Minor

 http://localhost:8983/solr/select/?indent=on&q=video&sort=score+desc&qt=dismax
 causes a warning to be logged...
 WARNING: Invalid sort score desc was specified, ignoring
 ...because of some eccentricities in how the getSort method works ... this 
 warning is distracting and misleading ... but only in the case where score 
 desc is used ... it should still be generated for a truly invalid sort.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (SOLR-1679) SolrCore.execute should wrap log message construction in if (log.isInfoEnabled())

2010-02-17 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-1679.


    Resolution: Fixed
Fix Version/s: 1.5
 Assignee: Hoss Man

Committed revision 911216.

Thanks for the suggestion Fuad.

 SolrCore.execute should wrap log message construction in if 
 (log.isInfoEnabled())
 ---

 Key: SOLR-1679
 URL: https://issues.apache.org/jira/browse/SOLR-1679
 Project: Solr
  Issue Type: Improvement
Reporter: Hoss Man
Assignee: Hoss Man
 Fix For: 1.5

 Attachments: SOLR-1679.patch


 As mentioned by Fuad on solr-user, there is some non-trivial log message 
 construction happening in SolrCore.execute that should be wrapped in if 
 (log.isInfoEnabled()) ...
 http://old.nabble.com/SOLR-Performance-Tuning%3A-Disable-INFO-Logging.-to26866730.html#a26866943
 ...the warn level message in that same method could probably also be wrapped 
 since it does some large string building as well.
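The guard being suggested looks roughly like this (sketched with java.util.logging so the example is self-contained; Solr itself logs through a facade, where isInfoEnabled() plays the role of isLoggable(Level.INFO) here):

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Sketch of the suggested guard: skip building an expensive log message
// entirely when the INFO level is disabled.
public class LogGuard {
    private static final Logger log = Logger.getLogger(LogGuard.class.getName());
    public static int buildCount = 0;  // counts how often we pay for message construction

    static String expensiveMessage() {
        buildCount++;
        // stand-in for the large string building done per request
        return "params=" + "...".repeat(100);
    }

    public static void handleRequest() {
        if (log.isLoggable(Level.INFO)) {  // guard: no construction when INFO is off
            log.info(expensiveMessage());
        }
    }
}
```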

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (SOLR-1695) Missleading error message when uniqueKey is field is missing

2010-02-17 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-1695.


    Resolution: Fixed
Fix Version/s: 1.5
 Assignee: Hoss Man

Committed revision 911228.
Committed revision 911232.

I added explicit checks for the number of uniqueKey values being != 1 early 
on in DocumentBuilder.toDocument.  Prior to this, multiple values weren't 
checked for until the doc made it all the way to the UpdateHandler.

  Missleading error message when uniqueKey is field is missing
 -

 Key: SOLR-1695
 URL: https://issues.apache.org/jira/browse/SOLR-1695
 Project: Solr
  Issue Type: Improvement
Reporter: Hoss Man
Assignee: Hoss Man
Priority: Minor
 Fix For: 1.5


 Sometimes users don't seem to notice/understand the <uniqueKey/> declaration 
 in the example schema, and the error message they get if their documents 
 don't include that field is confusing...
 {code}
 org.apache.solr.common.SolrException: Document [null] missing required field: 
 id
 {code}
 ...because they get an almost identical error even if they remove 
 {{required=true}} from {{<field name=id />}} in their schema.xml file.
 We should improve the error message so it's clear when a Document is missing 
 the uniqueKeyField (not just a required field) so they know the 
 terminology to look for in diagnosing the problem.
 http://old.nabble.com/solr-1.4-csv-import-Document-missing-required-field%3A-id-to26990048.html#a26990779

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-1695) Missleading error message when adding docs with missing/multiple value(s) for uniqueKey field

2010-02-17 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1695:
---

Summary:  Missleading error message when adding docs with missing/multiple 
value(s) for uniqueKey field  (was:  Missleading error message when uniqueKey 
is field is missing)

revising summary

  Missleading error message when adding docs with missing/multiple value(s) 
 for uniqueKey field
 --

 Key: SOLR-1695
 URL: https://issues.apache.org/jira/browse/SOLR-1695
 Project: Solr
  Issue Type: Improvement
Reporter: Hoss Man
Assignee: Hoss Man
Priority: Minor
 Fix For: 1.5


 Sometimes users don't seem to notice/understand the <uniqueKey/> declaration 
 in the example schema, and the error message they get if their documents 
 don't include that field is confusing...
 {code}
 org.apache.solr.common.SolrException: Document [null] missing required field: 
 id
 {code}
 ...because they get an almost identical error even if they remove 
 {{required=true}} from {{<field name=id />}} in their schema.xml file.
 We should improve the error message so it's clear when a Document is missing 
 the uniqueKeyField (not just a required field) so they know the 
 terminology to look for in diagnosing the problem.
 http://old.nabble.com/solr-1.4-csv-import-Document-missing-required-field%3A-id-to26990048.html#a26990779

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1687) add param for limiting start and rows params

2010-02-17 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12835075#action_12835075
 ] 

Hoss Man commented on SOLR-1687:


Hmmm... QParser.getSort is where the current sort/start/rows param parsing 
happens right now, but looking at it makes me realize there's some local params 
semantics to consider with something like this.

Currently, QParser.getSort won't consult the global params if any of 
sort/start/rows are specified as a local param (or if the caller explicitly 
says useGlobalParams=false, but there doesn't seem to be a code path where that 
happens)

but what should happen in these situations...

{code}
#1) q={!lucene rows.max=99 rows=}foo&rows.max=100
#2) q={!lucene rows.max=100 v=$qq}&qq=foo&rows=999&rows.max=999
{code}

situation #1 could come up if a greedy client attempted to ask for too many 
rows, and the admin has a configured invariant of rows.max=100 -- in which case 
we'd want the global rows.max param to supersede the local rows param.  But 
situation #2 is equally possible, where the q param is an invariant set by the 
admin, and the other params come from a greedy client.

The best solution i can think of off the top of my head would be to ignore 
local param values for start.max and rows.max, and look for them as global 
params even if false==useGlobalParams.  That takes care of situation #1, and 
makes situation #2 easy to deal with by also adding rows.max=100 as an 
invariant outside of the local params.

Anyone see any holes in that?
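A sketch of the rule proposed above (hypothetical names, not the actual QParser code): rows may be resolved from local params as usual, but rows.max is read only from the global params before clamping, so a greedy client cannot override it locally.

```java
import java.util.Map;

// Hypothetical sketch of the proposed resolution rule for rows/rows.max.
public class RowsClamp {
    public static int effectiveRows(Map<String, String> globalParams,
                                    Map<String, String> localParams) {
        // rows: local params win over global, defaulting to 10
        String rows = localParams.getOrDefault("rows",
                      globalParams.getOrDefault("rows", "10"));
        // rows.max: read from the global params only; any local rows.max is ignored
        String max = globalParams.get("rows.max");
        int r = Integer.parseInt(rows);
        return (max == null) ? r : Math.min(r, Integer.parseInt(max));
    }
}
```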

 add param for limiting start and rows params
 

 Key: SOLR-1687
 URL: https://issues.apache.org/jira/browse/SOLR-1687
 Project: Solr
  Issue Type: Improvement
Reporter: Hoss Man

 conventional wisdom is that it doesn't make sense to paginate with huge 
 pages, or to drill down deep into high numbered pages -- features like 
 faceting tend to be a better UI experience, and less intensive on solr.
 At the moment, Solr administrators can use invariant params to hardcode the 
 rows param to something reasonable, but unless they only want to allow 
 users to look at page one, they can't do much to lock down the start param 
 except to enforce these rules in the client code.
 we should add new params that set an upper bound on both of these, which can 
 then be specified as default/invariant params in solrconfig.xml
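For reference, wiring the proposal into the existing invariants mechanism might look like this in solrconfig.xml (the rows.max and start.max params shown are the ones proposed in this issue, not params Solr currently supports):

```
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="invariants">
    <!-- proposed params: hard upper bounds a client could not override -->
    <int name="rows.max">100</int>
    <int name="start.max">1000</int>
  </lst>
</requestHandler>
```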

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-1687) add param for limiting start and rows params

2010-02-17 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1687:
---

Attachment: SOLR-1687.patch

patch with the logic i attempted to describe.  it doesn't contain any unit 
tests yet, but it seems to be working.

the real question is: are there any holes i haven't plugged in the 
local/global param handling logic that a greedy client could exploit?

 add param for limiting start and rows params
 

 Key: SOLR-1687
 URL: https://issues.apache.org/jira/browse/SOLR-1687
 Project: Solr
  Issue Type: Improvement
Reporter: Hoss Man
 Attachments: SOLR-1687.patch


 conventional wisdom is that it doesn't make sense to paginate with huge 
 pages, or to drill down deep into high numbered pages -- features like 
 faceting tend to be a better UI experience, and less intensive on solr.
 At the moment, Solr administrators can use invariant params to hardcode the 
 rows param to something reasonable, but unless they only want to allow 
 users to look at page one, they can't do much to lock down the start param 
 except to enforce these rules in the client code.
 we should add new params that set an upper bound on both of these, which can 
 then be specified as default/invariant params in solrconfig.xml

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-397) options for dealing with range endpoints in date facets

2010-02-16 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-397:
--

Description: Date faceting should support configuration for controlling how 
edge boundaries are dealt with.  (was: as discussed in email...
http://www.nabble.com/Re%3A-Date-facetting-and-ranges-overlapping-p12928374.html

: I'm now using date facetting to browse events. It works really fine
: and is really useful. The only problem so far is that if I have an
: event which is exactly on the boundary of two ranges, it is referenced
: 2 times.

yeah, this is one of the big caveats with date faceting right now ... i
struggled with this a bit when designing it, and ultimately decided to
punt on the issue.  the biggest hangup was that even if the facet counting
code was smart about making sure the ranges don't overlap, the range query
syntax in the QueryParser doesn't support ranges that exclude one input
(so there wouldn't be a lot you can do with the ranges once you know the
counts in them)

one idea i had in SOLR-258 was that we could add an interval option that
would define how much to add to the end of one range to get the start
of another range (think of the current implementation having interval
hardcoded to 0) which would solve the problem and work with range
queries that were inclusive of both endpoints, but would require people to
use -1MILLI a lot.

a better option (assuming a query parser change) would be a new option
that says whether each computed range should be inclusive of the low point,
the high point, both end points, neither end point, or be smart (where
smart is the same as low except for the last range, which includes
both)

(I think there's already a lucene issue to add the query parser support, i
just haven't had time to look at it)

The simple workaround: if you know all of your data is indexed with
perfect 0.000second precision, then put -1MILLI at the end of your start
and end date faceting params.
)

(initial issue description moved to comment)

as discussed in email...
http://www.nabble.com/Re%3A-Date-facetting-and-ranges-overlapping-p12928374.html

: I'm now using date facetting to browse events. It works really fine
: and is really useful. The only problem so far is that if I have an
: event which is exactly on the boundary of two ranges, it is referenced
: 2 times.

yeah, this is one of the big caveats with date faceting right now ... i
struggled with this a bit when designing it, and ultimately decided to
punt on the issue.  the biggest hangup was that even if the facet counting
code was smart about making sure the ranges don't overlap, the range query
syntax in the QueryParser doesn't support ranges that exclude one input
(so there wouldn't be a lot you can do with the ranges once you know the
counts in them)

one idea i had in SOLR-258 was that we could add an interval option that
would define how much to add to the end of one range to get the start
of another range (think of the current implementation having interval
hardcoded to 0) which would solve the problem and work with range
queries that were inclusive of both endpoints, but would require people to
use -1MILLI a lot.

a better option (assuming a query parser change) would be a new option
that says whether each computed range should be inclusive of the low point,
the high point, both end points, neither end point, or be smart (where
smart is the same as low except for the last range, which includes
both)

(I think there's already a lucene issue to add the query parser support, i
just haven't had time to look at it)

The simple workaround: if you know all of your data is indexed with
perfect 0.000second precision, then put -1MILLI at the end of your start
and end date faceting params.


 options for dealing with range endpoints in date facets
 ---

 Key: SOLR-397
 URL: https://issues.apache.org/jira/browse/SOLR-397
 Project: Solr
  Issue Type: Improvement
Reporter: Hoss Man

 Date faceting should support configuration for controlling how edge 
 boundaries are dealt with.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-397) options for dealing with range endpoints in date facets

2010-02-16 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12834597#action_12834597
 ] 

Hoss Man commented on SOLR-397:
---

Additional idea that i like much better than the interval idea i had a while 
back, transcribed from email so it's not lost to the ages...

I think the semantics that might make the most sense is to add a 
multivalued facet.date.include param that supports the following 
options:  all, lower, upper, edge, outer
 - all is shorthand for lower,upper,edge,outer and is the default (for back 
compat)
 - if lower is specified, then all ranges include their lower bound
 - if upper is specified, then all ranges include their upper bound
 - if edge is specified, then the first and last ranges include their edge 
bounds (ie: lower for the first one, upper for the last one) even if the 
corresponding upper/lower option is not specified.
 - the between count is inclusive of each of the start and end bounds iff the 
first and last range are inclusive of them
 - the before and after ranges are inclusive of their respective bounds if:
 -* outer is specified ... OR ...
 -* the first and last ranges don't already include them


so assuming you started with something like (specific dates and durations 
shortened for readability)...

{{facet.date.start=1&facet.date.end=3&facet.date.gap=+1&facet.date.other=all}}

...your ranges would be...

{{[1 TO 2], [2 TO 3] and [* TO 1], [1 TO 3], [3 TO *]}}


The following params would change the ranges in the following ways...

{code}
w/ facet.date.include=lower ...
  [1 TO 2}, [2 TO 3} and [* TO 1}, [1 TO 3}, [3 TO *]

w/ facet.date.include=upper ...
  {1 TO 2], {2 TO 3] and [* TO 1], {1 TO 3], {3 TO *]

w/ facet.date.include=lower&facet.date.include=edge ...
  [1 TO 2}, [2 TO 3] and [* TO 1}, [1 TO 3], {3 TO *]

w/ facet.date.include=upper&facet.date.include=edge ...
  [1 TO 2], {2 TO 3] and [* TO 1}, [1 TO 3], {3 TO *]

w/ facet.date.include=upper&facet.date.include=outer ...
  {1 TO 2], {2 TO 3] and [* TO 1], {1 TO 3], [3 TO *]

...etc.
{code}

initial proposal: 
http://old.nabble.com/RE%3A-Date-Facet-duplicate-counts-p27331578.html
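The worked example above can be captured mechanically. This sketch (illustrative only, not Solr code) derives the bracket styles, '[' and ']' for inclusive bounds, '{' and '}' for exclusive, for the start=1/end=3/gap=1 example from the proposed include flags, and reproduces the rows shown:

```java
import java.util.Set;

// Illustrative sketch of the proposed facet.date.include semantics for the
// start=1, end=3, gap=1 example: compute inclusivity brackets per range.
public class FacetInclude {
    public static String ranges(Set<String> include) {
        boolean all   = include.isEmpty() || include.contains("all");
        boolean lower = all || include.contains("lower");
        boolean upper = all || include.contains("upper");
        boolean edge  = all || include.contains("edge");
        boolean outer = all || include.contains("outer");

        // gap ranges: first is [1 TO 2], last is [2 TO 3]
        char fLo = (lower || edge) ? '[' : '{';  // edge forces the first lower bound
        char fHi = upper ? ']' : '}';
        char lLo = lower ? '[' : '{';
        char lHi = (upper || edge) ? ']' : '}';  // edge forces the last upper bound

        // before/after include their bound if outer is set OR the adjacent
        // gap range does not already include it; between tracks first/last
        char bHi  = (outer || fLo == '{') ? ']' : '}';
        char aLo  = (outer || lHi == '}') ? '[' : '{';
        char btLo = fLo;
        char btHi = lHi;

        return fLo + "1 TO 2" + fHi + ", " + lLo + "2 TO 3" + lHi
             + " and [* TO 1" + bHi + ", " + btLo + "1 TO 3" + btHi + ", "
             + aLo + "3 TO *]";
    }
}
```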

 options for dealing with range endpoints in date facets
 ---

 Key: SOLR-397
 URL: https://issues.apache.org/jira/browse/SOLR-397
 Project: Solr
  Issue Type: Improvement
Reporter: Hoss Man

 Date faceting should support configuration for controlling how edge 
 boundaries are dealt with.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (SOLR-1776) dismax should treate schema's default field as a default qf

2010-02-16 Thread Hoss Man (JIRA)
dismax should treate schema's default field as a default qf
---

 Key: SOLR-1776
 URL: https://issues.apache.org/jira/browse/SOLR-1776
 Project: Solr
  Issue Type: Improvement
Reporter: Hoss Man


the DismaxQParser is completely useless w/o specifying the qf param, but for 
the life of me i can't think of any good reason why it shouldn't use 
IndexSchema.getDefaultSearchFieldName() as the default value for the qf param.
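The suggested fallback is essentially a one-liner; a hypothetical sketch (not the real DismaxQParser code):

```java
// Hypothetical sketch: fall back to the schema's default search field when
// no qf param was supplied.
public class QfFallback {
    public static String resolveQf(String qfParam, String schemaDefaultField) {
        return (qfParam != null && !qfParam.isEmpty()) ? qfParam : schemaDefaultField;
    }
}
```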

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (SOLR-1579) CLONE -stats.jsp XML escaping

2010-02-12 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-1579.


Resolution: Fixed
  Assignee: Hoss Man  (was: Erik Hatcher)


I fully expect stats.jsp will be deprecated in the next release of Solr in 
favor of the handler in SOLR-1750 -- BUT -- I still can't believe such an 
annoying and yet trivial to fix bug was around for so long ... especially since 
the incorrect fix for the XML attribute escaping is only half the problem: 
escapeCharData is still needed for the XML element content escaping.

David: thanks for your prodding on this ... i committed your patch plus some 
additional fixes (r909705)


 CLONE -stats.jsp XML escaping
 -

 Key: SOLR-1579
 URL: https://issues.apache.org/jira/browse/SOLR-1579
 Project: Solr
  Issue Type: Bug
  Components: web gui
Affects Versions: 1.4
Reporter: David Bowen
Assignee: Hoss Man
 Fix For: 1.5

 Attachments: SOLR-1579.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 The fix to SOLR-1008 was wrong.  It used chardata escaping for a value that 
 is an attribute value.
 I.e. instead of XML.escapeCharData it should call XML.escapeAttributeValue.
 Otherwise, any query used as a key in the filter cache whose printed 
 representation contains a double-quote character causes invalid XML to be 
 generated.
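The distinction at issue, in a deliberately simplified and self-contained form (real code should use a proper XML library): a double quote is legal in XML character data but must be escaped inside an attribute value.

```java
// Simplified illustration of char-data vs. attribute-value escaping.
public class XmlEscape {
    public static String escapeCharData(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;");
    }

    public static String escapeAttributeValue(String s) {
        // attribute values need everything char data needs, plus the quote delimiter
        return escapeCharData(s).replace("\"", "&quot;");
    }
}
```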

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-1008) stats.jsp XML escaping

2010-02-12 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1008:
---

Fix Version/s: (was: 1.4)
   1.5

Note: the fix included in Solr 1.4 was not actually correct, revising version 
info accordingly.  see SOLR-1579 for details

 stats.jsp XML escaping
 --

 Key: SOLR-1008
 URL: https://issues.apache.org/jira/browse/SOLR-1008
 Project: Solr
  Issue Type: Bug
  Components: web gui
Reporter: Erik Hatcher
Assignee: Erik Hatcher
 Fix For: 1.5

 Attachments: SOLR-1008.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 stats.jsp gave this error:
 Line Number 1327, Column 48: <stat name=item_attrFacet_Size__Shape
 stat names are not XML escaped.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (SOLR-1771) StringIndexDocValues should provide a better error message when getStringIndex fails

2010-02-12 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-1771.


Resolution: Fixed
  Assignee: Hoss Man

I'm not convinced that the wording of the new error message is all that great, 
but it's vastly better than the previous behavior...

Committed revision 909746.


Note that this affected numerous different classes: OrdFieldSource, all the 
Sortable*Field classes, DateField, and StrField (anyone instantiating an 
instance of StringIndexDocValues).

 StringIndexDocValues should provide a better error message when 
 getStringIndex fails
 

 Key: SOLR-1771
 URL: https://issues.apache.org/jira/browse/SOLR-1771
 Project: Solr
  Issue Type: Bug
  Components: search
Reporter: Hoss Man
Assignee: Hoss Man
 Fix For: 1.5


 if someone attempts to use an OrdFieldSource on a field that is tokenized, 
 FieldCache.getStringIndex throws a confusing RuntimeException that 
 StringIndexDocValues propagates.  we should wrap that exception in something 
 more helpful...
 http://old.nabble.com/sorting-td27544348.html

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (SOLR-1772) UpdateProcessor to prune empty values

2010-02-12 Thread Hoss Man (JIRA)
UpdateProcessor to prune empty values
---

 Key: SOLR-1772
 URL: https://issues.apache.org/jira/browse/SOLR-1772
 Project: Solr
  Issue Type: Wish
Reporter: Hoss Man


Users seem to frequently get confused when some FieldTypes (typically the 
numeric ones) complain about invalid field values when they inadvertently index 
an empty string.

It would be cool to provide an UpdateProcessor that makes it easy to strip out 
any fields being added as empty values ... it could be configured using field 
(and/or field type) names or globs to select/ignore certain fields -- i haven't 
thought it through all that hard
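The pruning logic such a processor might apply can be sketched on a plain map standing in for a SolrInputDocument -- the class and method names below are hypothetical, not Solr's actual UpdateRequestProcessor API:

```java
import java.util.*;

// Hypothetical sketch: drop fields whose values are all empty/blank strings,
// standing in for what an UpdateProcessor could do to a document before the
// FieldTypes ever see the values.
public class EmptyValuePruner {
    static void prune(Map<String, List<String>> doc) {
        Iterator<Map.Entry<String, List<String>>> it = doc.entrySet().iterator();
        while (it.hasNext()) {
            List<String> vals = it.next().getValue();
            // discard null or whitespace-only values
            vals.removeIf(v -> v == null || v.trim().isEmpty());
            // drop the field entirely if nothing survives
            if (vals.isEmpty()) it.remove();
        }
    }
}
```

A real implementation would also need the field-name/glob selection described above; that is omitted here.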

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-534) Return all query results with parameter rows=-1

2010-02-10 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832333#action_12832333
 ] 

Hoss Man commented on SOLR-534:
---

bq. But if you use the REALLY_BIG_NUMBER approach, the same bad programmer who 
never thought he would get back more than a 1000 records will never check 
whether the result set contains more than 1000 records either.

If we're going to assume the programmer doesn't check the actual number found, 
then why assume that the programmer pays attention to anything in the response 
at all? 

If you think it's likely that programmers will write code that only looks at 
the docList to iterate over all the docs in a response, and doesn't notice that 
the numFound at the top of the docList is higher than the number asked for, 
then why do you assume that same programmer would be smart enough to check if 
an error message is returned when they ask for all rows and Solr can't 
provide them?

Bottom line: we can't protect programmers from all possible forms of 
stupidity, but we can make them be explicit about exactly what they want -- if 
they want 100, they ask for 100;  if they want 1 they ask for 1, if 
they want all they have to specify how big they think all is.

bq. Solr sure as heck better be checking this already--you never know when 
you'll run into bizarre low memory conditions; allocations should ALWAYS be 
checked for.

This isn't as easy as it may sound in Java ... the APIs available to test for 
the amount of memory available are limited, and even if the JVM has the 
resources to allocate a 10,000,000 item PriorityQueue when computing the 
results, that doesn't mean doing so won't eat up all the available RAM, causing 
some later (extremely tiny) allocation to trigger an OOM --- but if you've got 
a suggestion to help prevent OOM in situations like this, by all means patches 
welcome. 

 Return all query results with parameter rows=-1
 ---

 Key: SOLR-534
 URL: https://issues.apache.org/jira/browse/SOLR-534
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.3
 Environment: Tomcat 5.5
Reporter: Lars Kotthoff
Priority: Minor
 Attachments: solr-all-results.patch


 The searcher should return all results matching a query when the parameter 
 rows=-1 is given.
 I know that it is a bad idea to do this in general, but as it explicitly 
 requires a special parameter, people using this feature will be aware of what 
 they are doing. The main use case for this feature is probably debugging, but 
 in some cases one might actually need to retrieve all results because they 
 e.g. are to be merged with results from different sources.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-1765) HTTP Caching related headers are incorrect for distributed searches

2010-02-10 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1765:
---

Component/s: (was: multicore)
 (was: search)
Description: 
When searching across multiple shards with HTTP caching enabled, the Caching 
related headers (ETag, Cache-Control, Last-Modified)  in the response are based 
on the index of the coordinating solr core, and are not influenced by the 
properties of the shards. For example, take the query

http://localhost:8983/solr/core1/select/?q=google&shards=localhost:8983/solr/core2,localhost:8983/solr/core3

ETag should be calculated off of core2 and core3, instead it's being calculated 
from the index of core1.

This results in index modifications to core2 or core3 being invisible to 
clients which query this URL using If-None-Match or If-Modified-Since type 
requests 

  was:
When searching across multiple shards with HTTP caching enabled, the ETag value 
in the response is only using the searcher in the original request, not the 
shards. For example, take the query

http://localhost:8983/solr/core1/select/?q=google&shards=localhost:8983/solr/core2,localhost:8983/solr/core3

ETag should be calculated off of core2 and core3, instead it's being calculated 
from core1.

Summary: HTTP Caching related headers are incorrect for distributed 
searches  (was: ETag calculation is incorrect for distributed searches)

 HTTP Caching related headers are incorrect for distributed searches
 ---

 Key: SOLR-1765
 URL: https://issues.apache.org/jira/browse/SOLR-1765
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.4
Reporter: Charlie Jackson
Priority: Minor

 When searching across multiple shards with HTTP caching enabled, the Caching 
 related headers (ETag, Cache-Control, Last-Modified)  in the response are 
 based on the index of the coordinating solr core, and are not influenced by 
 the properties of the shards. For example, take the query
 http://localhost:8983/solr/core1/select/?q=google&shards=localhost:8983/solr/core2,localhost:8983/solr/core3
 ETag should be calculated off of core2 and core3, instead it's being 
 calculated from the index of core1.
 This results in index modifications to core2 or core3 being invisible to 
 clients which query this URL using If-None-Match or If-Modified-Since 
 type requests 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (SOLR-1742) uniqueKey must be string type otherwise missing core name in path error is rendered

2010-02-04 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-1742.


Resolution: Duplicate

the string dependency for QueryElevationComponent does seem to be the root of 
the problem -- i think we should try to get to the bottom of why such a 
confusing error message is reported, but we've already got SOLR-1743 to track 
that.

 uniqueKey must be string type otherwise missing core name in path error is 
 rendered
 -

 Key: SOLR-1742
 URL: https://issues.apache.org/jira/browse/SOLR-1742
 Project: Solr
  Issue Type: Bug
  Components: Build
 Environment: all
Reporter: Marcin
 Fix For: 1.5


 How to replicate:
 - create an index with a schema where your uniqueKey is integer
 - set your unique key type to integer
 - deploy your index
 under http://host:8080/solr/admin/ you will get a "missing core name in path" error
 Workaround:
 - change the type of your uniqueKey to string
 - undeploy and deploy the index
 It's quite confusing, as 1.5 is not properly reporting errors and you need to 
 be lucky to find the reason on your own.
 cheers,
 /Marcin

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1743) error reporting is rendering 404 missing core name in path for all type of errors

2010-02-04 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12829747#action_12829747
 ] 

Hoss Man commented on SOLR-1743:


We definitely shouldn't be generating a "missing core name in path" error for 
situations like misconfiguration in a single core setup.

In the trunk, things like attempting to load a RequestHandler class that can't 
be found correctly result in a "Severe errors in solr configuration" type 
message in the browser, which then shows the stack trace of the problem.

However: something as simple as a typo like this...

{code}
Index: example/solr/conf/schema.xml
===================================================================
--- example/solr/conf/schema.xml	(revision 906596)
+++ example/solr/conf/schema.xml	(working copy)
@@ -456,7 +456,7 @@
        when adding a document.
     -->
 
-   <field name="id" type="string" indexed="true" stored="true" required="true" />
+   <field name="id" type="asdfasdf" indexed="true" stored="true" required="true" />
    <field name="sku" type="textTight" indexed="true" stored="true" omitNorms="true"/>
    <field name="name" type="textgen" indexed="true" stored="true"/>
    <field name="alphaNameSort" type="alphaOnlySort" indexed="true" stored="false"/>
{code}

...results in http://localhost:8983/solr/admin/ generating the "missing core 
name in path" error described, with no other context.

In Solr 1.4, this same type of error would have generated a "Severe errors in 
solr configuration" type message (w/ stack trace), so this definitely seems 
like a new bug in IndexSchema config error handling introduced on the trunk 
since Solr 1.4.

 error reporting is rendering 404 missing core name in path for all type of 
 errors
 ---

 Key: SOLR-1743
 URL: https://issues.apache.org/jira/browse/SOLR-1743
 Project: Solr
  Issue Type: Bug
  Components: Build
 Environment: all
Reporter: Marcin
 Fix For: 1.5


 Regardless of whether the error is a schema syntax problem or any other type 
 of error, you will always get:
 a "404 missing core name in path" message.
 cheers,
 /Marcin

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1754) Legacy numeric types do not check input for bad syntax

2010-02-04 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12829769#action_12829769
 ] 

Hoss Man commented on SOLR-1754:


The reason we never explicitly tested the input value was speed -- if the 
user says it's an int we trust them. The only places any FieldTypes explicitly 
validate the input strings (ie: SortableIntField, DateField, etc..) are where 
they get it for free as a side effect of conversion (in DateField's case: even 
though we index the raw string, we have to parse it anyway looking for DateMath)

Is there really any memory efficiency from IntField that can't be achieved with 
an appropriate precisionStep on TrieIntField?

 Legacy numeric types do not check input for bad syntax
 --

 Key: SOLR-1754
 URL: https://issues.apache.org/jira/browse/SOLR-1754
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.4
Reporter: Lance Norskog
 Fix For: 1.5


 The legacy numeric types do not check their input values for valid input. A 
 text string is accepted as input for any of these types: IntField, LongField, 
 FloatField, DoubleField. DateField checks its input.
 In general this is a no-fix, except: that IntField is a necessary memory type 
 because it cuts memory use in sorting.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1750) SystemStatsRequestHandler - replacement for stats.jsp

2010-02-04 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12829773#action_12829773
 ] 

Hoss Man commented on SOLR-1750:


bq. Any thoughts on the naming of this beast?

SystemInfoHandler sounds good.

This would probably also be a good time to retire registry.jsp ... all we 
need to do is add a few more pieces of system info to this handler (and add 
some param options to disable the stats part of the output)

bq. Also, food for thought, when (hopefully not if) the VelocityResponseWriter 
is moved into core, we can deprecate stats.jsp and skin the output of this 
request handler for a similar pleasant view like stats.jsp+client-side xsl does 
now.

Even if/when VelocityResponseWriter is in the core, i'd still rather just rely 
on client side XSLT for this, to reduce the number of things that could 
potentially get misconfigured and then confuse people about why the page doesn't 
look right ... the XmlResponseWriter has always supported a stylesheet param 
that (while not generally useful to most people) lets you easily reference any 
style sheet that can be served out of the admin directory ... all we really 
need is an updated .xsl file to translate the standard XML format into the old 
style stats view.

 SystemStatsRequestHandler - replacement for stats.jsp
 -

 Key: SOLR-1750
 URL: https://issues.apache.org/jira/browse/SOLR-1750
 Project: Solr
  Issue Type: Improvement
  Components: web gui
Reporter: Erik Hatcher
Assignee: Erik Hatcher
Priority: Trivial
 Fix For: 1.5

 Attachments: SystemStatsRequestHandler.java


 stats.jsp is cool and all, but suffers from escaping issues, and also is not 
 accessible from SolrJ or other standard Solr APIs.
 Here's a request handler that emits everything stats.jsp does.
 For now, it needs to be registered in solrconfig.xml like this:
 {code}
 <requestHandler name="/admin/stats" 
                 class="solr.SystemStatsRequestHandler" />
 {code}
 But will register this in AdminHandlers automatically before committing.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-1750) SystemStatsRequestHandler - replacement for stats.jsp

2010-02-04 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1750:
---

Attachment: SystemStatsRequestHandler.java

Some updates to Erik's previous version...

# adds everything from registry.jsp
#* lucene/solr version info
#* source/docs info for each object
# forcibly disable HTTP Caching
# adds params to control which objects are listed
#* (multivalued) cat param restricts category names (default is all)
#* (multivalued) key param restricts object keys (default is all) 
# adds (boolean) stats param to control if stats are output for each object
#* per-field style override can be used to override per object key
# refactored the old nested looping that stats.jsp did over every object and 
every category into a single pass
# switch all HashMaps to NamedLists or SimpleOrderedMaps to preserve 
predictable ordering

Examples...
* {{?cat=CACHE}}
** return info about caches, but nothing else (stats disabled by default)
* {{?stats=true&cat=CACHE}}
** return info and stats about caches, but nothing else
* {{?stats=true&f.fieldCache.stats=false}}
** Info about everything, stats for everything except fieldCache
* {{?key=fieldCache&stats=true}}
** Return info and stats for fieldCache, but nothing else

I left the class name alone, but i vote for SystemInfoRequestHandler with a 
default registration of /admin/info



 SystemStatsRequestHandler - replacement for stats.jsp
 -

 Key: SOLR-1750
 URL: https://issues.apache.org/jira/browse/SOLR-1750
 Project: Solr
  Issue Type: Improvement
  Components: web gui
Reporter: Erik Hatcher
Assignee: Erik Hatcher
Priority: Trivial
 Fix For: 1.5

 Attachments: SystemStatsRequestHandler.java, 
 SystemStatsRequestHandler.java


 stats.jsp is cool and all, but suffers from escaping issues, and also is not 
 accessible from SolrJ or other standard Solr APIs.
 Here's a request handler that emits everything stats.jsp does.
 For now, it needs to be registered in solrconfig.xml like this:
 {code}
 <requestHandler name="/admin/stats" 
                 class="solr.SystemStatsRequestHandler" />
 {code}
 But will register this in AdminHandlers automatically before committing.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1754) Legacy numeric types do not check input for bad syntax

2010-02-04 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12829902#action_12829902
 ] 

Hoss Man commented on SOLR-1754:


The second array you are talking about only exists if you use the StringIndex 
based FieldCache.

The TrieField subclasses all use the raw primitive FieldCache types, they just 
use a special parser to decode the Trie value into the raw primitive value ... 
take a look at o.a.s.schema.TrieField.getSortField.

If you look at stats.jsp you can see which FieldCaches are loaded for each 
field, and verify that all the TrieIntFields you sort on are using a 
primitive int[], and not a StringIndex.
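For reference, a hypothetical schema.xml fragment along these lines -- with precisionStep="0", a TrieIntField indexes a single token per value, so it costs no more index space than IntField while still sorting via the primitive int[] FieldCache described above:

{code}
<!-- hypothetical example, not from the stock schema -->
<fieldType name="tint" class="solr.TrieIntField" precisionStep="0"
           omitNorms="true" positionIncrementGap="0"/>
{code}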

 Legacy numeric types do not check input for bad syntax
 --

 Key: SOLR-1754
 URL: https://issues.apache.org/jira/browse/SOLR-1754
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.4
Reporter: Lance Norskog
 Fix For: 1.5


 The legacy numeric types do not check their input values for valid input. A 
 text string is accepted as input for any of these types: IntField, LongField, 
 FloatField, DoubleField. DateField checks its input.
 In general this is a no-fix, except: that IntField is a necessary memory type 
 because it cuts memory use in sorting.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1677) Add support for o.a.lucene.util.Version for BaseTokenizerFactory and BaseTokenFilterFactory

2010-02-02 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12828835#action_12828835
 ] 

Hoss Man commented on SOLR-1677:


bq. I guess I could care less what the default is, if you care about such 
things you shouldn't be using the defaults and instead specifying this yourself 
in the schema, and Version has no effect.

...which is all well and good, but it just re-iterates the need for really good 
documentation about what is impacted by changing a global Version setting -- 
otherwise users might be depending on a default behavior that is going to 
change when Version is bumped, and they may not even realize it.

Bear in mind: these are just the nuances that people need to worry about when 
considering a switch from 2.4 to 2.9 to 3.0 ... there will likely be a lot more 
of these over time.

And just to be as crystal clear as i possibly can:
* my concern is purely about how to document this stuff.
* i do in fact agree that a global luceneVersionMatch option is a good idea

 Add support for o.a.lucene.util.Version for BaseTokenizerFactory and 
 BaseTokenFilterFactory
 ---

 Key: SOLR-1677
 URL: https://issues.apache.org/jira/browse/SOLR-1677
 Project: Solr
  Issue Type: Sub-task
  Components: Schema and Analysis
Reporter: Uwe Schindler
 Attachments: SOLR-1677.patch, SOLR-1677.patch, SOLR-1677.patch, 
 SOLR-1677.patch


 Since Lucene 2.9, a lot of analyzers use a Version constant to keep backwards 
 compatibility with old indexes created using older versions of Lucene. The 
 most important example is StandardTokenizer, which changed its behaviour with 
 posIncr and incorrect host token types in 2.4 and also in 2.9.
 In Lucene 3.0 this matchVersion ctor parameter is mandatory and in 3.1, with 
 much more Unicode support, almost every Tokenizer/TokenFilter needs this 
 Version parameter. In 2.9, the deprecated old ctors without Version take 
 LUCENE_24 as default to mimic the old behaviour, e.g. in StandardTokenizer.
 This patch adds basic support for the Lucene Version property to the base 
 factories. Subclasses then can use the luceneMatchVersion decoded enum (in 
 3.0) / Parameter (in 2.9) for constructing Tokenstreams. The code currently 
 contains a helper map to decode the version strings, but in 3.0 is can be 
 replaced by Version.valueOf(String), as the Version is a subclass of Java5 
 enums. The default value is Version.LUCENE_24 (as this is the default for the 
 no-version ctors in Lucene).
 This patch also removes unneeded conversions to CharArraySet from 
 StopFilterFactory (now done by Lucene since 2.9). The generics are also fixed 
 to match Lucene 3.0.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1718) Carriage return should submit query admin form

2010-02-02 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12828842#action_12828842
 ] 

Hoss Man commented on SOLR-1718:


bq. Consider the JIRA interface we are using to comment on this issue. 

Sure, but that's an {{<input type="text" />}}, not a {{<textarea />}} ... the 
expected semantics are completely different.  With an {{<input type="text" />}} 
box the browser already takes care of submitting the form if you hit Enter (and 
FWIW: most browsers i know of also submit forms if you use Shift-Enter in a 
{{<textarea />}})

It sounds like what you are really suggesting is that we change the 
/admin/index.jsp form to use an {{<input type="text" />}} instead of a 
{{<textarea />}} for the q param, and not that we add special (javascript) 
logic to the form to submit if someone presses Enter inside the existing 
{{<textarea />}}  ... which i have a lot less objection to than going out of 
our way to violate standard form convention.

 Carriage return should submit query admin form
 --

 Key: SOLR-1718
 URL: https://issues.apache.org/jira/browse/SOLR-1718
 Project: Solr
  Issue Type: Improvement
  Components: web gui
Affects Versions: 1.4
Reporter: David Smiley
Priority: Minor

 Hitting the carriage return on the keyboard should submit the search query on 
 the admin front screen.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1729) Date Facet now override time parameter

2010-02-02 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12828846#action_12828846
 ] 

Hoss Man commented on SOLR-1729:


Peter: I think you may have misconstrued my comments -- they were not 
criticisms of your patch, they were a clarification of why the functionality 
you are proposing is important.

bq. Can you point me toward the class(es) where filter queries' date math lives

it's all handled internally by DateField, at which point it has no notion of 
the request -- I believe this is why yonik suggested using a ThreadLocal 
variable to track a consistent NOW that any method anywhere in Solr could use 
(if set) for the current request ... then we just need something like SolrCore 
to set it on each request (or accept it as a param if specified)

bq. As filter queries are cached separately, can you think of any potential 
caching issues relating to filter queries?

The cache keys for things like that are the Query objects themselves, and at 
that point the DateMath strings (including NOW) have already been resolved 
into real time values, so that shouldn't be an issue.
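The ThreadLocal approach mentioned above could look roughly like this -- class and method names are hypothetical, not existing Solr code: date-math code consults a per-thread override when one is set, and falls back to the wall clock otherwise.

```java
// Sketch of the ThreadLocal "NOW" idea: a per-request override that date-math
// code can consult without needing access to the SolrQueryRequest.
public class RequestNow {
    private static final ThreadLocal<Long> NOW = new ThreadLocal<>();

    static void set(long millis) { NOW.set(millis); }

    // must be called when the request finishes, or the value leaks into
    // whatever request the pooled thread serves next
    static void clear() { NOW.remove(); }

    static long now() {
        Long n = NOW.get();
        return (n != null) ? n : System.currentTimeMillis();
    }
}
```

Something like SolrCore would call set() at the start of each request (taking the value from a facet.date.now-style param if present) and clear() in a finally block.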


 Date Facet now override time parameter
 --

 Key: SOLR-1729
 URL: https://issues.apache.org/jira/browse/SOLR-1729
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 1.4
 Environment: Solr 1.4
Reporter: Peter Sturge
Priority: Minor
 Attachments: FacetParams.java, SimpleFacets.java


 This PATCH introduces a new query parameter that tells a (typically, but not 
 necessarily) remote server what time to use as 'NOW' when calculating date 
 facets for a query (and, for the moment, date facets *only*) - overriding the 
 default behaviour of using the local server's current time.
 This gets 'round a problem whereby an explicit time range is specified in a 
 query (e.g. timestamp:[then0 TO then1]), and date facets are required for the 
 given time range (in fact, any explicit time range). 
 Because DateMathParser performs all its calculations from 'NOW', remote 
 callers have to work out how long ago 'then0' and 'then1' are from 'now', and 
 use the relative-to-now values in the facet.date.xxx parameters. If a remote 
 server has a different opinion of NOW compared to the caller, the results 
 will be skewed (e.g. they are in a different time-zone, not time-synced etc.).
 This becomes particularly salient when performing distributed date faceting 
 (see SOLR-1709), where multiple shards may all be running with different 
 times, and the faceting needs to be aligned.
 The new parameter is called 'facet.date.now', and takes as a parameter a 
 (stringified) long that is the number of milliseconds from the epoch (1 Jan 
 1970 00:00) - i.e. the returned value from a System.currentTimeMillis() call. 
 This was chosen over a formatted date to delineate it from a 'searchable' 
 time and to avoid superfluous date parsing. This makes the value generally a 
 programmatically-set value, but as that is where the use-case is for this type 
 of parameter, this should be ok.
 NOTE: This parameter affects date facet timing only. If there are other areas 
 of a query that rely on 'NOW', these will not interpret this value. This is a 
 broader issue about setting a 'query-global' NOW that all parts of query 
 analysis can share.
 Source files affected:
 FacetParams.java   (holds the new constant FACET_DATE_NOW)
 SimpleFacets.java  getFacetDateCounts() NOW parameter modified
 This PATCH is mildly related to SOLR-1709 (Distributed Date Faceting), but as 
 it's a general change for date faceting, it was deemed deserving of its own 
 patch. I will be updating SOLR-1709 in due course to include the use of this 
 new parameter, after some rfc acceptance.
 A possible enhancement to this is to detect facet.date fields, look for and 
 match these fields in queries (if they exist), and potentially determine 
 automatically the required time skew, if any. There are a whole host of 
 reasons why this could be problematic to implement, so an explicit 
 facet.date.now parameter is the safest route.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (SOLR-1749) debug output should include explanation of what input strings were passed to the analzyers for each field

2010-02-02 Thread Hoss Man (JIRA)
debug output should include explanation of what input strings were passed to 
the analzyers for each field
-

 Key: SOLR-1749
 URL: https://issues.apache.org/jira/browse/SOLR-1749
 Project: Solr
  Issue Type: Wish
  Components: search
Reporter: Hoss Man


Users are frequently confused by the interplay between Query Parsing and 
Query Time Analysis (ie: markup meta-characters like whitespace and quotes, 
multi-word synonyms, Shingles, etc...)  It would be nice if we had more 
debugging output available that would help eliminate this confusion.  The ideal 
API that comes to mind would be to include in the debug output of SearchHandler 
a list of every string that was Analyzed, and what list of field names it was 
analyzed against.  

This info would not only make it clear to users what exactly they should 
cut/paste into the analysis.jsp tool to see how their Analyzer is getting used, 
but also what exactly is being done to their input strings prior to their 
Analyzer being used.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1749) debug output should include explanation of what input strings were passed to the analzyers for each field

2010-02-02 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1282#action_1282
 ] 

Hoss Man commented on SOLR-1749:


This is an idea that's been rolling around in my head for a while, and today I 
thought i'd spend some time experimenting with it.

It seemed like the main implementation challenge would be that by the time you 
are deep enough down in the code to be using an Analyzer, you don't have access 
to the SolrQueryRequest to record the debugging info.

I thought of two potential solutions...

 * Use ThreadLocal to track the debugging info if needed
 * Use Proxy Wrapper classes to record the debugging info if needed

I initially figured that writing proxy classes for SolrQueryRequest, 
IndexSchema, and Analyzer would be relatively straightforward, so i started 
down that path and discovered two annoying problems...

 # IndexSchema is currently final
 # not all code paths use IndexSchema.getQueryAnalyzer(), many fetch the 
FieldTypes and ask them for their Analyzer directly.

The second problem isn't insurmountable, but it complicates things in that it 
would require Proxy wrappers for FieldType as well.  The first problem requires 
a simple change, but carries with it some baggage that i wasn't ready to 
embrace.  In both cases i started to be very bothered by the long term 
maintenance something like this would introduce.  It would be very easy to 
write these Proxy classes that extend IndexSchema, FieldType, and Analyzer but 
it would be just as easy to forget to add the appropriate Proxy methods to them 
down the road when new methods are added to those base classes.

The issue with the FieldType also exposed a flaw in the idea of using 
ThreadLocal: if we only had to worry about IndexSchema.getQueryAnalyzer(), we 
could modify it to check ThreadLocal easily enough, but at the FieldType level 
we would only be able to modify FieldTypes that ship with Solr, and we'd be 
missing any plugin FieldTypes.


So i aborted the experiment, but i figured i should post the feature idea, and 
my existing thoughts, here in case anyone has other suggestions on how it could 
be implemented feasibly.
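The proxy idea can be sketched in miniature like so -- the interface below is a stand-in, not Lucene's actual Analyzer API, which is exactly where the maintenance problems described above come from:

```java
import java.util.*;

// Miniature version of the proxy-wrapper idea: record every (field, text)
// pair handed to the analyzer, then delegate the real work. SimpleAnalyzer
// is a hypothetical stand-in interface for illustration only.
public class RecordingAnalyzer {
    interface SimpleAnalyzer {
        List<String> analyze(String field, String text);
    }

    private final SimpleAnalyzer delegate;
    final List<String> recorded = new ArrayList<>();  // debug trail for the response

    RecordingAnalyzer(SimpleAnalyzer delegate) { this.delegate = delegate; }

    List<String> analyze(String field, String text) {
        recorded.add(field + ":" + text);   // capture what was actually analyzed
        return delegate.analyze(field, text);
    }
}
```

The recorded list is what would end up in the debug output; the hard part, as noted above, is getting every code path (including plugin FieldTypes) to go through the wrapper.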

 debug output should include explanation of what input strings were passed to 
 the analzyers for each field
 -

 Key: SOLR-1749
 URL: https://issues.apache.org/jira/browse/SOLR-1749
 Project: Solr
  Issue Type: Wish
  Components: search
Reporter: Hoss Man

 Users are frequently confused by the interplay between Query Parsing and 
 Query Time Analysis (ie: markup meta-characters like whitespace and quotes, 
 multi-word synonyms, Shingles, etc...)  It would be nice if we had more 
 debugging output available that would help eliminate this confusion.  The 
 ideal API that comes to mind would be to include in the debug output of 
 SearchHandler a list of every string that was Analyzed, and what list of 
 field names it was analyzed against.  
 This info would not only make it clear to users what exactly they should 
 cut/paste into the analysis.jsp tool to see how their Analyzer is getting 
 used, but also what exactly is being done to their input strings prior to 
 their Analyzer being used.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (SOLR-1739) index of facet fields are not same as original string in record

2010-01-29 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-1739.


Resolution: Not A Problem

w/o knowing more details about your schema, this seems to be working exactly as 
expected.

If you have a question about behavior you are seeing from solr, and are not 
100% certain that it is a bug (bug == not functioning as documented) then you 
should post a question to the solr-user mailing list before opening a Jira 
issue.

In a nutshell: faceting works based on the indexed values; if you need/want 
different constraints to be displayed for a facet field, then you should use a 
different analyzer.
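As a sketch of the usual approach (hypothetical schema fragment; the field and type names are invented for illustration), the analyzed field can be copied into an unanalyzed {{string}} field and faceting done on that copy, so the displayed constraints match the original values:

{code}
<!-- hypothetical sketch: facet on an unanalyzed copy of the field -->
<field name="g_number" type="text" indexed="true" stored="true"/>
<field name="g_number_raw" type="string" indexed="true" stored="false"/>
<copyField source="g_number" dest="g_number_raw"/>
{code}

Faceting with {{facet.field=g_number_raw}} would then return values like {{G-EUPE}} verbatim.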

 index of facet fields are not same as original string in record
 ---

 Key: SOLR-1739
 URL: https://issues.apache.org/jira/browse/SOLR-1739
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.4
 Environment: Solr search engine is deployed in Tomcat and running on 
 Windows.
Reporter: Uma Maheswari

 Hi, 
  I am new to Solr. I found facets fields does not reflect the original string 
 in the record. For example, 
 the returned xml is:
 {code}
 <doc>
   <str name="g_number">G-EUPE</str>
 </doc>
 <lst name="facet_counts">
   <lst name="facet_queries"/>
   <lst name="facet_fields">
     <lst name="g_number">
       <int name="gupe">1</int>
     </lst>
   </lst>
   <lst name="facet_dates"/>
 </lst>
 {code}
  Here, G-EUPE is displayed under the facet field as 'gupe', which is lowercase 
  and missing the '-' from the original string. Is there any way we could fix 
  this to match the original text in the record? Thanks in advance. 
 Regards, 
 uma

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1603) Perl Response Writer

2010-01-29 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12806616#action_12806616
 ] 

Hoss Man commented on SOLR-1603:



{quote}
the output is a complex Perl data structure with search results which would 
presumably immediately be assigned to a variable - not eval'd.
Absolutely agree with Erik and Yonik - I can't think of a realistic case in 
which this would present a security risk.
{quote}

The only way (i know of) to utilize a string-based representation of a data 
structure like this in perl is to use eval to convert it from a string 
representation to the intended data structures...

bq. I'm aware of the risk of eval'ing untrusted strings, but I'm not sure how 
this could be a problem with a Solr response.

...The issue is that if you have a network service whose output format is only 
useful when evaled by the client, then even if that service only ever 
produces serialized data (and not serialized code) it still opens the client up 
to man-in-the-middle attacks, where a malicious server can generate a response 
that _does_ include malicious code, and that code is executed by the client ... 
man-in-the-middle attacks on something like XML that provide tainted data are 
bad enough, but the possibility of tainted code is really sketchy.

As i said before: i'm not making any statements about this patch being 
more/less safe than any of the other existing response writers that are only 
useful when evaled in a particular language interpreter -- my point was that 
while I have never had any clear notion about how/when evaling strings from an 
external source was considered acceptable in those language communities (the 
example of python's literal_eval is a good one), I _am_ a heavy perl user, and 
i do know that the Perl community as a whole actively discourages using eval to 
deserialize perl from remote services -- this is precisely why things like 
YAML and the Storable API were created.  Both have options to control how they 
should behave if/when code is encountered in the serialized data.

I can see value in adding an output format designed to be trivially useful for 
perl, but i don't feel comfortable advertising something for Perl users that 
directly violates Perl best practices -- particularly when we already have two 
writers that are fairly easy to use from Perl anyway (XML and JSON).
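The `literal_eval` example cited above can be made concrete (a Python sketch, since that is the example the comment names; the response strings are invented): a safe deserializer accepts serialized literal data but rejects anything containing executable code, whereas bare `eval()` would happily run it.

```python
# Sketch of the point above using Python's ast.literal_eval (the example
# cited in the comment). The response strings here are invented.
import ast

data_response = "{'numFound': 2, 'docs': ['a', 'b']}"  # plain serialized data
evil_response = "__import__('os').getcwd()"            # code, not data

def safe_deserialize(s):
    """Parse a Python-literal response without executing any code."""
    return ast.literal_eval(s)  # raises ValueError on non-literal input

parsed = safe_deserialize(data_response)

# A man-in-the-middle payload containing code is rejected, not executed.
try:
    safe_deserialize(evil_response)
    blocked = False
except ValueError:
    blocked = True
```

This is exactly the property that plain `eval`-based deserialization of a Perl response lacks, and why YAML/Storable-style deserializers were preferred by the Perl community.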




 Perl Response Writer
 

 Key: SOLR-1603
 URL: https://issues.apache.org/jira/browse/SOLR-1603
 Project: Solr
  Issue Type: New Feature
  Components: Response Writers
Reporter: Claudio Valente
Priority: Minor
 Attachments: SOLR-1603.2.patch, SOLR-1603.patch


 I've made a patch that implements a Perl response writer for Solr.
 It's nan/inf and unicode aware.
 I don't know whether some fields can be binary but if so I can probably 
 extend it to support that.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1677) Add support for o.a.lucene.util.Version for BaseTokenizerFactory and BaseTokenFilterFactory

2010-01-26 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12805167#action_12805167
 ] 

Hoss Man commented on SOLR-1677:


bq. And here are the JIRA issues for stemming bugs, since you didn't take my 
hint to go and actually read them.

sigh.  I read both those issues when you filed them, and I agreed with your 
assessment that they are bugs we should fix -- if i had thought you were wrong 
i would have said so in the issue comments.

But that doesn't change the fact that sometimes people depend on buggy behavior 
-- and sometimes those people depend on the buggy behavior without even 
realizing it.  Bug fixes in a stemmer might make it more correct according to 
the stemmer algorithm specification, or the language semantics, but in some 
peculiar use cases an application might find the correct implementation less 
useful than the previous buggy version.

This is one reason why things like CHANGES.txt are important: to draw attention 
to what has changed between two versions of a piece of software, so people can 
make informed opinions about what they should test in their own applications 
when they upgrade things under the covers.  luceneMatchVersion should be no 
different.  We should try to find a simple way to inform people, when they 
switch from luceneMatchVersion=X to luceneMatchVersion=Y, exactly which bug 
fixes they will get, so they know what to test to determine if they are 
adversely affected by a fix in some way (and can find their own workaround)

bq. Perhaps you should come up with a better example than stemming, as you 
don't know what you are talking about.

1) It's true, I frequently don't know what i'm talking about ... this issue was 
a prime example, and i thank you, Uwe, and Miller for helping me realize that i 
was completely wrong in my understanding about the intended purpose of 
o.a.l.Version, and that a global setting for it in Solr makes total sense -- 
But that doesn't make my concerns about documenting the effects of that global 
setting any less valid.

2) Perhaps you should read the StopFilter example i already posted in my last 
comment...

{quote}
bq. Robert mentioned in an earlier comment that StopFilter's position increment 
behavior changes depending on the luceneMatchVersion -- what if an existing 
Solr 1.3 user notices a bug in some Tokenizer, and adds 
{{<luceneMatchVersion>3.0</luceneMatchVersion>}} to his schema.xml to fix it.  
Without clear documentation on _everything_ that is affected when doing that, he 
may not realize that StopFilter changed at all -- and even though the position 
increment behavior may now be more correct, it might drastically change the 
results he gets when using dismax with a particular qs or ps value.  Hence my 
point that this becomes a serious documentation concern: finding a way to make 
it clear to users what they need to consider when modifying luceneMatchVersion.
{quote}

 Add support for o.a.lucene.util.Version for BaseTokenizerFactory and 
 BaseTokenFilterFactory
 ---

 Key: SOLR-1677
 URL: https://issues.apache.org/jira/browse/SOLR-1677
 Project: Solr
  Issue Type: Sub-task
  Components: Schema and Analysis
Reporter: Uwe Schindler
 Attachments: SOLR-1677.patch, SOLR-1677.patch, SOLR-1677.patch, 
 SOLR-1677.patch


 Since Lucene 2.9, a lot of analyzers use a Version constant to keep backwards 
 compatibility with old indexes created using older versions of Lucene. The 
 most important example is StandardTokenizer, which changed its behaviour with 
 posIncr and incorrect host token types in 2.4 and also in 2.9.
 In Lucene 3.0 this matchVersion ctor parameter is mandatory and in 3.1, with 
 much more Unicode support, almost every Tokenizer/TokenFilter needs this 
 Version parameter. In 2.9, the deprecated old ctors without Version take 
 LUCENE_24 as default to mimic the old behaviour, e.g. in StandardTokenizer.
 This patch adds basic support for the Lucene Version property to the base 
 factories. Subclasses then can use the luceneMatchVersion decoded enum (in 
 3.0) / Parameter (in 2.9) for constructing Tokenstreams. The code currently 
 contains a helper map to decode the version strings, but in 3.0 it can be 
 replaced by Version.valueOf(String), as the Version is a subclass of Java5 
 enums. The default value is Version.LUCENE_24 (as this is the default for the 
 no-version ctors in Lucene).
 This patch also removes unneeded conversions to CharArraySet from 
 StopFilterFactory (now done by Lucene since 2.9). The generics are also fixed 
 to match Lucene 3.0.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1603) Perl Response Writer

2010-01-26 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12805186#action_12805186
 ] 

Hoss Man commented on SOLR-1603:


I realize this is analogous to the python, php, and ruby writers, but while i 
can't speak much to how those (language) communities feel about evaling code 
from remote sources to generate data structures, i know that the majority of 
the Perl community considers that a bad practice ... it's the reason things 
like YAML were created: to allow simple serialization w/o needing to execute 
untrusted code.

So i'm a little leery about adding this (beyond my general leeryness of adding 
code w/o tests).

 Perl Response Writer
 

 Key: SOLR-1603
 URL: https://issues.apache.org/jira/browse/SOLR-1603
 Project: Solr
  Issue Type: New Feature
  Components: Response Writers
Reporter: Claudio Valente
Priority: Minor
 Attachments: SOLR-1603.patch


 I've made a patch that implements a Perl response writer for Solr.
 It's nan/inf and unicode aware.
 I don't know whether some fields can be binary but if so I can probably 
 extend it to support that.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1718) Carriage return should submit query admin form

2010-01-26 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12805190#action_12805190
 ] 

Hoss Man commented on SOLR-1718:


I don't understand what you mean.  Both forms use a {{textarea}}; why should 
the behavior of one textarea be different from the behavior of the other (and 
every other HTML textarea on the web)?

 Carriage return should submit query admin form
 --

 Key: SOLR-1718
 URL: https://issues.apache.org/jira/browse/SOLR-1718
 Project: Solr
  Issue Type: Improvement
  Components: web gui
Affects Versions: 1.4
Reporter: David Smiley
Priority: Minor

 Hitting the carriage return on the keyboard should submit the search query on 
 the admin front screen.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory

2010-01-26 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12805247#action_12805247
 ] 

Hoss Man commented on SOLR-1725:


Some random comments/questions from the peanut gallery...

1) what is the value-add in making ScriptUpdateProcessorFactory support 
multiple scripts? ... wouldn't it be simpler to require that users declare 
multiple instances of ScriptUpdateProcessorFactory (which the processor chain 
already executes in sequence) than to add sequential processing to the 
ScriptUpdateProcessor?

2) The NamedList init args can be as deep of a data structure as you want, so 
something like this would be totally feasible (if desired) ...

{code}
<processor class="solr.ScriptUpdateProcessorFactory">
  <lst name="scripts">
    <lst name="updateProcessor1.js">
      <bool name="someParamName">true</bool>
      <int name="someOtherParamName">3</int>
    </lst>
    <lst name="updateProcessor2.js">
      <bool name="fooParam">true</bool>
      <str name="barParam">3</str>
    </lst>
  </lst>
  <lst name="otherProcessorOptionsIfNeeded">
    ...
  </lst>
</processor>
{code}

 Script based UpdateRequestProcessorFactory
 --

 Key: SOLR-1725
 URL: https://issues.apache.org/jira/browse/SOLR-1725
 Project: Solr
  Issue Type: New Feature
  Components: update
Affects Versions: 1.4
Reporter: Uri Boness
 Attachments: SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, 
 SOLR-1725.patch, SOLR-1725.patch


 A script based UpdateRequestProcessorFactory (Uses JDK6 script engine 
 support). The main goal of this plugin is to be able to configure/write 
 update processors without the need to write and package Java code.
 The update request processor factory enables writing update processors in 
 scripts located in the {{solr.solr.home}} directory. The factory accepts one 
 (mandatory) configuration parameter named {{scripts}} which accepts a 
 comma-separated list of file names. It will look for these files under the 
 {{conf}} directory in solr home. When multiple scripts are defined, their 
 execution order is defined by the lexicographical order of the script file 
 name (so {{scriptA.js}} will be executed before {{scriptB.js}}).
 The script language is resolved based on the script file extension (that is, 
 a *.js files will be treated as a JavaScript script), therefore an extension 
 is mandatory.
 Each script file is expected to have one or more methods with the same 
 signature as the methods in the {{UpdateRequestProcessor}} interface. It is 
 *not* required to define all methods, only those that are required by the 
 processing logic.
 The following variables are defined as global variables for each script:
  * {{req}} - The SolrQueryRequest
  * {{rsp}} - The SolrQueryResponse
  * {{logger}} - A logger that can be used for logging purposes in the script
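A hedged sketch of what such a script might look like (hypothetical: the field name, the transformation, and the exact shape of the command object are invented here; the real hook signatures depend on the patch's bridging code, which is not shown in this issue):

```javascript
// Hypothetical update-processor script (e.g. conf/updateProcessor1.js).
// Assumes functions matching UpdateRequestProcessor methods are picked up
// by the factory, with req/rsp/logger available as globals as described
// above. The "title" field and its normalization are invented.
function processAdd(cmd) {
  var doc = cmd.doc;
  // normalize a hypothetical "title" field before indexing
  if (doc.title) {
    doc.title = doc.title.trim().toLowerCase();
  }
  return doc;
}

function processDelete(cmd) {
  // no-op: not every method needs to be defined
  return cmd;
}
```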

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1728) ResponseWriters should support byte[], ByteBuffer

2010-01-26 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12805291#action_12805291
 ] 

Hoss Man commented on SOLR-1728:


Noble: your issue description is a bit terse, so i'm a little confused.

Are you suggesting an API change such that binary write methods are added to 
QueryResponseWriter (making it equivalent to BinaryQueryResponseWriter) ?  

Or are you suggesting that the existing classes which implement 
QueryResponseWriter ( JSONResponseWriter, PHPResponseWriter, 
PythonResponseWriter, XMLResponseWriter,  etc...) should start implementing 
BinaryQueryResponseWriter?

In either case: what's the motivation?

 ResponseWriters should support byte[], ByteBuffer
 -

 Key: SOLR-1728
 URL: https://issues.apache.org/jira/browse/SOLR-1728
 Project: Solr
  Issue Type: Improvement
Reporter: Noble Paul
Assignee: Noble Paul
Priority: Minor
 Fix For: 1.5


 Only BinaryResponseWriter supports byte[] and ByteBuffer. Other writers also 
 should support these.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1729) Date Facet now override time parameter

2010-01-26 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12805313#action_12805313
 ] 

Hoss Man commented on SOLR-1729:


bq. (e.g. they are in a different time-zone, not time-synced etc.).

time-zones should be irrelevant since all calculations are done in UTC ... lack 
of time-sync is a legitimate concern, but the more serious problem is 
distributed requests and network lag.  Even if all of the boxes have 
synchronized clocks, they might not all get queried at the exact same time, and 
multiple requests might be made to a single server for different phases of the 
distributed request that expect to get the same answers.

It should be noted that while adding support to date faceting for this type of 
"when is now?" question is certainly _necessary_ to make distributed date 
faceting work sanely, it is not _sufficient_ ... unless filter queries that use 
date math also respect it, the counts returned from date faceting will still 
potentially be nonsensical.

 Date Facet now override time parameter
 --

 Key: SOLR-1729
 URL: https://issues.apache.org/jira/browse/SOLR-1729
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 1.4
 Environment: Solr 1.4
Reporter: Peter Sturge
Priority: Minor
 Attachments: FacetParams.java, SimpleFacets.java


 This PATCH introduces a new query parameter that tells a (typically, but not 
 necessarily) remote server what time to use as 'NOW' when calculating date 
 facets for a query (and, for the moment, date facets *only*) - overriding the 
 default behaviour of using the local server's current time.
 This gets 'round a problem whereby an explicit time range is specified in a 
 query (e.g. timestamp:[then0 TO then1]), and date facets are required for the 
 given time range (in fact, any explicit time range). 
 Because DateMathParser performs all its calculations from 'NOW', remote 
 callers have to work out how long ago 'then0' and 'then1' are from 'now', and 
 use the relative-to-now values in the facet.date.xxx parameters. If a remote 
 server has a different opinion of NOW compared to the caller, the results 
 will be skewed (e.g. they are in a different time-zone, not time-synced etc.).
 This becomes particularly salient when performing distributed date faceting 
 (see SOLR-1709), where multiple shards may all be running with different 
 times, and the faceting needs to be aligned.
 The new parameter is called 'facet.date.now', and takes as a parameter a 
 (stringified) long that is the number of milliseconds from the epoch (1 Jan 
 1970 00:00) - i.e. the returned value from a System.currentTimeMillis() call. 
 This was chosen over a formatted date to delineate it from a 'searchable' 
 time and to avoid superfluous date parsing. This makes the value generally a 
 programmatically-set value, but as that is where the use-case is for this type 
 of parameter, this should be ok.
 NOTE: This parameter affects date facet timing only. If there are other areas 
 of a query that rely on 'NOW', these will not interpret this value. This is a 
 broader issue about setting a 'query-global' NOW that all parts of query 
 analysis can share.
 Source files affected:
 FacetParams.java   (holds the new constant FACET_DATE_NOW)
 SimpleFacets.java  getFacetDateCounts() NOW parameter modified
 This PATCH is mildly related to SOLR-1709 (Distributed Date Faceting), but as 
 it's a general change for date faceting, it was deemed deserving of its own 
 patch. I will be updating SOLR-1709 in due course to include the use of this 
 new parameter, after some rfc acceptance.
 A possible enhancement to this is to detect facet.date fields, look for and 
 match these fields in queries (if they exist), and potentially determine 
 automatically the required time skew, if any. There are a whole host of 
 reasons why this could be problematic to implement, so an explicit 
 facet.date.now parameter is the safest route.
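As a client-side sketch (hypothetical caller code; only {{facet.date.now}} is the proposed parameter, the other parameters are standard date-faceting parameters and the query/field values are invented), the coordinating caller would compute one value the same way Java's System.currentTimeMillis() does and send it to every shard:

```python
# Hedged sketch of client-side use of the proposed facet.date.now parameter.
# Its value is stringified milliseconds since the epoch, matching Java's
# System.currentTimeMillis(). The query and field names are invented.
import time

def facet_now_param(now_seconds=None):
    """Return epoch milliseconds as the stringified facet.date.now value."""
    if now_seconds is None:
        now_seconds = time.time()
    return str(int(now_seconds * 1000))

# One fixed NOW, shared by every shard request in the distributed query.
params = {
    "q": "*:*",
    "facet": "true",
    "facet.date": "timestamp",
    "facet.date.gap": "+1HOUR",
    "facet.date.now": facet_now_param(),
}
```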

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


