[jira] Closed: (SOLR-1879) Error loading class 'Solr.ASCIIFoldingFilterFactory'
[ https://issues.apache.org/jira/browse/SOLR-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi closed SOLR-1879. Resolution: Not A Problem Adlene, please use the solr-user mailing list for getting help. http://lucene.apache.org/solr/mailing_lists.html
> Error loading class 'Solr.ASCIIFoldingFilterFactory'
> Key: SOLR-1879
> URL: https://issues.apache.org/jira/browse/SOLR-1879
> Project: Solr
> Issue Type: Bug
> Components: Schema and Analysis
> Affects Versions: 1.4
> Environment: Windows XP, Apache Tomcat 6
> Reporter: adlene sifi
>
> I am trying to use the Solr.ASCIIFoldingFilterFactory filter as follows:
> <filter class="Solr.ASCIIFoldingFilterFactory"/>
> <filter class="solr.StopFilterFactory" ignoreCase="true" words="french_stopwords.txt" enablePositionIncrements="true"/>
> <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
> ...
> However, I receive the following error message when restarting the Apache Tomcat server:
> GRAVE: org.apache.solr.common.SolrException: Error loading class 'Solr.ASCIIFoldingFilterFactory'
> at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:373)
> at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:388)
> ...
> Caused by: java.lang.ClassNotFoundException: Solr.ASCIIFoldingFilterFactory
> at java.net.URLClassLoader$1.run(Unknown Source)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(Unknown Source)
> at java.lang.ClassLoader.loadClass(Unknown Source)
> ... 40 more
> Could you please help me with that?
> Thanks a lot
> Adlene
-- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (SOLR-1878) RelaxQueryComponent - A new SearchComponent that relaxes the main query in a semiautomatic way
[ https://issues.apache.org/jira/browse/SOLR-1878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated SOLR-1878:
Summary: RelaxQueryComponent - A new SearchComponent that relaxes the main query in a semiautomatic way (was: RelaxQueryComponent - A new SearchComponent that relaxes the main in a semiautomatic way)
Description:
I have the following use case: Imagine that you visit a web page for searching for an apartment to rent. You choose parameters, usually by marking check boxes, and this produces AND queries:
{code}
rent:[* TO 1500] AND bedroom:[2 TO *] AND floor:[100 TO *]
{code}
If the conditions are too tight, Solr may return few or zero leasehold properties. Because this is bad for both the site visitors and the owners, the owner may want to recommend that visitors relax the conditions, something like:
{code}
rent:[* TO 1700] AND bedroom:[2 TO *] AND floor:[100 TO *]
{code}
or:
{code}
rent:[* TO 1500] AND bedroom:[2 TO *] AND floor:[90 TO *]
{code}
And if the relaxed query gets a larger numFound than the original, the web page can provide a link with a comment such as "if you can pay an additional $100, ${numFound} properties will be found!".
Today, I would need to implement a Solr client for this scenario, but that requires two round trips to show one page and introduces a consistency problem (and it is laborious, of course!).
I'm thinking of a new SearchComponent that can be used with QueryComponent. It does a search when numFound of the main query is less than a threshold. Clients can specify via request parameters how the query can be relaxed.
[jira] Created: (SOLR-1878) RelaxQueryComponent - A new SearchComponent that relaxes the main in a semiautomatic way
RelaxQueryComponent - A new SearchComponent that relaxes the main in a semiautomatic way
Key: SOLR-1878
URL: https://issues.apache.org/jira/browse/SOLR-1878
Project: Solr
Issue Type: New Feature
Components: SearchComponents - other
Affects Versions: 1.4
Reporter: Koji Sekiguchi
Priority: Minor
I have the following use case: Imagine that you visit a web page for searching for an apartment to rent. You choose parameters, usually by marking check boxes, and this produces AND queries:
{code}
rent:[* TO 1500] AND bedroom:[2 TO *] AND floor:[100 TO *]
{code}
If the conditions are too tight, Solr may return few or zero leasehold properties. Because this is bad for both the site visitors and the owners, the owner may want to recommend that visitors relax the conditions, something like:
{code}
rent:[* TO 1700] AND bedroom:[2 TO *] AND floor:[100 TO *]
{code}
or:
{code}
rent:[* TO 1500] AND bedroom:[2 TO *] AND floor:[90 TO *]
{code}
And if the relaxed query gets a larger numFound than the original, the web page can provide a link with a comment such as "if you can pay an additional $100, ${numFound} properties will be found!".
Today, I would need to implement a client for this scenario, but that requires two round trips to show one page and introduces a consistency problem (and it is laborious, of course!).
I'm thinking of a new SearchComponent that can be used with QueryComponent. It does a search when numFound of the main query is less than a threshold. Clients can specify via request parameters how the query can be relaxed.
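The relaxation step described above can be sketched outside Solr. The class and method names below are hypothetical (not from any patch); the sketch only shows the kind of string rewrite a client, or the proposed component, would perform on a numeric range clause:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of query relaxation: widen the upper bound of a
 *  range clause such as rent:[* TO 1500] by a client-supplied delta. */
public class RelaxSketch {
    // Matches clauses of the form field:[lower TO upper] with a numeric upper bound.
    private static final Pattern RANGE =
        Pattern.compile("(\\w+):\\[(\\*|\\d+) TO (\\d+)\\]");

    /** Returns the query with the named field's upper bound raised by delta. */
    static String relaxUpperBound(String query, String field, int delta) {
        Matcher m = RANGE.matcher(query);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            if (m.group(1).equals(field)) {
                int upper = Integer.parseInt(m.group(3)) + delta;
                m.appendReplacement(sb,
                    m.group(1) + ":[" + m.group(2) + " TO " + upper + "]");
            }
        }
        m.appendTail(sb); // copies untouched clauses through unchanged
        return sb.toString();
    }

    public static void main(String[] args) {
        String q = "rent:[* TO 1500] AND bedroom:[2 TO *] AND floor:[100 TO *]";
        System.out.println(relaxUpperBound(q, "rent", 200));
        // rent:[* TO 1700] AND bedroom:[2 TO *] AND floor:[100 TO *]
    }
}
```

With the relaxed string in hand, the component would re-run the query and compare numFound against the original, as the description proposes.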
[jira] Updated: (SOLR-860) moreLikeThis Degug
[ https://issues.apache.org/jira/browse/SOLR-860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated SOLR-860:
Component/s: (was: search) SearchComponents - other
Priority: Minor (was: Major)
Fix Version/s: (was: 1.5) 3.1
> moreLikeThis Degug
> Key: SOLR-860
> URL: https://issues.apache.org/jira/browse/SOLR-860
> Project: Solr
> Issue Type: New Feature
> Components: SearchComponents - other
> Affects Versions: 1.3
> Environment: Gentoo Linux, Solr 1.4, tomcat webserver
> Reporter: Jeff
> Assignee: Koji Sekiguchi
> Priority: Minor
> Fix For: 3.1
> Attachments: SOLR-860.patch
>
> moreLikeThis searchcomponent currently has no way to debug or see information on the process. This means that if moreLikeThis suggests another document there is no way to actually view why it picked that to hone the searching. Adding an explain would be extremely useful in determining the reasons why solr is recommending the items.
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-860) moreLikeThis Degug
[ https://issues.apache.org/jira/browse/SOLR-860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated SOLR-860:
Attachment: SOLR-860.patch
With the attached patch, the BooleanQueries constructed by MLT and by the MLT helper function can be seen in the debug area. Sample request and response:
{code}
http://localhost:8983/solr/select/?q=solr+ipod&indent=on&mlt=on&mlt.fl=features&mlt.mintf=1&mlt.count=2&debugQuery=on&wt=json
{code}
{code}
"debug":{
  "moreLikeThis":{
    "IW-02":{ "rawMLTQuery":"", "boostedMLTQuery":"", "realMLTQuery":"+() -id:IW-02"},
    "SOLR1000":{ "rawMLTQuery":"", "boostedMLTQuery":"", "realMLTQuery":"+() -id:SOLR1000"},
    "F8V7067-APL-KIT":{ "rawMLTQuery":"", "boostedMLTQuery":"", "realMLTQuery":"+() -id:F8V7067-APL-KIT"},
    "MA147LL/A":{
      "rawMLTQuery":"features:2 features:0 features:lcd features:x features:3",
      "boostedMLTQuery":"features:2 features:0 features:lcd features:x features:3",
      "realMLTQuery":"+(features:2 features:0 features:lcd features:x features:3) -id:MA147LL/A"}},
}
{code}
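The realMLTQuery strings in the debug output above follow a simple shape: the boosted MLT query wrapped as a required clause, plus a prohibited clause excluding the source document by its unique key. A minimal sketch of that assembly; the class and method names are illustrative, not from the patch:

```java
/** Hypothetical sketch of how a realMLTQuery string like those in the
 *  debug output is assembled: require the boosted MLT clauses and
 *  exclude the source document by its unique key field. */
public class MltDebugSketch {
    static String realMltQuery(String boostedMltQuery, String uniqueKeyField, String docId) {
        return "+(" + boostedMltQuery + ") -" + uniqueKeyField + ":" + docId;
    }

    public static void main(String[] args) {
        System.out.println(realMltQuery(
            "features:2 features:0 features:lcd", "id", "MA147LL/A"));
        // +(features:2 features:0 features:lcd) -id:MA147LL/A
    }
}
```

This also explains the empty-looking entries such as "+() -id:IW-02": when no MLT terms survive (here, because mlt.fl=features and the document has no qualifying terms), the required clause is empty and only the exclusion remains.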
[jira] Commented: (SOLR-860) moreLikeThis Degug
[ https://issues.apache.org/jira/browse/SOLR-860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851072#action_12851072 ] Koji Sekiguchi commented on SOLR-860:
At minimum, I'd like to see what the BooleanQuery constructed by MLT looks like. Can ResponseBuilder.addDebugInfo() be used for it?
[jira] Assigned: (SOLR-860) moreLikeThis Degug
[ https://issues.apache.org/jira/browse/SOLR-860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi reassigned SOLR-860:
Assignee: Koji Sekiguchi
[jira] Updated: (SOLR-1703) Sorting by function problems on multicore (more than one core)
[ https://issues.apache.org/jira/browse/SOLR-1703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated SOLR-1703:
Description:
When using sort by function (for example the dist function) on a multicore setup with more than one core (with a single core, i.e. the example deployment, the problem doesn't exist), there is a problem with the right schema not being used. I think the problem is in this portion of code in QueryParsing.java:
{code}
public static FunctionQuery parseFunction(String func, IndexSchema schema) throws ParseException {
  SolrCore core = SolrCore.getSolrCore();
  return (FunctionQuery) (QParser.getParser(func, "func", new LocalSolrQueryRequest(core, new HashMap())).parse());
  // return new FunctionQuery(parseValSource(new StrParser(func), schema));
}
{code}
The code above uses a deprecated method to get the core, sometimes getting the wrong core and making it impossible to find the right fields in the index.
> Sorting by function problems on multicore (more than one core)
> Key: SOLR-1703
> URL: https://issues.apache.org/jira/browse/SOLR-1703
> Project: Solr
> Issue Type: Bug
> Components: multicore, search
> Affects Versions: 1.5
> Environment: Linux (debian, ubuntu), 64bits
> Reporter: Rafał Kuć
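The quoted code fetches a core from a static singleton even though the caller already passed in the right schema. A self-contained toy of that failure mode (no Solr classes; every name below is hypothetical): a last-registered singleton answers for the wrong core as soon as a second core exists, while consulting the caller-supplied core behaves correctly:

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the bug pattern described above: a static "current core"
 *  singleton returns whichever core was registered last, while the
 *  core/schema the caller already holds identifies the one it meant. */
public class WrongCoreSketch {
    static class Core {
        final String name;
        final Map<String, String> fields = new HashMap<>();
        Core(String name) { this.name = name; }
    }

    static Core current; // stands in for a deprecated global getter

    static Core register(String name) { current = new Core(name); return current; }

    /** Buggy lookup: consults the singleton, ignoring the core passed in. */
    static boolean hasFieldBuggy(Core ignoredRightCore, String field) {
        return current.fields.containsKey(field);
    }

    /** Fixed lookup: uses the core the caller supplied. */
    static boolean hasFieldFixed(Core rightCore, String field) {
        return rightCore.fields.containsKey(field);
    }

    public static void main(String[] args) {
        Core a = register("coreA");
        a.fields.put("dist", "tdouble"); // coreA's schema has the sort field
        register("coreB");               // second core, no "dist" field

        System.out.println(hasFieldBuggy(a, "dist")); // false: wrong core consulted
        System.out.println(hasFieldFixed(a, "dist")); // true
    }
}
```

The commented-out line in the quoted patch hunk points in the fixed direction: parse against the IndexSchema argument instead of reaching for the global core.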
[jira] Commented: (SOLR-1268) Incorporate Lucene's FastVectorHighlighter
[ https://issues.apache.org/jira/browse/SOLR-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839879#action_12839879 ] Koji Sekiguchi commented on SOLR-1268:
bq. When using Dismax, the fast vector highlighter fails to return any highlighting when there is more than one column in qf (eg. "qf=Name Company")...
Right. See https://issues.apache.org/jira/browse/LUCENE-2243 .
> Incorporate Lucene's FastVectorHighlighter
> Key: SOLR-1268
> URL: https://issues.apache.org/jira/browse/SOLR-1268
> Project: Solr
> Issue Type: New Feature
> Components: highlighter
> Reporter: Koji Sekiguchi
> Assignee: Koji Sekiguchi
> Priority: Minor
> Fix For: 1.5
> Attachments: SOLR-1268-0_fragsize.patch, SOLR-1268-0_fragsize.patch, SOLR-1268.patch, SOLR-1268.patch, SOLR-1268.patch
[jira] Updated: (SOLR-1297) Enable sorting by Function Query
[ https://issues.apache.org/jira/browse/SOLR-1297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated SOLR-1297:
Attachment: SOLR-1297-2.patch
When I set a *bit* complex function as the sort parameter, I got this error:
{panel}
Must declare sort field or function
org.apache.solr.common.SolrException: Must declare sort field or function
at org.apache.solr.search.QueryParsing.processSort(QueryParsing.java:376)
at org.apache.solr.search.QueryParsing.parseSort(QueryParsing.java:281)
at org.apache.solr.search.QueryParsingTest.testSort(QueryParsingTest.java:105)
{panel}
Attached are the fix and a test case.
> Enable sorting by Function Query
> Key: SOLR-1297
> URL: https://issues.apache.org/jira/browse/SOLR-1297
> Project: Solr
> Issue Type: New Feature
> Reporter: Grant Ingersoll
> Assignee: Grant Ingersoll
> Priority: Minor
> Fix For: 1.5
> Attachments: SOLR-1297-2.patch, SOLR-1297.patch
>
> It would be nice if one could sort by FunctionQuery. See also SOLR-773, where this was first mentioned by Yonik as part of the generic solution to geo-search
[jira] Issue Comment Edited: (SOLR-1773) Field Collapsing (lightweight version)
[ https://issues.apache.org/jira/browse/SOLR-1773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12833527#action_12833527 ] Koji Sekiguchi edited comment on SOLR-1773 at 2/14/10 8:19 AM:
Oops, I had glanced at the SOLR-236 related issues, but from the Description I thought it was for finalizing the response format. I'll look into SOLR-1682. Thanks! :)
was (Author: koji): Oops, I had glanced at the SOLR-236 related issues, but I wasn't aware of its existence. I'll look into SOLR-1682. Thanks! :)
> Field Collapsing (lightweight version)
> Key: SOLR-1773
> URL: https://issues.apache.org/jira/browse/SOLR-1773
> Project: Solr
> Issue Type: New Feature
> Components: search
> Affects Versions: 1.4
> Reporter: Koji Sekiguchi
> Priority: Minor
> Attachments: LOADTEST.patch, SOLR-1773.patch
>
> I'd like to start another approach for field collapsing, suggested by Yonik on 19/Dec/09 at SOLR-236. Re-posting the idea:
> {code}
> === two pass collapsing algorithm for collapse.aggregate=max
> First pass: pretend that collapseCount=1
> - Use a TreeSet as a priority queue since one can remove and insert entries.
> - A HashMap will be used to map from collapse group to top entry in the TreeSet
> - compare new doc with smallest element in treeset. If smaller discard and go to the next doc.
> - If new doc is bigger, look up its group. Use the Map to find if the group has been added to the TreeSet and add it if not.
> - If the new bigger doc is already in the TreeSet, compare with the document in that group. If bigger, update the node, remove and re-add to the TreeSet to re-sort.
> efficiency: the treeset and hashmap are both only the size of the top number of docs we are looking at (10 for instance)
> We will now have the top 10 documents collapsed by the right field with a collapseCount of 1. Put another way, we have the top 10 groups.
> Second pass (if collapseCount>1):
> - create a priority queue for each group (10) of size collapseCount
> - re-execute the query (or if the sort within the collapse groups does not involve score, we could just use the docids gathered during phase 1)
> - for each document, find its appropriate priority queue and insert
> - optimization: we can use the previous info from phase1 to even avoid creating a priority queue if no other items matched.
> So instead of creating collapse groups for every group in the set (as is done now?), we create it for only 10 groups.
> Instead of collecting the score for every document in the set (40MB per request for a 10M doc index is *big*) we re-execute the query if needed.
> We could optionally store the score as is done now... but I bet aggregate throughput on large indexes would be better by just re-executing.
> Other thought: we could also cache the first phase in the query cache which would allow one to quickly move to the 2nd phase for any collapseCount.
> {code}
> The restriction is:
> {quote}
> one would not be able to tell the total number of collapsed docs, or the total number of hits (or the DocSet) after collapsing. So only collapse.facet=before would be supported.
> {quote}
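The first pass of the quoted algorithm can be sketched as plain Java. The class and method names below are hypothetical (this is not the attached patch): a TreeSet serves as a removable priority queue of at most N entries, and a HashMap maps each collapse group to its current entry in the set:

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical sketch of the first collapsing pass: keep the best document
 *  of each of the top N groups, using a TreeSet as a removable priority
 *  queue plus a HashMap from group key to the group's entry in the set. */
public class CollapseFirstPassSketch {
    static class Doc {
        final int id; final float score; final String group;
        Doc(int id, float score, String group) {
            this.id = id; this.score = score; this.group = group;
        }
    }

    // Score order, with doc id as tie-breaker so the TreeSet never drops ties.
    static final Comparator<Doc> BY_SCORE =
        Comparator.comparingDouble((Doc d) -> d.score).thenComparingInt(d -> d.id);

    /** Returns a map from group key to that group's best doc, for the n best groups. */
    static Map<String, Doc> topGroups(Iterable<Doc> docs, int n) {
        TreeSet<Doc> queue = new TreeSet<>(BY_SCORE);
        Map<String, Doc> groupToEntry = new HashMap<>();
        for (Doc doc : docs) {
            // Compare with the smallest element; if smaller and its group is
            // not already tracked, discard and go to the next doc.
            if (queue.size() >= n && BY_SCORE.compare(doc, queue.first()) <= 0
                    && !groupToEntry.containsKey(doc.group)) {
                continue;
            }
            Doc current = groupToEntry.get(doc.group);
            if (current == null) {            // group not in the TreeSet: add it
                queue.add(doc);
                groupToEntry.put(doc.group, doc);
                if (queue.size() > n) {       // evict the smallest group
                    Doc evicted = queue.pollFirst();
                    groupToEntry.remove(evicted.group);
                }
            } else if (BY_SCORE.compare(doc, current) > 0) {
                queue.remove(current);        // remove and re-add to re-sort
                queue.add(doc);
                groupToEntry.put(doc.group, doc);
            }
        }
        return groupToEntry;
    }

    public static void main(String[] args) {
        Map<String, Doc> top = topGroups(java.util.Arrays.asList(
            new Doc(1, 2.0f, "a"), new Doc(2, 5.0f, "b"),
            new Doc(3, 3.0f, "a"), new Doc(4, 1.0f, "c")), 2);
        System.out.println(top.get("a").id + " " + top.get("b").id); // 3 2
    }
}
```

As the quoted note says, both structures stay bounded by the number of top groups requested (N), not by the result-set size, which is the memory win over collecting scores for every document.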
[jira] Commented: (SOLR-1773) Field Collapsing (lightweight version)
[ https://issues.apache.org/jira/browse/SOLR-1773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12833527#action_12833527 ] Koji Sekiguchi commented on SOLR-1773:
Oops, I had glanced at the SOLR-236 related issues, but I wasn't aware of its existence. I'll look into SOLR-1682. Thanks! :)
[jira] Issue Comment Edited: (SOLR-1773) Field Collapsing (lightweight version)
[ https://issues.apache.org/jira/browse/SOLR-1773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12833495#action_12833495 ] Koji Sekiguchi edited comment on SOLR-1773 at 2/14/10 4:54 AM:
Random comments on the patch:
- TimeAllowed is not supported
- cache is not supported
- distributed search is not supported
- the sort field is hard-coded in the patch
- collapse.type=adjacent is not supported
- collapse.aggregate is not supported (but supportable)
- not yet, but collapse.sort could be supported to specify sort criteria within a collapse group
supported parameters:
|collapse|set to on to use field collapsing|
|collapse.field|field name to collapse on (required)|
|collapse.limit|maximum number of collapsed docs to return in each collapse group. default is 0.|
|collapse.fl|comma- or space-delimited list of fields to return. multiValued fields and TrieFields are not supported yet|
was (Author: koji):
Random comments on the patch:
- TimeAllowed is not supported
- cache is not supported
- distributed search is not supported
- the sort field is hard-coded in the patch
- collapse.type=adjacent is not supported
- collapse.aggregate is not supported (but supportable)
- not yet, but collapse.sort could be supported to specify sort criteria within a collapse group
supported parameters:
|collapse|set to on to use field collapsing|
|collapse.field|field name to collapse on (required)|
|collapse.limit|maximum number of collapsed docs to return in each collapse group|
|collapse.fl|comma- or space-delimited list of fields to return|
[jira] Issue Comment Edited: (SOLR-1773) Field Collapsing (lightweight version)
[ https://issues.apache.org/jira/browse/SOLR-1773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12833495#action_12833495 ] Koji Sekiguchi edited comment on SOLR-1773 at 2/14/10 4:51 AM: --- Random comment on the patch: - TimeAllowed not supported - cache not supported - distributed search is not supported - sort field is hard-coded in the patch - collapse.type=adjacent is not supported - collapse.aggregate is not supported (but supportable) - not yet, but collapse.sort can be supported to specify sort criteria in collapse group supported parameters: |collapse|set to on to use field collapsing| |collapse.field|field name to collapse (required)| |collapse.limit|maximum number of collapsed docs to return in each collapse group| |collapse.fl|comma- or space- delimited list of fields to return| was (Author: koji): Random comment on the patch: - TimeAllowed not supported - cache not supported - distributed search is not supported - sort field is hard-coded in the patch - collapse.type=adjacent is not supported - collapse.aggregate is not supported (but supportable) - not yet, but collapse.sort can be supported supported parameters: |collapse|set to on to use field collapsing| |collapse.field|field name to collapse (required)| |collapse.limit|maximum number of collapsed docs to return in each collapse group| |collapse.fl|comma- or space- delimited list of fields to return| > Field Collapsing (lightweight version) > -- > > Key: SOLR-1773 > URL: https://issues.apache.org/jira/browse/SOLR-1773 > Project: Solr > Issue Type: New Feature > Components: search >Affects Versions: 1.4 >Reporter: Koji Sekiguchi >Priority: Minor > Attachments: LOADTEST.patch, SOLR-1773.patch > > > I'd like to start another approach for field collapsing suggested by Yonik on > 19/Dec/09 at SOLR-236. 
Re-posting the idea: > {code} > === two pass collapsing algorithm for collapse.aggregate=max > > First pass: pretend that collapseCount=1 > - Use a TreeSet as a priority queue since one can remove and insert > entries. > - A HashMap will be used to map from collapse group to > top entry in the TreeSet > - compare new doc with smallest element in treeset. If smaller discard and > go to the next doc. > - If new doc is bigger, look up it's group. Use the Map to find if the > group has been added to the TreeSet and add it if not. > - If the new bigger doc is already in the TreeSet, compare with the > document in that group. If bigger, update the node, > remove and re-add to the TreeSet to re-sort. > efficiency: the treeset and hashmap are both only the size of the top number > of docs we are looking at (10 for instance) > We will now have the top 10 documents collapsed by the right field with a > collapseCount of 1. Put another way, we have the top 10 groups. > Second pass (if collapseCount>1): > - create a priority queue for each group (10) of size collapseCount > - re-execute the query (or if the sort within the collapse groups does not > involve score, we could just use the docids gathered during phase 1) > - for each document, find it's appropriate priority queue and insert > - optimization: we can use the previous info from phase1 to even avoid > creating a priority queue if no other items matched. > So instead of creating collapse groups for every group in the set (as is done > now?), we create it for only 10 groups. > Instead of collecting the score for every document in the set (40MB per > request for a 10M doc index is *big*) we re-execute the query if needed. > We could optionally store the score as is done now... but I bet aggregate > throughput on large indexes would be better by just re-executing. > Other thought: we could also cache the first phase in the query cache which > would allow one to quickly move to the 2nd phase for any collapseCount. 
> {code}
> The restriction is:
> {quote}
> one would not be able to tell the total number of collapsed docs, or the total number of hits (or the DocSet) after collapsing. So only collapse.facet=before would be supported.
> {quote}
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
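The first pass described above can be sketched as self-contained Java using only java.util. This is a minimal illustration, not Solr's actual code: `Doc` and `topGroups` are invented stand-ins, `score` stands in for the hard-coded sort field, and ties are broken by id so distinct docs never compare equal in the TreeSet.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

public class CollapseFirstPass {
    // Stand-in for a scored, collapsible document; not a Solr class.
    record Doc(int id, String group, float score) {}

    // First pass: keep only the single best doc per group, and only the top
    // `rows` groups overall, using a TreeSet (a priority queue that supports
    // removal) plus a HashMap from group -> its entry in the TreeSet.
    static List<Doc> topGroups(Iterable<Doc> docs, int rows) {
        Comparator<Doc> byScore =
                Comparator.comparingDouble(Doc::score).thenComparingInt(Doc::id);
        TreeSet<Doc> queue = new TreeSet<>(byScore);
        Map<String, Doc> topOfGroup = new HashMap<>();

        for (Doc doc : docs) {
            // Smaller than the smallest element of a full queue: discard. (Its
            // group's entry, if any, is in the queue and therefore even bigger.)
            if (queue.size() >= rows && byScore.compare(doc, queue.first()) <= 0) continue;
            Doc current = topOfGroup.get(doc.group());
            if (current == null) {
                // Group not in the TreeSet yet: add it, evicting the smallest group.
                queue.add(doc);
                topOfGroup.put(doc.group(), doc);
                if (queue.size() > rows) topOfGroup.remove(queue.pollFirst().group());
            } else if (byScore.compare(doc, current) > 0) {
                // Bigger doc for a group already present: remove and re-add to re-sort.
                queue.remove(current);
                queue.add(doc);
                topOfGroup.put(doc.group(), doc);
            }
        }
        return new ArrayList<>(queue.descendingSet()); // best group first
    }

    public static void main(String[] args) {
        List<Doc> docs = List.of(
                new Doc(1, "a", 0.9f), new Doc(2, "a", 0.5f),
                new Doc(3, "b", 0.8f), new Doc(4, "c", 0.7f),
                new Doc(5, "b", 0.95f));
        for (Doc d : topGroups(docs, 2)) System.out.println(d.id() + ":" + d.group());
        // prints 5:b then 1:a
    }
}
```

As the idea notes, both the TreeSet and the HashMap stay bounded by `rows`, regardless of how many groups exist in the full result set.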
[jira] Updated: (SOLR-1773) Field Collapsing (lightweight version)
[ https://issues.apache.org/jira/browse/SOLR-1773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated SOLR-1773: - Attachment: LOADTEST.patch A very rough/simple load-test patch attached. The QTime averages over 1,000 random queries were:
||num docs in index||SOLR-236||SOLR-1773||
|1M|321 ms|185 ms|
|10M|2,914 ms (*)|1,642 ms|
(*) I needed to set -Xmx1024m in this case (512m for the other cases) to avoid OOM. SOLR-1773 is 43% faster.
[jira] Commented: (SOLR-1773) Field Collapsing (lightweight version)
[ https://issues.apache.org/jira/browse/SOLR-1773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12833495#action_12833495 ] Koji Sekiguchi commented on SOLR-1773: -- Random comments on the patch:
- timeAllowed is not supported
- cache is not supported
- distributed search is not supported
- the sort field is hard-coded in the patch
- collapse.type=adjacent is not supported
- collapse.aggregate is not supported (but supportable)
- not supported yet, but collapse.sort could be added
Supported parameters:
|collapse|set to on to use field collapsing|
|collapse.field|field name to collapse on (required)|
|collapse.limit|maximum number of collapsed docs to return in each collapse group|
|collapse.fl|comma- or space-delimited list of fields to return|
[jira] Updated: (SOLR-1773) Field Collapsing (lightweight version)
[ https://issues.apache.org/jira/browse/SOLR-1773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated SOLR-1773: - Attachment: SOLR-1773.patch The first draft, untested patch. Use for PoC only. In this patch, I hard-coded the sort field by using a java.util.Comparator.
[jira] Created: (SOLR-1773) Field Collapsing (lightweight version)
Field Collapsing (lightweight version) -- Key: SOLR-1773 URL: https://issues.apache.org/jira/browse/SOLR-1773 Project: Solr Issue Type: New Feature Components: search Affects Versions: 1.4 Reporter: Koji Sekiguchi Priority: Minor I'd like to start another approach for field collapsing suggested by Yonik on 19/Dec/09 at SOLR-236.
[jira] Updated: (SOLR-1268) Incorporate Lucene's FastVectorHighlighter
[ https://issues.apache.org/jira/browse/SOLR-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated SOLR-1268: - Attachment: SOLR-1268.patch The patch includes:
# eliminate the hl.useHighlighter parameter
# introduce the hl.useFastVectorHighlighter parameter; the default is false
Therefore, the Highlighter will be used unless hl.useFastVectorHighlighter is set to true. I'll commit in a few days.
> Incorporate Lucene's FastVectorHighlighter
> Key: SOLR-1268
> URL: https://issues.apache.org/jira/browse/SOLR-1268
> Project: Solr
> Issue Type: New Feature
> Components: highlighter
> Reporter: Koji Sekiguchi
> Assignee: Koji Sekiguchi
> Priority: Minor
> Fix For: 1.5
> Attachments: SOLR-1268-0_fragsize.patch, SOLR-1268-0_fragsize.patch, SOLR-1268.patch, SOLR-1268.patch, SOLR-1268.patch
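Under that default a request must opt in explicitly. A hypothetical example request (host, core path, query, and field name are invented for illustration; the hl.* parameters are the ones discussed in this issue):

```
http://localhost:8983/solr/select?q=solr&hl=on&hl.fl=features&hl.useFastVectorHighlighter=true
```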
[jira] Updated: (SOLR-1268) Incorporate Lucene's FastVectorHighlighter
[ https://issues.apache.org/jira/browse/SOLR-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated SOLR-1268: - Attachment: SOLR-1268-0_fragsize.patch Hmm, FVH doesn't work appropriately when fragsize=Integer.MAX_VALUE (see test0FragSize() in the attached patch; it indicates FVH cannot produce the whole snippet when fragsize=Integer.MAX_VALUE). Now I think the (traditional) Highlighter should be the default even if the highlighting field's termVectors/termPositions/termOffsets are all true, and FVH should be used only when hl.useFastVectorHighlighter is set to true. The hl.useFastVectorHighlighter parameter accepts per-field overrides. Plus, FVH doesn't support a fragsize of 0.
> Incorporate Lucene's FastVectorHighlighter
> Key: SOLR-1268
> URL: https://issues.apache.org/jira/browse/SOLR-1268
> Project: Solr
> Issue Type: New Feature
> Components: highlighter
> Reporter: Koji Sekiguchi
> Assignee: Koji Sekiguchi
> Priority: Minor
> Fix For: 1.5
> Attachments: SOLR-1268-0_fragsize.patch, SOLR-1268-0_fragsize.patch, SOLR-1268.patch, SOLR-1268.patch
[jira] Resolved: (SOLR-1753) StatsComponent throws java.lang.NullPointerException when getting statistics for facets in distributed search
[ https://issues.apache.org/jira/browse/SOLR-1753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi resolved SOLR-1753. -- Resolution: Fixed Committed revision 906781. Thanks Janne!
> StatsComponent throws java.lang.NullPointerException when getting statistics for facets in distributed search
> Key: SOLR-1753
> URL: https://issues.apache.org/jira/browse/SOLR-1753
> Project: Solr
> Issue Type: Bug
> Affects Versions: 1.4
> Environment: Windows
> Reporter: Janne Majaranta
> Assignee: Koji Sekiguchi
> Fix For: 1.5
> Attachments: SOLR-1753.patch
>
> When using the StatsComponent with a sharded request and getting statistics over facets, a NullPointerException is thrown.
> Stacktrace:
> java.lang.NullPointerException
> at org.apache.solr.handler.component.StatsValues.accumulate(StatsValues.java:54)
> at org.apache.solr.handler.component.StatsValues.accumulate(StatsValues.java:82)
> at org.apache.solr.handler.component.StatsComponent.handleResponses(StatsComponent.java:116)
> at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:290)
> at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
> at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
> at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
> at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
> at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
> at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
> at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
> at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
> at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:849)
> at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
> at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:454)
> at java.lang.Thread.run(Unknown Source)
[jira] Commented: (SOLR-1753) StatsComponent throws java.lang.NullPointerException when getting statistics for facets in distributed search
[ https://issues.apache.org/jira/browse/SOLR-1753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12829914#action_12829914 ] Koji Sekiguchi commented on SOLR-1753: -- Patch looks good! Will commit shortly.
[jira] Updated: (SOLR-1753) StatsComponent throws java.lang.NullPointerException when getting statistics for facets in distributed search
[ https://issues.apache.org/jira/browse/SOLR-1753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated SOLR-1753: - Affects Version/s: (was: 1.5) Fix Version/s: 1.5
[jira] Assigned: (SOLR-1753) StatsComponent throws java.lang.NullPointerException when getting statistics for facets in distributed search
[ https://issues.apache.org/jira/browse/SOLR-1753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi reassigned SOLR-1753: Assignee: Koji Sekiguchi
[jira] Commented: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12829522#action_12829522 ] Koji Sekiguchi commented on SOLR-236: - The following snippet in CollapseComponent.doProcess():
{code}
DocListAndSet results = searcher.getDocListAndSet(rb.getQuery(),
    collapseResult == null ? rb.getFilters() : null,
    collapseResult.getCollapsedDocset(),
    rb.getSortSpec().getSort(), rb.getSortSpec().getOffset(),
    rb.getSortSpec().getCount(), rb.getFieldFlags());
{code}
The 2nd line implies that collapseResult may be null, but if it is null, don't we get an NPE at the 3rd line?
> Field collapsing
> Key: SOLR-236
> URL: https://issues.apache.org/jira/browse/SOLR-236
> Project: Solr
> Issue Type: New Feature
> Components: search
> Affects Versions: 1.3
> Reporter: Emmanuel Keller
> Assignee: Shalin Shekhar Mangar
> Fix For: 1.5
> Attachments: collapsing-patch-to-1.3.0-dieter.patch, collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, field-collapse-4-with-solrj.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, solr-236.patch, SOLR-236_collapsing.patch, SOLR-236_collapsing.patch
>
> This patch includes a new feature called "Field collapsing".
> "Used in order to collapse a group of results with similar value for a given field to a single entry in the result set. Site collapsing is a special case of this, where all results for a given web site is collapsed into one or two entries in the result set, typically with an associated "more documents from this site" link. See also Duplicate detection." http://www.fastsearch.com/glossary.aspx?m=48&amid=299
> The implementation adds 3 new query parameters (SolrParams):
> "collapse.field" to choose the field used to group results
> "collapse.type" normal (default value) or adjacent
> "collapse.max" to select how many continuous results are allowed before collapsing
> TODO (in progress):
> - More documentation (on source code)
> - Test cases
> Two patches:
> - "field_collapsing.patch" for current development version
> - "field_collapsing_1.1.0.patch" for Solr-1.1.0
> P.S.: Feedback and misspelling corrections are welcome ;-)
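As a hypothetical illustration of the three parameters described in the issue (the host, query, and field name are invented; the collapse.* parameters are the ones the patch defines):

```
http://localhost:8983/solr/select?q=ipod&collapse.field=site&collapse.type=normal&collapse.max=1
```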
[jira] Commented: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12828039#action_12828039 ] Koji Sekiguchi commented on SOLR-236: - A random comment: don't we need to check that collapse.field is indexed in checkCollapseField()?
{code}
protected void checkCollapseField(IndexSchema schema) {
  SchemaField schemaField = schema.getFieldOrNull(collapseField);
  if (schemaField == null) {
    throw new RuntimeException("Could not collapse, because collapse field does not exist in the schema.");
  }
  if (schemaField.multiValued()) {
    throw new RuntimeException("Could not collapse, because collapse field is multivalued");
  }
  if (schemaField.getType().isTokenized()) {
    throw new RuntimeException("Could not collapse, because collapse field is tokenized");
  }
}
{code}
When I accidentally specified an unindexed field for collapse.field, I got an unexpected result without any errors.
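The suggested extra check can be sketched in self-contained form. This is an illustration only: `FieldInfo` is an invented stand-in for Solr's SchemaField (which exposes indexed() and multiValued(), and whose type can be tokenized), and the "not indexed" message is hypothetical, not taken from the patch.

```java
// Self-contained sketch of the suggested fix: reject an unindexed collapse
// field up front instead of silently producing an unexpected result.
public class CollapseFieldCheck {
    // Stand-in for the relevant SchemaField properties; not the real Solr class.
    record FieldInfo(boolean indexed, boolean multiValued, boolean tokenized) {}

    static void checkCollapseField(FieldInfo field) {
        if (field == null)
            throw new RuntimeException("Could not collapse, because collapse field does not exist in the schema.");
        if (!field.indexed())   // the proposed additional check
            throw new RuntimeException("Could not collapse, because collapse field is not indexed.");
        if (field.multiValued())
            throw new RuntimeException("Could not collapse, because collapse field is multivalued");
        if (field.tokenized())
            throw new RuntimeException("Could not collapse, because collapse field is tokenized");
    }

    public static void main(String[] args) {
        checkCollapseField(new FieldInfo(true, false, false)); // valid field: passes silently
        try {
            checkCollapseField(new FieldInfo(false, false, false)); // unindexed: now fails fast
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```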
[jira] Updated: (SOLR-1268) Incorporate Lucene's FastVectorHighlighter
[ https://issues.apache.org/jira/browse/SOLR-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated SOLR-1268: - Attachment: SOLR-1268-0_fragsize.patch {quote} I have noticed an exception is thrown when using fragSize = 0 (which should return the whole field highlighted): "fragCharSize(0) is too small. It must be 18 or higher. java.lang.IllegalArgumentException: fragCharSize(0) is too small. It must be 18 or higher" {quote} Thanks, Marc. Solr 1.4 uses NullFragmenter, which highlights the whole content when you set fragsize to 0, but FVH has no such feature because it uses a different algorithm. In the attached patch, Solr sets fragsize to Integer.MAX_VALUE if the user tries to set 0 when FVH is used. This prevents the runtime error. I think it is necessary at the Solr level because Solr automatically switches to FVH when the highlighting field has termVectors/termPositions/termOffsets all set to true, unless hl.useHighlighter is set to true. > Incorporate Lucene's FastVectorHighlighter > -- > > Key: SOLR-1268 > URL: https://issues.apache.org/jira/browse/SOLR-1268 > Project: Solr > Issue Type: New Feature > Components: highlighter >Reporter: Koji Sekiguchi >Assignee: Koji Sekiguchi >Priority: Minor > Fix For: 1.5 > > Attachments: SOLR-1268-0_fragsize.patch, SOLR-1268.patch, > SOLR-1268.patch > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
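The clamping behavior described in the comment above can be sketched as follows. This is an illustration, not the actual patch code; the class and method names here are invented.

```java
// Sketch of the fragsize workaround: FVH rejects fragCharSize values below
// its minimum (18), so when the user asks for fragsize=0 ("whole field")
// and FVH is in use, substitute Integer.MAX_VALUE instead.
public class FragsizeClamp {
    static int effectiveFragsize(int requestedFragsize, boolean useFastVectorHighlighter) {
        if (useFastVectorHighlighter && requestedFragsize == 0) {
            return Integer.MAX_VALUE; // avoids FVH's IllegalArgumentException
        }
        return requestedFragsize; // standard Highlighter handles 0 via NullFragmenter
    }
}
```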
[jira] Commented: (SOLR-1731) ArrayIndexOutOfBoundsException when highlighting
[ https://issues.apache.org/jira/browse/SOLR-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12804087#action_12804087 ] Koji Sekiguchi commented on SOLR-1731: -- So why don't you uni-gram on both index and query for sku field? {code} {code} {quote} As far as my application cares, those are all equivalent and should just be indexed as: a1280c {quote} To eliminate space/period/hyphen, mapping.txt would look like: {code} " " => "" "." => "" "-" => "" {code} > ArrayIndexOutOfBoundsException when highlighting > > > Key: SOLR-1731 > URL: https://issues.apache.org/jira/browse/SOLR-1731 > Project: Solr > Issue Type: Bug > Components: highlighter >Affects Versions: 1.4 >Reporter: Tim Underwood >Priority: Minor > > I'm seeing an java.lang.ArrayIndexOutOfBoundsException when trying to > highlight for certain queries. The error seems to be an issue with the > combination of the ShingleFilterFactory, PositionFilterFactory and the > LengthFilterFactory. > Here's my fieldType definition: > omitNorms="true"> > > > generateNumberParts="0" catenateWords="0" catenateNumbers="0" > catenateAll="1"/> > > > > > > >outputUnigrams="true"/> > >generateNumberParts="0" catenateWords="0" catenateNumbers="0" > catenateAll="1"/> > > > > > > Here's the field definition: > omitNorms="true"/> > Here's a sample doc: > > > 1 > A 1280 C > > > Doing a query for sku_new:"A 1280 C" and requesting highlighting throws the > exception (full stack trace below): > http://localhost:8983/solr/select/?q=sku_new%3A%22A+1280+C%22&version=2.2&start=0&rows=10&indent=on&&hl=on&hl.fl=sku_new&fl=* > If I comment out the LengthFilterFactory from my query analyzer section > everything seems to work. Commenting out just the PositionFilterFactory also > makes the exception go away and seems to work for this specific query. 
> Full stack trace: > java.lang.ArrayIndexOutOfBoundsException: -1 > at > org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extract(WeightedSpanTermExtractor.java:202) > at > org.apache.lucene.search.highlight.WeightedSpanTermExtractor.getWeightedSpanTerms(WeightedSpanTermExtractor.java:414) > at > org.apache.lucene.search.highlight.QueryScorer.initExtractor(QueryScorer.java:216) > at > org.apache.lucene.search.highlight.QueryScorer.init(QueryScorer.java:184) > at > org.apache.lucene.search.highlight.Highlighter.getBestTextFragments(Highlighter.java:226) > at > org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:335) > at > org.apache.solr.handler.component.HighlightComponent.process(HighlightComponent.java:89) > at > org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195) > at > org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131) > at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316) > at > org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338) > at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241) > at > org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089) > at > org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365) > at > org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) > at > org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181) > at > org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712) > at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405) > at > org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211) > at > org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114) > at > org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139) > at 
org.mortbay.jetty.Server.handle(Server.java:285) > at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502) > at > org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821) > at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513) > at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208) > at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378) > at > org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226) > at > org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
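The mapping.txt rules suggested above (map space, period, and hyphen to the empty string) can be mimicked in plain Java to show the intended effect on the SKU values from the issue. This is a sketch of the normalization only; the lowercasing is an assumption (it would come from a LowerCaseFilter elsewhere in the analysis chain, which is not fully shown in the archived config).

```java
public class SkuNormalize {
    // Mimics the MappingCharFilter rules " "=>"" , "."=>"" , "-"=>"" ,
    // plus an assumed lowercasing step, so that "A 1280 C", "A-1280-C",
    // and "A.1280.C" all normalize to the same indexed form.
    static String normalize(String sku) {
        return sku.replace(" ", "").replace(".", "").replace("-", "").toLowerCase();
    }
}
```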
[jira] Commented: (SOLR-1731) ArrayIndexOutOfBoundsException when highlighting
[ https://issues.apache.org/jira/browse/SOLR-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12803976#action_12803976 ] Koji Sekiguchi commented on SOLR-1731: -- Can't you use WhitespaceTokenizer for index? > ArrayIndexOutOfBoundsException when highlighting > > > Key: SOLR-1731 > URL: https://issues.apache.org/jira/browse/SOLR-1731 > Project: Solr > Issue Type: Bug > Components: highlighter >Affects Versions: 1.4 >Reporter: Tim Underwood >Priority: Minor > > I'm seeing an java.lang.ArrayIndexOutOfBoundsException when trying to > highlight for certain queries. The error seems to be an issue with the > combination of the ShingleFilterFactory, PositionFilterFactory and the > LengthFilterFactory. > Here's my fieldType definition: > omitNorms="true"> > > > generateNumberParts="0" catenateWords="0" catenateNumbers="0" > catenateAll="1"/> > > > > > > >outputUnigrams="true"/> > >generateNumberParts="0" catenateWords="0" catenateNumbers="0" > catenateAll="1"/> > > > > > > Here's the field definition: > omitNorms="true"/> > Here's a sample doc: > > > 1 > A 1280 C > > > Doing a query for sku_new:"A 1280 C" and requesting highlighting throws the > exception (full stack trace below): > http://localhost:8983/solr/select/?q=sku_new%3A%22A+1280+C%22&version=2.2&start=0&rows=10&indent=on&&hl=on&hl.fl=sku_new&fl=* > If I comment out the LengthFilterFactory from my query analyzer section > everything seems to work. Commenting out just the PositionFilterFactory also > makes the exception go away and seems to work for this specific query. 
> Full stack trace: > java.lang.ArrayIndexOutOfBoundsException: -1 > at > org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extract(WeightedSpanTermExtractor.java:202) > at > org.apache.lucene.search.highlight.WeightedSpanTermExtractor.getWeightedSpanTerms(WeightedSpanTermExtractor.java:414) > at > org.apache.lucene.search.highlight.QueryScorer.initExtractor(QueryScorer.java:216) > at > org.apache.lucene.search.highlight.QueryScorer.init(QueryScorer.java:184) > at > org.apache.lucene.search.highlight.Highlighter.getBestTextFragments(Highlighter.java:226) > at > org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:335) > at > org.apache.solr.handler.component.HighlightComponent.process(HighlightComponent.java:89) > at > org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195) > at > org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131) > at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316) > at > org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338) > at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241) > at > org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089) > at > org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365) > at > org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) > at > org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181) > at > org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712) > at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405) > at > org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211) > at > org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114) > at > org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139) > at 
org.mortbay.jetty.Server.handle(Server.java:285) > at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502) > at > org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821) > at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513) > at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208) > at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378) > at > org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226) > at > org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory
[ https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12802014#action_12802014 ] Koji Sekiguchi commented on SOLR-1725: -- I like the idea, Uri. I haven't looked into the patch yet; does it depend on Java 6? I think Solr supports Java 5. There is ScriptTransformer in DIH, which uses javax.script, but it looks like ScriptEngineManager is loaded at runtime. > Script based UpdateRequestProcessorFactory > -- > > Key: SOLR-1725 > URL: https://issues.apache.org/jira/browse/SOLR-1725 > Project: Solr > Issue Type: New Feature > Components: update >Affects Versions: 1.4 >Reporter: Uri Boness > Attachments: SOLR-1725.patch, SOLR-1725.patch > > > A script based UpdateRequestProcessorFactory (Uses JDK6 script engine > support). The main goal of this plugin is to be able to configure/write > update processors without the need to write and package Java code. > The update request processor factory enables writing update processors in > scripts located in {{solr.solr.home}} directory. The factory accepts one > (mandatory) configuration parameter named {{scripts}} which accepts a > comma-separated list of file names. It will look for these files under the > {{conf}} directory in solr home. When multiple scripts are defined, their > execution order is defined by the lexicographical order of the script file > name (so {{scriptA.js}} will be executed before {{scriptB.js}}). > The script language is resolved based on the script file extension (that is, > a *.js file will be treated as a JavaScript script), therefore an extension > is mandatory. > Each script file is expected to have one or more methods with the same > signature as the methods in the {{UpdateRequestProcessor}} interface. It is > *not* required to define all methods, only those that are required by the > processing logic. 
> The following variables are defined as global variables for each script: > * {{req}} - The SolrQueryRequest > * {{rsp}} - The SolrQueryResponse > * {{logger}} - A logger that can be used for logging purposes in the script -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
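Koji's Java 5 compatibility point can be illustrated with a small sketch. Loading ScriptEngineManager reflectively (the approach he attributes to DIH's ScriptTransformer) means the code still compiles and loads on Java 5, where javax.script does not exist; the class name ScriptSupport and the helper method are invented for illustration.

```java
public class ScriptSupport {
    // Returns true when the JDK 6 scripting API (javax.script) is present.
    // Resolving the class reflectively defers the dependency to runtime,
    // so a Java 5 JVM simply reports "unavailable" instead of failing to load.
    static boolean scriptingAvailable() {
        try {
            Class.forName("javax.script.ScriptEngineManager");
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }
}
```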
[jira] Updated: (SOLR-1696) Deprecate old syntax and move configuration to HighlightComponent
[ https://issues.apache.org/jira/browse/SOLR-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated SOLR-1696: - Attachment: SOLR-1696.patch A new patch attached. It just syncs with trunk and adds a warning log when the deprecated syntax is found (the idea Chris mentioned above). > Deprecate old syntax and move configuration to > HighlightComponent > > > Key: SOLR-1696 > URL: https://issues.apache.org/jira/browse/SOLR-1696 > Project: Solr > Issue Type: Improvement > Components: highlighter >Reporter: Noble Paul > Fix For: 1.5 > > Attachments: SOLR-1696.patch, SOLR-1696.patch > > > There is no reason why we should have a custom syntax for highlighter > configuration. > It can be treated like any other SearchComponent and all the configuration > can go in there. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1696) Deprecate old syntax and move configuration to HighlightComponent
[ https://issues.apache.org/jira/browse/SOLR-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798312#action_12798312 ] Koji Sekiguchi commented on SOLR-1696: -- I've just committed SOLR-1268. Now I'm trying to contribute a patch for this to sync with trunk... > Deprecate old syntax and move configuration to > HighlightComponent > > > Key: SOLR-1696 > URL: https://issues.apache.org/jira/browse/SOLR-1696 > Project: Solr > Issue Type: Improvement > Components: highlighter >Reporter: Noble Paul > Fix For: 1.5 > > Attachments: SOLR-1696.patch > > > There is no reason why we should have a custom syntax for highlighter > configuration. > It can be treated like any other SearchComponent and all the configuration > can go in there. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (SOLR-1268) Incorporate Lucene's FastVectorHighlighter
[ https://issues.apache.org/jira/browse/SOLR-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi resolved SOLR-1268. -- Resolution: Fixed Committed revision 897383. > Incorporate Lucene's FastVectorHighlighter > -- > > Key: SOLR-1268 > URL: https://issues.apache.org/jira/browse/SOLR-1268 > Project: Solr > Issue Type: New Feature > Components: highlighter >Reporter: Koji Sekiguchi >Assignee: Koji Sekiguchi >Priority: Minor > Fix For: 1.5 > > Attachments: SOLR-1268.patch, SOLR-1268.patch > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1653) add PatternReplaceCharFilter
[ https://issues.apache.org/jira/browse/SOLR-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798271#action_12798271 ] Koji Sekiguchi commented on SOLR-1653: -- Thanks, Paul! I've just committed revision 897357. > add PatternReplaceCharFilter > > > Key: SOLR-1653 > URL: https://issues.apache.org/jira/browse/SOLR-1653 > Project: Solr > Issue Type: New Feature > Components: Schema and Analysis >Affects Versions: 1.4 >Reporter: Koji Sekiguchi >Assignee: Koji Sekiguchi >Priority: Minor > Fix For: 1.5 > > Attachments: SOLR-1653.patch, SOLR-1653.patch > > > Add a new CharFilter that uses a regular expression for the target of replace > string in char stream. > Usage: > {code:title=schema.xml} > positionIncrementGap="100" > > > groupedPattern="([nN][oO]\.)\s*(\d+)" > replaceGroups="1,2" blockDelimiters=":;"/> > mapping="mapping-ISOLatin1Accent.txt"/> > > > > {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1696) Deprecate old syntax and move configuration to HighlightComponent
[ https://issues.apache.org/jira/browse/SOLR-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797841#action_12797841 ] Koji Sekiguchi commented on SOLR-1696: -- Noble, thank you for opening this and attaching the patch! Are you planning to commit this shortly? I ask because I'm ready to commit SOLR-1268, which uses the old-style config. If you commit this first, I'll rewrite SOLR-1268; alternatively, I can assign SOLR-1696 to myself. > Deprecate old syntax and move configuration to > HighlightComponent > > > Key: SOLR-1696 > URL: https://issues.apache.org/jira/browse/SOLR-1696 > Project: Solr > Issue Type: Improvement > Components: highlighter >Reporter: Noble Paul > Fix For: 1.5 > > Attachments: SOLR-1696.patch > > > There is no reason why we should have a custom syntax for highlighter > configuration. > It can be treated like any other SearchComponent and all the configuration > can go in there. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1268) Incorporate Lucene's FastVectorHighlighter
[ https://issues.apache.org/jira/browse/SOLR-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12796147#action_12796147 ] Koji Sekiguchi commented on SOLR-1268: -- I'll commit in a few days if nobody objects. > Incorporate Lucene's FastVectorHighlighter > -- > > Key: SOLR-1268 > URL: https://issues.apache.org/jira/browse/SOLR-1268 > Project: Solr > Issue Type: New Feature > Components: highlighter >Reporter: Koji Sekiguchi >Assignee: Koji Sekiguchi >Priority: Minor > Fix For: 1.5 > > Attachments: SOLR-1268.patch, SOLR-1268.patch > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1268) Incorporate Lucene's FastVectorHighlighter
[ https://issues.apache.org/jira/browse/SOLR-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12796075#action_12796075 ] Koji Sekiguchi commented on SOLR-1268: -- I'm introducing new sub tags of in solrconfig.xml in this patch, rather than . I think we can open a separate ticket for moving settings to , if needed. FYI: http://old.nabble.com/highlighting-setting-in-solrconfig.xml-td26984003.html > Incorporate Lucene's FastVectorHighlighter > -- > > Key: SOLR-1268 > URL: https://issues.apache.org/jira/browse/SOLR-1268 > Project: Solr > Issue Type: New Feature > Components: highlighter >Reporter: Koji Sekiguchi >Assignee: Koji Sekiguchi >Priority: Minor > Fix For: 1.5 > > Attachments: SOLR-1268.patch, SOLR-1268.patch > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1268) Incorporate Lucene's FastVectorHighlighter
[ https://issues.apache.org/jira/browse/SOLR-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated SOLR-1268: - Attachment: SOLR-1268.patch Added a few SolrFragmentsBuilders and test cases. > Incorporate Lucene's FastVectorHighlighter > -- > > Key: SOLR-1268 > URL: https://issues.apache.org/jira/browse/SOLR-1268 > Project: Solr > Issue Type: New Feature > Components: highlighter >Reporter: Koji Sekiguchi >Assignee: Koji Sekiguchi >Priority: Minor > Fix For: 1.5 > > Attachments: SOLR-1268.patch, SOLR-1268.patch > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1268) Incorporate Lucene's FastVectorHighlighter
[ https://issues.apache.org/jira/browse/SOLR-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated SOLR-1268: - Attachment: SOLR-1268.patch First draft, untested patch attached. > Incorporate Lucene's FastVectorHighlighter > -- > > Key: SOLR-1268 > URL: https://issues.apache.org/jira/browse/SOLR-1268 > Project: Solr > Issue Type: New Feature > Components: highlighter >Reporter: Koji Sekiguchi >Assignee: Koji Sekiguchi >Priority: Minor > Fix For: 1.5 > > Attachments: SOLR-1268.patch > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1670) synonymfilter/map repeat bug
[ https://issues.apache.org/jira/browse/SOLR-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792928#action_12792928 ] Koji Sekiguchi commented on SOLR-1670: -- Robert, sorry, I wanted to say I agree with you regarding "the test for 'repeats' has a flaw". Then "boost TF" was just an input, though I don't know whether it is an intentional feature or a side effect. Why don't you fix the flaws in the SynonymFilter test in this ticket first, then fix SOLR-1674? (I haven't looked into SOLR-1674 yet.) > synonymfilter/map repeat bug > > > Key: SOLR-1670 > URL: https://issues.apache.org/jira/browse/SOLR-1670 > Project: Solr > Issue Type: Bug > Components: Schema and Analysis >Affects Versions: 1.4 >Reporter: Robert Muir > Attachments: SOLR-1670_test.patch > > > as part of converting tests for SOLR-1657, I ran into a problem with > synonymfilter > the test for 'repeats' has a flaw, it uses this assertTokEqual construct > which does not really validate that two lists of tokens are equal, it just > stops at the shorter one. > {code} > // repeats > map.add(strings("a b"), tokens("ab"), orig, merge); > map.add(strings("a b"), tokens("ab"), orig, merge); > assertTokEqual(getTokList(map,"a b",false), tokens("ab")); > /* in reality the result from getTokList is ab ab ab! */ > {code} > when converted to assertTokenStreamContents this problem surfaced. attached > is an additional assertion to the existing testcase. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1670) synonymfilter/map repeat bug
[ https://issues.apache.org/jira/browse/SOLR-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792920#action_12792920 ] Koji Sekiguchi commented on SOLR-1670: -- bq. the test for 'repeats' has a flaw, it uses this assertTokEqual construct which does not really validate that two lists of token are equal, it just stops at the shorted one. I agree with you regarding this part. But I'm not sure that the following size() should be 1 in your patch: {code} +assertEquals(1, getTokList(map,"a b",false).size()); {code} If what "repeats" implies is intentionally repeating the same term, I think it can boost tf. > synonymfilter/map repeat bug > > > Key: SOLR-1670 > URL: https://issues.apache.org/jira/browse/SOLR-1670 > Project: Solr > Issue Type: Bug > Components: Schema and Analysis >Affects Versions: 1.4 >Reporter: Robert Muir > Attachments: SOLR-1670_test.patch > > > as part of converting tests for SOLR-1657, I ran into a problem with > synonymfilter > the test for 'repeats' has a flaw, it uses this assertTokEqual construct > which does not really validate that two lists of tokens are equal, it just > stops at the shorter one. > {code} > // repeats > map.add(strings("a b"), tokens("ab"), orig, merge); > map.add(strings("a b"), tokens("ab"), orig, merge); > assertTokEqual(getTokList(map,"a b",false), tokens("ab")); > /* in reality the result from getTokList is ab ab ab! */ > {code} > when converted to assertTokenStreamContents this problem surfaced. attached > is an additional assertion to the existing testcase. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
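The flaw Robert describes in assertTokEqual (stopping at the shorter list) can be demonstrated with a self-contained sketch. These are invented helper names, not the actual test code from the patch; they only reproduce the comparison behavior being discussed.

```java
import java.util.List;

public class TokListAssert {
    // The flawed behavior: compares only up to the length of the shorter
    // list, so a repeated result like [ab, ab, ab] "equals" [ab].
    static boolean tokEqualLenient(List<String> a, List<String> b) {
        int n = Math.min(a.size(), b.size());
        return a.subList(0, n).equals(b.subList(0, n));
    }

    // The strict replacement: sizes must match too, so the repeat bug
    // surfaces as a failure instead of passing silently.
    static boolean tokEqualStrict(List<String> actual, List<String> expected) {
        return actual.size() == expected.size() && actual.equals(expected);
    }
}
```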
[jira] Resolved: (SOLR-1653) add PatternReplaceCharFilter
[ https://issues.apache.org/jira/browse/SOLR-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi resolved SOLR-1653. -- Resolution: Fixed Committed revision 890798. Thanks Shalin and Noble for taking time to review the patch. > add PatternReplaceCharFilter > > > Key: SOLR-1653 > URL: https://issues.apache.org/jira/browse/SOLR-1653 > Project: Solr > Issue Type: New Feature > Components: Schema and Analysis >Affects Versions: 1.4 >Reporter: Koji Sekiguchi >Assignee: Koji Sekiguchi >Priority: Minor > Fix For: 1.5 > > Attachments: SOLR-1653.patch, SOLR-1653.patch > > > Add a new CharFilter that uses a regular expression for the target of replace > string in char stream. > Usage: > {code:title=schema.xml} > positionIncrementGap="100" > > > groupedPattern="([nN][oO]\.)\s*(\d+)" > replaceGroups="1,2" blockDelimiters=":;"/> > mapping="mapping-ISOLatin1Accent.txt"/> > > > > {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1653) add PatternReplaceCharFilter
[ https://issues.apache.org/jira/browse/SOLR-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12790572#action_12790572 ] Koji Sekiguchi commented on SOLR-1653: -- I see that the existing "PatternReplaceFilter" (not CharFilter) uses "pattern". But it uses "replacement", not "replaceWith". I think I'll use "pattern" and "replacement". > add PatternReplaceCharFilter > > > Key: SOLR-1653 > URL: https://issues.apache.org/jira/browse/SOLR-1653 > Project: Solr > Issue Type: New Feature > Components: Schema and Analysis >Affects Versions: 1.4 >Reporter: Koji Sekiguchi >Assignee: Koji Sekiguchi >Priority: Minor > Fix For: 1.5 > > Attachments: SOLR-1653.patch, SOLR-1653.patch > > > Add a new CharFilter that uses a regular expression for the target of replace > string in char stream. > Usage: > {code:title=schema.xml} > positionIncrementGap="100" > > > groupedPattern="([nN][oO]\.)\s*(\d+)" > replaceGroups="1,2" blockDelimiters=":;"/> > mapping="mapping-ISOLatin1Accent.txt"/> > > > > {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1653) add PatternReplaceCharFilter
[ https://issues.apache.org/jira/browse/SOLR-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated SOLR-1653: - Attachment: SOLR-1653.patch My apologies: because I tried to correct the offset per group in a match when I started the first patch, I introduced my own syntax. But, yes, now I've implemented the offset correction per match, so I can use the standard syntax. Here is the new patch. Usage: {code:title=schema.xml} {code} If there are no objections, I'll commit later today. > add PatternReplaceCharFilter > > > Key: SOLR-1653 > URL: https://issues.apache.org/jira/browse/SOLR-1653 > Project: Solr > Issue Type: New Feature > Components: Schema and Analysis >Affects Versions: 1.4 >Reporter: Koji Sekiguchi >Assignee: Koji Sekiguchi >Priority: Minor > Fix For: 1.5 > > Attachments: SOLR-1653.patch, SOLR-1653.patch > > > Add a new CharFilter that uses a regular expression for the target of replace > string in char stream. > Usage: > {code:title=schema.xml} > positionIncrementGap="100" > > > groupedPattern="([nN][oO]\.)\s*(\d+)" > replaceGroups="1,2" blockDelimiters=":;"/> > mapping="mapping-ISOLatin1Accent.txt"/> > > > > {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1653) add PatternReplaceCharFilter
[ https://issues.apache.org/jira/browse/SOLR-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12790127#action_12790127 ] Koji Sekiguchi commented on SOLR-1653: -- bq. I guess this can be achieved with the matcher#replaceAll() directly You're right, if we don't correct offsets in the output char stream. I need to process one match at a time. > add PatternReplaceCharFilter > > > Key: SOLR-1653 > URL: https://issues.apache.org/jira/browse/SOLR-1653 > Project: Solr > Issue Type: New Feature > Components: Schema and Analysis >Affects Versions: 1.4 >Reporter: Koji Sekiguchi >Assignee: Koji Sekiguchi >Priority: Minor > Fix For: 1.5 > > Attachments: SOLR-1653.patch > > > Add a new CharFilter that uses a regular expression for the target of replace > string in char stream. > Usage: > {code:title=schema.xml} > positionIncrementGap="100" > > > groupedPattern="([nN][oO]\.)\s*(\d+)" > replaceGroups="1,2" blockDelimiters=":;"/> > mapping="mapping-ISOLatin1Accent.txt"/> > > > > {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
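The "one match at a time" point above can be sketched with Java's Matcher.appendReplacement loop. This is not the SOLR-1653 implementation itself (the names PerMatchReplace and replaceEach are invented, and the offset bookkeeping is omitted); it only shows why a per-match loop, unlike a single matcher.replaceAll() call, gives a CharFilter a hook at each match where an offset correction could be recorded.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PerMatchReplace {
    static String replaceEach(String input, String pattern, String replacement) {
        Matcher m = Pattern.compile(pattern).matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // A real CharFilter would record an offset correction here,
            // e.g. the match start and the length delta of the replacement.
            m.appendReplacement(sb, replacement);
        }
        m.appendTail(sb); // copy the remainder after the last match
        return sb.toString();
    }
}
```

Using the "remove ing" example from the comparison table elsewhere in this thread, `replaceEach("see-ing looking", "(\\w+)(ing)", "$1")` yields "see-ing look".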
[jira] Issue Comment Edited: (SOLR-1653) add PatternReplaceCharFilter
[ https://issues.apache.org/jira/browse/SOLR-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12790056#action_12790056 ] Koji Sekiguchi edited comment on SOLR-1653 at 12/14/09 9:30 AM: Ok. I'll show you same samples ;-) ||INPUT||groupedPattern||replaceGroups||OUTPUT||comment|| |see-ing looking|(\w+)(ing)|1|see-ing look|remove "ing" from the end of word| |see-ing looking|(\w+)ing|1|see-ing look|same as above. 2nd parentheses can be omitted| |No.1 NO. no. 543|[nN][oO]\.\s*(\d+)|{#},1|#1 NO. #543|sample for literal. do not forget to set blockDelimiters other than period when you use period in groupedPattern| |abc=1234=5678|(\w+)=(\d+)=(\d+)|3,{=},1,{=},2|5678=abc=1234|change the order of the groups| was (Author: koji): Ok. I'll show you same samples ;-) ||INPUT||groupedPattern||replaceGroups||OUTPUT||comment|| |see-ing looking|(\w+)(ing)|1|see-ing look|remove "ing" from the end of word| |see-ing looking|(\w+)ing|1|see-ing look|same as above. 2nd parentheses can be omitted| |No.1 NO. no. 543|[nN][oO]\.\s*(\d+)|{#},1|#1 NO. #543|sample for literal. do not forget to set blockDelimiters other than period when you use period in groupedPattern| |abc-1234-5678|(\w+)=(\d+)=(\d+)|3,{=},1,{=},2|5678=abc=1234|change the order of the groups| > add PatternReplaceCharFilter > > > Key: SOLR-1653 > URL: https://issues.apache.org/jira/browse/SOLR-1653 > Project: Solr > Issue Type: New Feature > Components: Schema and Analysis >Affects Versions: 1.4 >Reporter: Koji Sekiguchi >Assignee: Koji Sekiguchi >Priority: Minor > Fix For: 1.5 > > Attachments: SOLR-1653.patch > > > Add a new CharFilter that uses a regular expression for the target of replace > string in char stream. > Usage: > {code:title=schema.xml} > positionIncrementGap="100" > > > groupedPattern="([nN][oO]\.)\s*(\d+)" > replaceGroups="1,2" blockDelimiters=":;"/> > mapping="mapping-ISOLatin1Accent.txt"/> > > > > {code} -- This message is automatically generated by JIRA. 
- You can reply to this email to add a comment to the issue online.
[jira] Issue Comment Edited: (SOLR-1653) add PatternReplaceCharFilter
[ https://issues.apache.org/jira/browse/SOLR-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12790056#action_12790056 ] Koji Sekiguchi edited comment on SOLR-1653 at 12/14/09 9:28 AM: Ok. I'll show you same samples ;-) ||INPUT||groupedPattern||replaceGroups||OUTPUT||comment|| |see-ing looking|(\w+)(ing)|1|see-ing look|remove "ing" from the end of word| |see-ing looking|(\w+)ing|1|see-ing look|same as above. 2nd parentheses can be omitted| |No.1 NO. no. 543|[nN][oO]\.\s*(\d+)|{#},1|#1 NO. #543|sample for literal. do not forget to set blockDelimiters other than period when you use period in groupedPattern| |abc-1234-5678|(\w+)=(\d+)=(\d+)|3,{=},1,{=},2|5678-abc-1234|change the order of the groups| was (Author: koji): Ok. I'll show you same samples ;-) ||INPUT||groupedPattern||replaceGroups||OUTPUT||comment|| |see-ing looking|(\w+)(ing)|1|see-ing look|remove "ing" from the end of word| |see-ing looking|(\w+)ing|1|see-ing look|same as above. 2nd parentheses can be omitted| |No.1 NO. no. 543|[nN][oO]\.\s*(\d+)|{#},1|#1 NO. #543|sample for literal. do not forget to set blockDelimiters other than period when you use period in groupedPattern| |abc-1234-5678|(\w+)--(\d+)--(\d+)|3,{--},1,{--},2|5678-abc-1234|change the order of the groups| > add PatternReplaceCharFilter > > > Key: SOLR-1653 > URL: https://issues.apache.org/jira/browse/SOLR-1653 > Project: Solr > Issue Type: New Feature > Components: Schema and Analysis >Affects Versions: 1.4 >Reporter: Koji Sekiguchi >Assignee: Koji Sekiguchi >Priority: Minor > Fix For: 1.5 > > Attachments: SOLR-1653.patch > > > Add a new CharFilter that uses a regular expression for the target of replace > string in char stream. > Usage: > {code:title=schema.xml} > positionIncrementGap="100" > > > groupedPattern="([nN][oO]\.)\s*(\d+)" > replaceGroups="1,2" blockDelimiters=":;"/> > mapping="mapping-ISOLatin1Accent.txt"/> > > > > {code} -- This message is automatically generated by JIRA. 
- You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1653) add PatternReplaceCharFilter
[ https://issues.apache.org/jira/browse/SOLR-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12790056#action_12790056 ] Koji Sekiguchi commented on SOLR-1653: -- Ok. I'll show you same samples ;-) ||INPUT||groupedPattern||replaceGroups||OUTPUT||comment|| |see-ing looking|(\w+)(ing)|1|see-ing look|remove "ing" from the end of word| |see-ing looking|(\w+)ing|1|see-ing look|same as above. 2nd parentheses can be omitted| |No.1 NO. no. 543|[nN][oO]\.\s*(\d+)|{#},1|#1 NO. #543|sample for literal. do not forget to set blockDelimiters other than period when you use period in groupedPattern| |abc-1234-5678|(\w+)-(\d+)-(\d+)|3,{-},1,{-},2|5678-abc-1234|change the order of the groups| > add PatternReplaceCharFilter > > > Key: SOLR-1653 > URL: https://issues.apache.org/jira/browse/SOLR-1653 > Project: Solr > Issue Type: New Feature > Components: Schema and Analysis >Affects Versions: 1.4 >Reporter: Koji Sekiguchi >Assignee: Koji Sekiguchi >Priority: Minor > Fix For: 1.5 > > Attachments: SOLR-1653.patch > > > Add a new CharFilter that uses a regular expression for the target of replace > string in char stream. > Usage: > {code:title=schema.xml} > positionIncrementGap="100" > > > groupedPattern="([nN][oO]\.)\s*(\d+)" > replaceGroups="1,2" blockDelimiters=":;"/> > mapping="mapping-ISOLatin1Accent.txt"/> > > > > {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
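The groupedPattern/replaceGroups semantics shown in the table above can be sketched in Python. This is an illustrative re-implementation only, not Solr's actual Java PatternReplaceCharFilter; the helper name pattern_replace is invented, and the blockDelimiters option is omitted for brevity.

```python
import re

def pattern_replace(text, grouped_pattern, replace_groups):
    # replace_groups is a comma-separated sequence of numbered capture
    # groups and {literal} blocks, as in the examples above,
    # e.g. "3,{-},1,{-},2" or "{#},1".
    parts = replace_groups.split(",")

    def repl(match):
        out = []
        for part in parts:
            if part.startswith("{") and part.endswith("}"):
                out.append(part[1:-1])               # literal block
            else:
                out.append(match.group(int(part)))   # capture group
        return "".join(out)

    return re.sub(grouped_pattern, repl, text)

# Rows from the table above:
# "see-ing looking" -> "see-ing look"  (strip trailing "ing")
# "abc-1234-5678"   -> "5678-abc-1234" (reorder the groups)
```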
[jira] Commented: (SOLR-1653) add PatternReplaceCharFilter
[ https://issues.apache.org/jira/browse/SOLR-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789957#action_12789957 ] Koji Sekiguchi commented on SOLR-1653: -- I'll commit in a few days. > add PatternReplaceCharFilter > > > Key: SOLR-1653 > URL: https://issues.apache.org/jira/browse/SOLR-1653 > Project: Solr > Issue Type: New Feature > Components: Schema and Analysis >Affects Versions: 1.4 >Reporter: Koji Sekiguchi >Assignee: Koji Sekiguchi >Priority: Minor > Fix For: 1.5 > > Attachments: SOLR-1653.patch > > > Add a new CharFilter that uses a regular expression for the target of replace > string in char stream. > Usage: > {code:title=schema.xml} > positionIncrementGap="100" > > > groupedPattern="([nN][oO]\.)\s*(\d+)" > replaceGroups="1,2" blockDelimiters=":;"/> > mapping="mapping-ISOLatin1Accent.txt"/> > > > > {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (SOLR-1653) add PatternReplaceCharFilter
[ https://issues.apache.org/jira/browse/SOLR-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi reassigned SOLR-1653: Assignee: Koji Sekiguchi > add PatternReplaceCharFilter > > > Key: SOLR-1653 > URL: https://issues.apache.org/jira/browse/SOLR-1653 > Project: Solr > Issue Type: New Feature > Components: Schema and Analysis >Affects Versions: 1.4 >Reporter: Koji Sekiguchi >Assignee: Koji Sekiguchi >Priority: Minor > Fix For: 1.5 > > Attachments: SOLR-1653.patch > > > Add a new CharFilter that uses a regular expression for the target of replace > string in char stream. > Usage: > {code:title=schema.xml} > positionIncrementGap="100" > > > groupedPattern="([nN][oO]\.)\s*(\d+)" > replaceGroups="1,2" blockDelimiters=":;"/> > mapping="mapping-ISOLatin1Accent.txt"/> > > > > {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1653) add PatternReplaceCharFilter
[ https://issues.apache.org/jira/browse/SOLR-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated SOLR-1653: - Attachment: SOLR-1653.patch > add PatternReplaceCharFilter > > > Key: SOLR-1653 > URL: https://issues.apache.org/jira/browse/SOLR-1653 > Project: Solr > Issue Type: New Feature > Components: Schema and Analysis >Affects Versions: 1.4 >Reporter: Koji Sekiguchi >Priority: Minor > Fix For: 1.5 > > Attachments: SOLR-1653.patch > > > Add a new CharFilter that uses a regular expression for the target of replace > string in char stream. > Usage: > {code:title=schema.xml} > positionIncrementGap="100" > > > groupedPattern="([nN][oO]\.)\s*(\d+)" > replaceGroups="1,2" blockDelimiters=":;"/> > mapping="mapping-ISOLatin1Accent.txt"/> > > > > {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (SOLR-1653) add PatternReplaceCharFilter
add PatternReplaceCharFilter Key: SOLR-1653 URL: https://issues.apache.org/jira/browse/SOLR-1653 Project: Solr Issue Type: New Feature Components: Schema and Analysis Affects Versions: 1.4 Reporter: Koji Sekiguchi Priority: Minor Fix For: 1.5 Add a new CharFilter that uses a regular expression for the target of replace string in char stream. Usage: {code:title=schema.xml} {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1606) Integrate Near Realtime
[ https://issues.apache.org/jira/browse/SOLR-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786448#action_12786448 ] Koji Sekiguchi commented on SOLR-1606: -- Jason, I got a failure when running TestRefreshReader. > Integrate Near Realtime > > > Key: SOLR-1606 > URL: https://issues.apache.org/jira/browse/SOLR-1606 > Project: Solr > Issue Type: Improvement > Components: update >Affects Versions: 1.4 >Reporter: Jason Rutherglen >Priority: Minor > Fix For: 1.5 > > Attachments: SOLR-1606.patch > > > We'll integrate IndexWriter.getReader. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (SOLR-1607) use a proper key other than IndexReader for ExternalFileField and QueryElevationComponent to work properly when reopenReaders is set to true
use a proper key other than IndexReader for ExternalFileField and QueryElevationComponent to work properly when reopenReaders is set to true Key: SOLR-1607 URL: https://issues.apache.org/jira/browse/SOLR-1607 Project: Solr Issue Type: Bug Components: search Affects Versions: 1.4 Reporter: Koji Sekiguchi Assignee: Koji Sekiguchi Priority: Minor Fix For: 1.5 With the introduction of the reopenReaders feature in 1.4, Solr no longer reloads the external_[fieldname] and elevate.xml files in dataDir when a commit is submitted. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (SOLR-1601) Schema browser does not indicate presence of charFilter
[ https://issues.apache.org/jira/browse/SOLR-1601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi resolved SOLR-1601. -- Resolution: Fixed Committed revision 884180. Thanks, Jake. > Schema browser does not indicate presence of charFilter > --- > > Key: SOLR-1601 > URL: https://issues.apache.org/jira/browse/SOLR-1601 > Project: Solr > Issue Type: Bug > Components: Schema and Analysis >Affects Versions: 1.4 >Reporter: Jake Brownell >Assignee: Koji Sekiguchi >Priority: Trivial > Fix For: 1.5 > > Attachments: SOLR-1601.patch > > > My schema has a field defined as: > {noformat} > positionIncrementGap="100"> > > mapping="mapping-ISOLatin1Accent.txt"/> > > words="stopwords.txt" enablePositionIncrements="true" /> > generateWordParts="1" generateNumberParts="1" > catenateWords="1" catenateNumbers="1" catenateAll="0" > splitOnCaseChange="1" /> > > protected="protwords.txt" /> > > > > > mapping="mapping-ISOLatin1Accent.txt"/> > > synonyms="synonyms.txt" > ignoreCase="true" expand="true"/> > words="stopwords.txt" enablePositionIncrements="true" /> > > generateWordParts="1" generateNumberParts="1" > catenateWords="0" catenateNumbers="0" catenateAll="0" > splitOnCaseChange="1" /> > > protected="protwords.txt" /> > > > > > {noformat} > and when I view the field in the schema browser, I see: > {noformat} > Tokenized: true > Class Name: org.apache.solr.schema.TextField > Index Analyzer: org.apache.solr.analysis.TokenizerChain > Tokenizer Class: org.apache.solr.analysis.WhitespaceTokenizerFactory > Filters: > org.apache.solr.analysis.StopFilterFactory args:{words: stopwords.txt > ignoreCase: true enablePositionIncrements: true } > org.apache.solr.analysis.WordDelimiterFilterFactory args:{splitOnCaseChange: > 1 generateNumberParts: 1 catenateWords: 1 generateWordParts: 1 catenateAll: 0 > catenateNumbers: 1 } > org.apache.solr.analysis.LowerCaseFilterFactory args:{} > org.apache.solr.analysis.EnglishPorterFilterFactory args:{protected: > 
protwords.txt } > org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory args:{} > Query Analyzer: org.apache.solr.analysis.TokenizerChain > Tokenizer Class: org.apache.solr.analysis.WhitespaceTokenizerFactory > Filters: > org.apache.solr.analysis.SynonymFilterFactory args:{synonyms: synonyms.txt > expand: true ignoreCase: true } > org.apache.solr.analysis.StopFilterFactory args:{words: stopwords.txt > ignoreCase: true enablePositionIncrements: true } > org.apache.solr.analysis.WordDelimiterFilterFactory args:{splitOnCaseChange: > 1 generateNumberParts: 1 catenateWords: 0 generateWordParts: 1 catenateAll: 0 > catenateNumbers: 0 } > org.apache.solr.analysis.LowerCaseFilterFactory args:{} > org.apache.solr.analysis.EnglishPorterFilterFactory args:{protected: > protwords.txt } > org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory args:{} > {noformat} > It's not a big deal, but I expected to see some indication of the charFilter > that is in place. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1601) Schema browser does not indicate presence of charFilter
[ https://issues.apache.org/jira/browse/SOLR-1601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated SOLR-1601: - Attachment: SOLR-1601.patch Will commit shortly. > Schema browser does not indicate presence of charFilter > --- > > Key: SOLR-1601 > URL: https://issues.apache.org/jira/browse/SOLR-1601 > Project: Solr > Issue Type: Bug > Components: Schema and Analysis >Affects Versions: 1.4 >Reporter: Jake Brownell >Assignee: Koji Sekiguchi >Priority: Trivial > Fix For: 1.5 > > Attachments: SOLR-1601.patch > > > My schema has a field defined as: > {noformat} > positionIncrementGap="100"> > > mapping="mapping-ISOLatin1Accent.txt"/> > > words="stopwords.txt" enablePositionIncrements="true" /> > generateWordParts="1" generateNumberParts="1" > catenateWords="1" catenateNumbers="1" catenateAll="0" > splitOnCaseChange="1" /> > > protected="protwords.txt" /> > > > > > mapping="mapping-ISOLatin1Accent.txt"/> > > synonyms="synonyms.txt" > ignoreCase="true" expand="true"/> > words="stopwords.txt" enablePositionIncrements="true" /> > > generateWordParts="1" generateNumberParts="1" > catenateWords="0" catenateNumbers="0" catenateAll="0" > splitOnCaseChange="1" /> > > protected="protwords.txt" /> > > > > > {noformat} > and when I view the field in the schema browser, I see: > {noformat} > Tokenized: true > Class Name: org.apache.solr.schema.TextField > Index Analyzer: org.apache.solr.analysis.TokenizerChain > Tokenizer Class: org.apache.solr.analysis.WhitespaceTokenizerFactory > Filters: > org.apache.solr.analysis.StopFilterFactory args:{words: stopwords.txt > ignoreCase: true enablePositionIncrements: true } > org.apache.solr.analysis.WordDelimiterFilterFactory args:{splitOnCaseChange: > 1 generateNumberParts: 1 catenateWords: 1 generateWordParts: 1 catenateAll: 0 > catenateNumbers: 1 } > org.apache.solr.analysis.LowerCaseFilterFactory args:{} > org.apache.solr.analysis.EnglishPorterFilterFactory args:{protected: > protwords.txt } > 
org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory args:{} > Query Analyzer: org.apache.solr.analysis.TokenizerChain > Tokenizer Class: org.apache.solr.analysis.WhitespaceTokenizerFactory > Filters: > org.apache.solr.analysis.SynonymFilterFactory args:{synonyms: synonyms.txt > expand: true ignoreCase: true } > org.apache.solr.analysis.StopFilterFactory args:{words: stopwords.txt > ignoreCase: true enablePositionIncrements: true } > org.apache.solr.analysis.WordDelimiterFilterFactory args:{splitOnCaseChange: > 1 generateNumberParts: 1 catenateWords: 0 generateWordParts: 1 catenateAll: 0 > catenateNumbers: 0 } > org.apache.solr.analysis.LowerCaseFilterFactory args:{} > org.apache.solr.analysis.EnglishPorterFilterFactory args:{protected: > protwords.txt } > org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory args:{} > {noformat} > It's not a big deal, but I expected to see some indication of the charFilter > that is in place. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1601) Schema browser does not indicate presence of charFilter
[ https://issues.apache.org/jira/browse/SOLR-1601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated SOLR-1601: - Component/s: Schema and Analysis Affects Version/s: 1.4 Fix Version/s: 1.5 Assignee: Koji Sekiguchi > Schema browser does not indicate presence of charFilter > --- > > Key: SOLR-1601 > URL: https://issues.apache.org/jira/browse/SOLR-1601 > Project: Solr > Issue Type: Bug > Components: Schema and Analysis >Affects Versions: 1.4 >Reporter: Jake Brownell >Assignee: Koji Sekiguchi >Priority: Trivial > Fix For: 1.5 > > > My schema has a field defined as: > {noformat} > positionIncrementGap="100"> > > mapping="mapping-ISOLatin1Accent.txt"/> > > words="stopwords.txt" enablePositionIncrements="true" /> > generateWordParts="1" generateNumberParts="1" > catenateWords="1" catenateNumbers="1" catenateAll="0" > splitOnCaseChange="1" /> > > protected="protwords.txt" /> > > > > > mapping="mapping-ISOLatin1Accent.txt"/> > > synonyms="synonyms.txt" > ignoreCase="true" expand="true"/> > words="stopwords.txt" enablePositionIncrements="true" /> > > generateWordParts="1" generateNumberParts="1" > catenateWords="0" catenateNumbers="0" catenateAll="0" > splitOnCaseChange="1" /> > > protected="protwords.txt" /> > > > > > {noformat} > and when I view the field in the schema browser, I see: > {noformat} > Tokenized: true > Class Name: org.apache.solr.schema.TextField > Index Analyzer: org.apache.solr.analysis.TokenizerChain > Tokenizer Class: org.apache.solr.analysis.WhitespaceTokenizerFactory > Filters: > org.apache.solr.analysis.StopFilterFactory args:{words: stopwords.txt > ignoreCase: true enablePositionIncrements: true } > org.apache.solr.analysis.WordDelimiterFilterFactory args:{splitOnCaseChange: > 1 generateNumberParts: 1 catenateWords: 1 generateWordParts: 1 catenateAll: 0 > catenateNumbers: 1 } > org.apache.solr.analysis.LowerCaseFilterFactory args:{} > org.apache.solr.analysis.EnglishPorterFilterFactory args:{protected: > 
protwords.txt } > org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory args:{} > Query Analyzer: org.apache.solr.analysis.TokenizerChain > Tokenizer Class: org.apache.solr.analysis.WhitespaceTokenizerFactory > Filters: > org.apache.solr.analysis.SynonymFilterFactory args:{synonyms: synonyms.txt > expand: true ignoreCase: true } > org.apache.solr.analysis.StopFilterFactory args:{words: stopwords.txt > ignoreCase: true enablePositionIncrements: true } > org.apache.solr.analysis.WordDelimiterFilterFactory args:{splitOnCaseChange: > 1 generateNumberParts: 1 catenateWords: 0 generateWordParts: 1 catenateAll: 0 > catenateNumbers: 0 } > org.apache.solr.analysis.LowerCaseFilterFactory args:{} > org.apache.solr.analysis.EnglishPorterFilterFactory args:{protected: > protwords.txt } > org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory args:{} > {noformat} > It's not a big deal, but I expected to see some indication of the charFilter > that is in place. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1489) A UTF-8 character is output twice (Bug in Jetty)
[ https://issues.apache.org/jira/browse/SOLR-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated SOLR-1489: - Attachment: SOLR-1489.patch Attached patch fixes the above failure, but I got another failure (no expires header): {code} Testcase: testCacheVetoHandler took 3.29 sec Testcase: testCacheVetoException took 1.395 sec FAILED We got no Expires header junit.framework.AssertionFailedError: We got no Expires header at org.apache.solr.servlet.CacheHeaderTest.checkVetoHeaders(CacheHeaderTest.java:73) at org.apache.solr.servlet.CacheHeaderTest.testCacheVetoException(CacheHeaderTest.java:59) Testcase: testLastModified took 1.485 sec Testcase: testEtag took 1.577 sec Testcase: testCacheControl took 1.035 sec {code} > A UTF-8 character is output twice (Bug in Jetty) > > > Key: SOLR-1489 > URL: https://issues.apache.org/jira/browse/SOLR-1489 > Project: Solr > Issue Type: Bug > Environment: Jetty-6.1.3 > Jetty-6.1.21 > Jetty-7.0.0RC6 >Reporter: Jun Ohtani >Assignee: Koji Sekiguchi >Priority: Critical > Attachments: error_utf8-example.xml, jetty-6.1.22.jar, > jetty-util-6.1.22.jar, jettybugsample.war, jsp-2.1.zip, > servlet-api-2.5-20081211.jar, SOLR-1489.patch > > > A UTF-8 character is output twice under particular conditions. > Attach the sample data.(error_utf8-example.xml) > Registered only sample data, click the following URL. > http://localhost:8983/solr/select?q=*%3A*&version=2.2&start=0&rows=10&omitHeader=true&fl=attr_json&wt=json > Sample data is only "B", but response is "BB". > When wt=phps, error occurs in PHP unsrialize() function. > This bug is like a bug in Jetty. > jettybugsample.war is the simplest one to reproduce the problem. > Copy example/webapps, and start Jetty server, and click the following URL. > http://localhost:8983/jettybugsample/filter/hoge > Like earlier, B is output twice. Sysout only B once. > I have tested this on Jetty 6.1.3 and 6.1.21, 7.0.0rc6. 
> (When testing with 6.1.21or 7.0.0rc6, change "bufsize" from 128 to 512 in > web.xml. ) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1489) A UTF-8 character is output twice (Bug in Jetty)
[ https://issues.apache.org/jira/browse/SOLR-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782335#action_12782335 ] Koji Sekiguchi commented on SOLR-1489: -- Thanks, Ohtani-san. Using these new jetty jars (6.1.22), I run ant test, but I got a failure: {code:title=TEST-org.apache.solr.servlet.CacheHeaderTest.txt} Testcase: testCacheVetoHandler took 2.469 sec Testcase: testCacheVetoException took 1.25 sec FAILED null expected:<[no-cache, ]no-store> but was:<[must-revalidate,no-cache,]no-store> junit.framework.ComparisonFailure: null expected:<[no-cache, ]no-store> but was:<[must-revalidate,no-cache,]no-store> at org.apache.solr.servlet.CacheHeaderTest.checkVetoHeaders(CacheHeaderTest.java:65) at org.apache.solr.servlet.CacheHeaderTest.testCacheVetoException(CacheHeaderTest.java:59) Testcase: testLastModified took 1.188 sec Testcase: testEtag took 1.11 sec Testcase: testCacheControl took 1.391 sec {code} According to SOLR-632, the cache header related test was failed when we used jetty-6.1.11, Lars filed https://jira.codehaus.org/browse/JETTY-646. Now the issue has been fixed, I thought jetty-6.1.22 should work. I've not looked into the details of cache header test, though. > A UTF-8 character is output twice (Bug in Jetty) > > > Key: SOLR-1489 > URL: https://issues.apache.org/jira/browse/SOLR-1489 > Project: Solr > Issue Type: Bug > Environment: Jetty-6.1.3 > Jetty-6.1.21 > Jetty-7.0.0RC6 >Reporter: Jun Ohtani >Assignee: Koji Sekiguchi >Priority: Critical > Attachments: error_utf8-example.xml, jetty-6.1.22.jar, > jetty-util-6.1.22.jar, jettybugsample.war, jsp-2.1.zip, > servlet-api-2.5-20081211.jar > > > A UTF-8 character is output twice under particular conditions. > Attach the sample data.(error_utf8-example.xml) > Registered only sample data, click the following URL. 
> http://localhost:8983/solr/select?q=*%3A*&version=2.2&start=0&rows=10&omitHeader=true&fl=attr_json&wt=json > Sample data is only "B", but response is "BB". > When wt=phps, error occurs in PHP unsrialize() function. > This bug is like a bug in Jetty. > jettybugsample.war is the simplest one to reproduce the problem. > Copy example/webapps, and start Jetty server, and click the following URL. > http://localhost:8983/jettybugsample/filter/hoge > Like earlier, B is output twice. Sysout only B once. > I have tested this on Jetty 6.1.3 and 6.1.21, 7.0.0rc6. > (When testing with 6.1.21or 7.0.0rc6, change "bufsize" from 128 to 512 in > web.xml. ) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1489) A UTF-8 character is output twice (Bug in Jetty)
[ https://issues.apache.org/jira/browse/SOLR-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779814#action_12779814 ] Koji Sekiguchi commented on SOLR-1489: -- Ok, http://jira.codehaus.org/browse/JETTY-1122 has been marked as fixed and jetty 6.1.22 released. Ohtani-san, can you test the new jetty with your test case to see the bug is gone? Thanks. > A UTF-8 character is output twice (Bug in Jetty) > > > Key: SOLR-1489 > URL: https://issues.apache.org/jira/browse/SOLR-1489 > Project: Solr > Issue Type: Bug > Environment: Jetty-6.1.3 > Jetty-6.1.21 > Jetty-7.0.0RC6 >Reporter: Jun Ohtani >Assignee: Koji Sekiguchi >Priority: Critical > Attachments: error_utf8-example.xml, jettybugsample.war > > > A UTF-8 character is output twice under particular conditions. > Attach the sample data.(error_utf8-example.xml) > Registered only sample data, click the following URL. > http://localhost:8983/solr/select?q=*%3A*&version=2.2&start=0&rows=10&omitHeader=true&fl=attr_json&wt=json > Sample data is only "B", but response is "BB". > When wt=phps, error occurs in PHP unsrialize() function. > This bug is like a bug in Jetty. > jettybugsample.war is the simplest one to reproduce the problem. > Copy example/webapps, and start Jetty server, and click the following URL. > http://localhost:8983/jettybugsample/filter/hoge > Like earlier, B is output twice. Sysout only B once. > I have tested this on Jetty 6.1.3 and 6.1.21, 7.0.0rc6. > (When testing with 6.1.21or 7.0.0rc6, change "bufsize" from 128 to 512 in > web.xml. ) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1506) Search multiple cores using MultiReader
[ https://issues.apache.org/jira/browse/SOLR-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773213#action_12773213 ] Koji Sekiguchi commented on SOLR-1506: -- bq. Commit doesn't work because reopen isn't supported by MultiReader. Regarding MultiReader and reopen, I've set reopenReaders to false: {code:title=solrconfig.xml} false : {code} > Search multiple cores using MultiReader > --- > > Key: SOLR-1506 > URL: https://issues.apache.org/jira/browse/SOLR-1506 > Project: Solr > Issue Type: Improvement > Components: search >Affects Versions: 1.4 >Reporter: Jason Rutherglen >Priority: Trivial > Fix For: 1.5 > > Attachments: SOLR-1506.patch, SOLR-1506.patch > > > I need to search over multiple cores, and SOLR-1477 is more > complicated than expected, so here we'll create a MultiReader > over the cores to allow searching on them. > Maybe in the future we can add parallel searching however > SOLR-1477, if it gets completed, provides that out of the box. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-822) CharFilter - normalize characters before tokenizer
[ https://issues.apache.org/jira/browse/SOLR-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12769741#action_12769741 ] Koji Sekiguchi commented on SOLR-822: - bq. Please update the Wiki for this feature. Done. :) > CharFilter - normalize characters before tokenizer > -- > > Key: SOLR-822 > URL: https://issues.apache.org/jira/browse/SOLR-822 > Project: Solr > Issue Type: New Feature > Components: Analysis >Affects Versions: 1.3 >Reporter: Koji Sekiguchi >Assignee: Koji Sekiguchi >Priority: Minor > Fix For: 1.4 > > Attachments: character-normalization.JPG, > japanese-h-to-k-mapping.txt, sample_mapping_ja.txt, sample_mapping_ja.txt, > SOLR-822-for-1.3.patch, SOLR-822-renameMethod.patch, SOLR-822.patch, > SOLR-822.patch, SOLR-822.patch, SOLR-822.patch, SOLR-822.patch > > > A new plugin which can be placed in front of . > {code:xml} > positionIncrementGap="100" > > > mapping="mapping_ja.txt" /> > > words="stopwords.txt"/> > > > > {code} > can be multiple (chained). I'll post a JPEG file to show > character normalization sample soon. > MOTIVATION: > In Japan, there are two types of tokenizers -- N-gram (CJKTokenizer) and > Morphological Analyzer. > When we use morphological analyzer, because the analyzer uses Japanese > dictionary to detect terms, > we need to normalize characters before tokenization. > I'll post a patch soon, too. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
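The character normalization that SOLR-822 describes (mapping characters before the tokenizer sees them, as a MappingCharFilter does with a file like mapping-ISOLatin1Accent.txt) can be sketched in Python. The mapping table and function name below are illustrative assumptions, not Solr's implementation, which also supports multi-character mappings:

```python
# Minimal sketch of a mapping char filter: normalize characters
# before tokenization. This tiny table stands in for a mapping
# file such as mapping-ISOLatin1Accent.txt.
MAPPING = {
    "\u00e9": "e",  # é -> e
    "\u00e8": "e",  # è -> e
    "\u00e0": "a",  # à -> a
    "\u00ef": "i",  # ï -> i
}

def map_chars(text, mapping=MAPPING):
    # Apply single-character mappings; unmapped characters pass through
    # unchanged, so the tokenizer downstream sees normalized input.
    return "".join(mapping.get(ch, ch) for ch in text)
```

In a morphological-analysis pipeline, as the comment notes for Japanese, this normalization runs ahead of the tokenizer so dictionary lookups see canonical character forms.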
[jira] Updated: (SOLR-551) Solr replication should include the schema also
[ https://issues.apache.org/jira/browse/SOLR-551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated SOLR-551: Component/s: (was: replication (scripts)) replication (java) change component from scripts to java > Solr replication should include the schema also > --- > > Key: SOLR-551 > URL: https://issues.apache.org/jira/browse/SOLR-551 > Project: Solr > Issue Type: Improvement > Components: replication (java) >Affects Versions: 1.4 >Reporter: Noble Paul >Assignee: Shalin Shekhar Mangar > Fix For: 1.4 > > > The current Solr replication just copy the data directory . So if the > schema changes and I do a re-index it will blissfully copy the index > and the slaves will fail because of incompatible schema. > So the steps we follow are > * Stop rsync on slaves > * Update the master with new schema > * re-index data > * forEach slave > ** Kill the slave > ** clean the data directory > ** install the new schema > ** restart > ** do a manual snappull > The amount of work the admin needs to do is quite significant > (depending on the no:of slaves). These are manual steps and very error > prone > The solution : > Make the replication mechanism handle the schema replication also. So > all I need to do is to just change the master and the slaves synch > automatically > What is a good way to implement this? > We have an idea along the following lines > This should involve changes to the snapshooter and snappuller scripts > and the snapinstaller components > Everytime the snapshooter takes a snapshot it must keep the timestamps > of schema.xml and elevate.xml (all the files which might affect the > runtime behavior in slaves) > For subsequent snapshots if the timestamps of any of them is changed > it must copy the all of them also for replication. 
> The snappuller copies the new directory as usual > The snapinstaller checks if these config files are present , > if yes, > * It can create a temporary core > * install the changed index and configuration > * load it completely and swap it out with the original core -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-561) Solr replication by Solr (for windows also)
[ https://issues.apache.org/jira/browse/SOLR-561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated SOLR-561: Component/s: (was: replication (scripts)) replication (java) change component from scripts to java > Solr replication by Solr (for windows also) > --- > > Key: SOLR-561 > URL: https://issues.apache.org/jira/browse/SOLR-561 > Project: Solr > Issue Type: New Feature > Components: replication (java) >Affects Versions: 1.4 > Environment: All >Reporter: Noble Paul >Assignee: Shalin Shekhar Mangar > Fix For: 1.4 > > Attachments: deletion_policy.patch, SOLR-561-core.patch, > SOLR-561-fixes.patch, SOLR-561-fixes.patch, SOLR-561-fixes.patch, > SOLR-561-full.patch, SOLR-561-full.patch, SOLR-561-full.patch, > SOLR-561-full.patch, SOLR-561.patch, SOLR-561.patch, SOLR-561.patch, > SOLR-561.patch, SOLR-561.patch, SOLR-561.patch, SOLR-561.patch, > SOLR-561.patch, SOLR-561.patch, SOLR-561.patch, SOLR-561.patch, > SOLR-561.patch, SOLR-561.patch, SOLR-561.patch > > > The current replication strategy in Solr involves shell scripts. The > following are the drawbacks of this approach: > * It does not work on Windows > * Replication works as a separate piece, not integrated with Solr. > * Cannot control replication from the Solr admin/JMX > * Each operation requires a manual telnet to the host > Doing the replication in Java has the following advantages: > * Platform independence > * Manual steps can be completely eliminated. Everything can be driven from > solrconfig.xml. > ** Adding the URL of the master in the slaves should be enough to enable > replication. Other things like the frequency of > snapshoot/snappull can also be configured. All other information can be > obtained automatically. > * Start/stop can be triggered from solr/admin or JMX > * Can get the status/progress while replication is going on. 
It can also > abort an ongoing replication > * No need to log in to the machine > * From a development perspective, we can unit test it > This issue can track the implementation of Solr replication in Java -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
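The "adding the url of the master in the slaves should be good enough" point above can be sketched as a solrconfig.xml fragment. The handler name matches the stock /replication handler; the host name and poll interval are illustrative:

```xml
<!-- Sketch of a slave configuration: pointing at the master's
     /replication URL is enough to enable pull replication. -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://master-host:8983/solr/replication</str>
    <!-- hh:mm:ss between polls of the master for a new index version -->
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>
```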
[jira] Resolved: (SOLR-1099) FieldAnalysisRequestHandler
[ https://issues.apache.org/jira/browse/SOLR-1099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi resolved SOLR-1099. -- Resolution: Fixed Committed revision 827032. Thanks. > FieldAnalysisRequestHandler > --- > > Key: SOLR-1099 > URL: https://issues.apache.org/jira/browse/SOLR-1099 > Project: Solr > Issue Type: New Feature > Components: Analysis >Affects Versions: 1.3 >Reporter: Uri Boness >Assignee: Koji Sekiguchi > Fix For: 1.4 > > Attachments: AnalisysRequestHandler_refactored.patch, > analysis_request_handlers_incl_solrj.patch, > AnalysisRequestHandler_refactored1.patch, > FieldAnalysisRequestHandler_incl_test.patch, > SOLR-1099-ordered-TokenizerChain.patch, SOLR-1099.patch, SOLR-1099.patch, > SOLR-1099.patch > > > The FieldAnalysisRequestHandler provides the analysis functionality of the > web admin page as a service. This handler accepts a fieldtype/fieldname > parameter and a value, and as a response returns a breakdown of the analysis > process. It is also possible to send a query value, which will use the > configured query analyzer, as well as a showmatch parameter, which will then > mark every matched token as a match. > If this handler is added to the code base, I also recommend renaming the > current AnalysisRequestHandler to DocumentAnalysisRequestHandler and having > them both inherit from one AnalysisRequestHandlerBase class which provides > the common functionality of the analysis breakdown and its translation to > named lists. This will also enhance the current AnalysisRequestHandler, which > right now is fairly simplistic. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
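A sketch of how the committed handler might be registered and called. The handler name and the analysis.* parameter names below follow the discussion on this issue, but treat the exact spellings as illustrative:

```xml
<!-- Sketch: registering the handler in solrconfig.xml. -->
<requestHandler name="/analysis/field"
                class="solr.FieldAnalysisRequestHandler" />

<!-- Example request (URL shown as a comment; values illustrative):
     /analysis/field?analysis.fieldname=text
                    &analysis.fieldvalue=Running
                    &analysis.query=run
                    &analysis.showmatch=true
     The response breaks the value down stage by stage through the
     field's tokenizer and filters, flagging tokens that match the query. -->
```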
[jira] Updated: (SOLR-1099) FieldAnalysisRequestHandler
[ https://issues.apache.org/jira/browse/SOLR-1099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated SOLR-1099: - Attachment: SOLR-1099-ordered-TokenizerChain.patch I'd like to use NamedList rather than SimpleOrderedMap. If there are no objections, I'll commit soon. All tests pass. > FieldAnalysisRequestHandler > --- > > Key: SOLR-1099 > URL: https://issues.apache.org/jira/browse/SOLR-1099 > Project: Solr > Issue Type: New Feature > Components: Analysis >Affects Versions: 1.3 >Reporter: Uri Boness >Assignee: Koji Sekiguchi > Fix For: 1.4 > > Attachments: AnalisysRequestHandler_refactored.patch, > analysis_request_handlers_incl_solrj.patch, > AnalysisRequestHandler_refactored1.patch, > FieldAnalysisRequestHandler_incl_test.patch, > SOLR-1099-ordered-TokenizerChain.patch, SOLR-1099.patch, SOLR-1099.patch, > SOLR-1099.patch > > > The FieldAnalysisRequestHandler provides the analysis functionality of the > web admin page as a service. This handler accepts a fieldtype/fieldname > parameter and a value, and as a response returns a breakdown of the analysis > process. It is also possible to send a query value, which will use the > configured query analyzer, as well as a showmatch parameter, which will then > mark every matched token as a match. > If this handler is added to the code base, I also recommend renaming the > current AnalysisRequestHandler to DocumentAnalysisRequestHandler and having > them both inherit from one AnalysisRequestHandlerBase class which provides > the common functionality of the analysis breakdown and its translation to > named lists. This will also enhance the current AnalysisRequestHandler, which > right now is fairly simplistic. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Reopened: (SOLR-1099) FieldAnalysisRequestHandler
[ https://issues.apache.org/jira/browse/SOLR-1099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi reopened SOLR-1099: -- Assignee: Koji Sekiguchi (was: Shalin Shekhar Mangar) Hmm, I think the order of the Tokenizer/TokenFilters in the response was not taken into account. For example, I cannot retrieve the Tokenizer/TokenFilters from the ruby response in order... > FieldAnalysisRequestHandler > --- > > Key: SOLR-1099 > URL: https://issues.apache.org/jira/browse/SOLR-1099 > Project: Solr > Issue Type: New Feature > Components: Analysis >Affects Versions: 1.3 >Reporter: Uri Boness >Assignee: Koji Sekiguchi > Fix For: 1.4 > > Attachments: AnalisysRequestHandler_refactored.patch, > analysis_request_handlers_incl_solrj.patch, > AnalysisRequestHandler_refactored1.patch, > FieldAnalysisRequestHandler_incl_test.patch, SOLR-1099.patch, > SOLR-1099.patch, SOLR-1099.patch > > > The FieldAnalysisRequestHandler provides the analysis functionality of the > web admin page as a service. This handler accepts a fieldtype/fieldname > parameter and a value, and as a response returns a breakdown of the analysis > process. It is also possible to send a query value, which will use the > configured query analyzer, as well as a showmatch parameter, which will then > mark every matched token as a match. > If this handler is added to the code base, I also recommend renaming the > current AnalysisRequestHandler to DocumentAnalysisRequestHandler and having > them both inherit from one AnalysisRequestHandlerBase class which provides > the common functionality of the analysis breakdown and its translation to > named lists. This will also enhance the current AnalysisRequestHandler, which > right now is fairly simplistic. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (SOLR-1515) Javadoc typo in SolrQueryResponse
[ https://issues.apache.org/jira/browse/SOLR-1515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi resolved SOLR-1515. -- Resolution: Fixed Committed revision 826321. Thanks. > Javadoc typo in SolrQueryResponse > - > > Key: SOLR-1515 > URL: https://issues.apache.org/jira/browse/SOLR-1515 > Project: Solr > Issue Type: Improvement > Components: search >Affects Versions: 1.3 > Environment: my local MacBook pro >Reporter: Chris A. Mattmann >Priority: Trivial > Fix For: 1.4 > > Attachments: SOLR-1515.101709.Mattmann.patch.txt > > > There is a minute typo in the javadoc for > o.a.s.request.SolrQueryResponse.java. This patch fixes that. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1515) Javadoc typo in SolrQueryResponse
[ https://issues.apache.org/jira/browse/SOLR-1515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated SOLR-1515: - Fix Version/s: (was: 1.5) 1.4 > Javadoc typo in SolrQueryResponse > - > > Key: SOLR-1515 > URL: https://issues.apache.org/jira/browse/SOLR-1515 > Project: Solr > Issue Type: Improvement > Components: search >Affects Versions: 1.3 > Environment: my local MacBook pro >Reporter: Chris A. Mattmann >Priority: Trivial > Fix For: 1.4 > > Attachments: SOLR-1515.101709.Mattmann.patch.txt > > > There is a minute typo in the javadoc for > o.a.s.request.SolrQueryResponse.java. This patch fixes that. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (SOLR-670) UpdateHandler must provide a rollback feature
[ https://issues.apache.org/jira/browse/SOLR-670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi resolved SOLR-670. - Resolution: Fixed Committed revision 824380. > UpdateHandler must provide a rollback feature > - > > Key: SOLR-670 > URL: https://issues.apache.org/jira/browse/SOLR-670 > Project: Solr > Issue Type: New Feature > Components: search >Affects Versions: 1.3 >Reporter: Noble Paul >Assignee: Koji Sekiguchi > Fix For: 1.4 > > Attachments: SOLR-670-revert-cumulative-counts.patch, SOLR-670.patch, > SOLR-670.patch, SOLR-670.patch, SOLR-670.patch, SOLR-670.patch > > > Lucene IndexWriter already has a rollback method. There should be a > counterpart for the same in _UpdateHandler_ so that users can do a rollback > over http -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
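A sketch of what the HTTP-level usage could look like with this committed: a minimal update message POSTed to the /update handler. The element name follows this issue's discussion, but treat the exact endpoint and syntax as illustrative:

```xml
<!-- Sketch: POSTing this update message to /update (with
     Content-type: text/xml) asks the UpdateHandler to discard all
     uncommitted adds and deletes since the last commit, mirroring
     Lucene's IndexWriter.rollback(). -->
<rollback/>
```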
[jira] Updated: (SOLR-670) UpdateHandler must provide a rollback feature
[ https://issues.apache.org/jira/browse/SOLR-670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated SOLR-670: Attachment: SOLR-670-revert-cumulative-counts.patch The fix and test case. I'll commit soon. > UpdateHandler must provide a rollback feature > - > > Key: SOLR-670 > URL: https://issues.apache.org/jira/browse/SOLR-670 > Project: Solr > Issue Type: New Feature > Components: search >Affects Versions: 1.3 >Reporter: Noble Paul >Assignee: Koji Sekiguchi > Fix For: 1.4 > > Attachments: SOLR-670-revert-cumulative-counts.patch, SOLR-670.patch, > SOLR-670.patch, SOLR-670.patch, SOLR-670.patch, SOLR-670.patch > > > Lucene IndexWriter already has a rollback method. There should be a > counterpart for the same in _UpdateHandler_ so that users can do a rollback > over http -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Reopened: (SOLR-670) UpdateHandler must provide a rollback feature
[ https://issues.apache.org/jira/browse/SOLR-670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi reopened SOLR-670: - Assignee: Koji Sekiguchi (was: Shalin Shekhar Mangar) Rollback should reset not only adds/deletesById/deletesByQuery counts but also cumulative counts of them. > UpdateHandler must provide a rollback feature > - > > Key: SOLR-670 > URL: https://issues.apache.org/jira/browse/SOLR-670 > Project: Solr > Issue Type: New Feature > Components: search >Affects Versions: 1.3 >Reporter: Noble Paul >Assignee: Koji Sekiguchi > Fix For: 1.4 > > Attachments: SOLR-670.patch, SOLR-670.patch, SOLR-670.patch, > SOLR-670.patch, SOLR-670.patch > > > Lucene IndexWriter already has a rollback method. There should be a > counterpart for the same in _UpdateHandler_ so that users can do a rollback > over http -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (SOLR-1504) empty char mapping can cause ArrayIndexOutOfBoundsException in analysis.jsp and co.
[ https://issues.apache.org/jira/browse/SOLR-1504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi resolved SOLR-1504. -- Resolution: Fixed Committed revision 824045. > empty char mapping can cause ArrayIndexOutOfBoundsException in analysis.jsp > and co. > --- > > Key: SOLR-1504 > URL: https://issues.apache.org/jira/browse/SOLR-1504 > Project: Solr > Issue Type: Bug > Components: Analysis >Affects Versions: 1.4 >Reporter: Koji Sekiguchi >Assignee: Koji Sekiguchi >Priority: Minor > Fix For: 1.4 > > Attachments: SOLR-1504.patch > > > If you have the following mapping rule in mapping.txt: > {code} > # destination can be empty > "NULL" => "" > {code} > you can get AIOOBE by specifying NULL for either index or query data in the > input form of analysis.jsp (and co. i.e. DocumentAnalysisRequestHandler and > FieldAnalysisRequestHandler). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1504) empty char mapping can cause ArrayIndexOutOfBoundsException in analysis.jsp and co.
[ https://issues.apache.org/jira/browse/SOLR-1504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated SOLR-1504: - Attachment: SOLR-1504.patch A patch for the fix. Will commit soon. > empty char mapping can cause ArrayIndexOutOfBoundsException in analysis.jsp > and co. > --- > > Key: SOLR-1504 > URL: https://issues.apache.org/jira/browse/SOLR-1504 > Project: Solr > Issue Type: Bug > Components: Analysis >Affects Versions: 1.4 >Reporter: Koji Sekiguchi >Assignee: Koji Sekiguchi >Priority: Minor > Fix For: 1.4 > > Attachments: SOLR-1504.patch > > > If you have the following mapping rule in mapping.txt: > {code} > # destination can be empty > "NULL" => "" > {code} > you can get AIOOBE by specifying NULL for either index or query data in the > input form of analysis.jsp (and co. i.e. DocumentAnalysisRequestHandler and > FieldAnalysisRequestHandler). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (SOLR-1504) empty char mapping can cause ArrayIndexOutOfBoundsException in analysis.jsp and co.
empty char mapping can cause ArrayIndexOutOfBoundsException in analysis.jsp and co. --- Key: SOLR-1504 URL: https://issues.apache.org/jira/browse/SOLR-1504 Project: Solr Issue Type: Bug Components: Analysis Affects Versions: 1.4 Reporter: Koji Sekiguchi Assignee: Koji Sekiguchi Priority: Minor Fix For: 1.4 If you have the following mapping rule in mapping.txt: {code} # destination can be empty "NULL" => "" {code} you can get AIOOBE by specifying NULL for either index or query data in the input form of analysis.jsp (and co. i.e. DocumentAnalysisRequestHandler and FieldAnalysisRequestHandler). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1268) Incorporate Lucene's FastVectorHighlighter
[ https://issues.apache.org/jira/browse/SOLR-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated SOLR-1268: - Fix Version/s: 1.5 Marking it for 1.5 because there are no patches. > Incorporate Lucene's FastVectorHighlighter > -- > > Key: SOLR-1268 > URL: https://issues.apache.org/jira/browse/SOLR-1268 > Project: Solr > Issue Type: New Feature > Components: highlighter >Reporter: Koji Sekiguchi >Assignee: Koji Sekiguchi >Priority: Minor > Fix For: 1.5 > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (SOLR-1268) Incorporate Lucene's FastVectorHighlighter
[ https://issues.apache.org/jira/browse/SOLR-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi reassigned SOLR-1268: Assignee: Koji Sekiguchi > Incorporate Lucene's FastVectorHighlighter > -- > > Key: SOLR-1268 > URL: https://issues.apache.org/jira/browse/SOLR-1268 > Project: Solr > Issue Type: New Feature > Components: highlighter >Reporter: Koji Sekiguchi >Assignee: Koji Sekiguchi >Priority: Minor > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (SOLR-1489) A UTF-8 character is output twice (Bug in Jetty)
[ https://issues.apache.org/jira/browse/SOLR-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi reassigned SOLR-1489: Assignee: Koji Sekiguchi > A UTF-8 character is output twice (Bug in Jetty) > > > Key: SOLR-1489 > URL: https://issues.apache.org/jira/browse/SOLR-1489 > Project: Solr > Issue Type: Bug > Environment: Jetty-6.1.3 > Jetty-6.1.21 > Jetty-7.0.0RC6 >Reporter: Jun Ohtani >Assignee: Koji Sekiguchi >Priority: Critical > Attachments: error_utf8-example.xml, jettybugsample.war > > > A UTF-8 character is output twice under particular conditions. > The sample data is attached (error_utf8-example.xml). > After registering only the sample data, click the following URL: > http://localhost:8983/solr/select?q=*%3A*&version=2.2&start=0&rows=10&omitHeader=true&fl=attr_json&wt=json > The sample data is only "B", but the response is "BB". > When wt=phps, an error occurs in the PHP unserialize() function. > This looks like a bug in Jetty. > jettybugsample.war is the simplest way to reproduce the problem. > Copy it to example/webapps, start the Jetty server, and click the following URL: > http://localhost:8983/jettybugsample/filter/hoge > As before, B is output twice, while System.out prints B only once. > I have tested this on Jetty 6.1.3, 6.1.21, and 7.0.0rc6. > (When testing with 6.1.21 or 7.0.0rc6, change "bufsize" from 128 to 512 in > web.xml.) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1489) A UTF-8 character is output twice (Bug in Jetty)
[ https://issues.apache.org/jira/browse/SOLR-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761900#action_12761900 ] Koji Sekiguchi commented on SOLR-1489: -- Good catch, Ohtani-san! I can reproduce the problem with the data and the filter you attached when running them on Jetty. And thank you for opening the JIRA ticket in Jetty. Now that we are close to releasing 1.4, I don't want this to be a blocker, because this is not a Solr bug, as you said. You can run Solr on arbitrary servlet containers other than Jetty if you'd like. I'd like to keep this issue open and watch http://jira.codehaus.org/browse/JETTY-1122 . Thanks. > A UTF-8 character is output twice (Bug in Jetty) > > > Key: SOLR-1489 > URL: https://issues.apache.org/jira/browse/SOLR-1489 > Project: Solr > Issue Type: Bug > Environment: Jetty-6.1.3 > Jetty-6.1.21 > Jetty-7.0.0RC6 >Reporter: Jun Ohtani >Priority: Critical > Attachments: error_utf8-example.xml, jettybugsample.war > > > A UTF-8 character is output twice under particular conditions. > The sample data is attached (error_utf8-example.xml). > After registering only the sample data, click the following URL: > http://localhost:8983/solr/select?q=*%3A*&version=2.2&start=0&rows=10&omitHeader=true&fl=attr_json&wt=json > The sample data is only "B", but the response is "BB". > When wt=phps, an error occurs in the PHP unserialize() function. > This looks like a bug in Jetty. > jettybugsample.war is the simplest way to reproduce the problem. > Copy it to example/webapps, start the Jetty server, and click the following URL: > http://localhost:8983/jettybugsample/filter/hoge > As before, B is output twice, while System.out prints B only once. > I have tested this on Jetty 6.1.3, 6.1.21, and 7.0.0rc6. > (When testing with 6.1.21 or 7.0.0rc6, change "bufsize" from 128 to 512 in > web.xml.) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (SOLR-1481) phps writer ignores omitHeader parameter
phps writer ignores omitHeader parameter Key: SOLR-1481 URL: https://issues.apache.org/jira/browse/SOLR-1481 Project: Solr Issue Type: Bug Components: search Reporter: Koji Sekiguchi Priority: Trivial Fix For: 1.4 My co-worker found this one. I'm expecting a patch will be attached soon by him. :) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (SOLR-1423) Lucene 2.9 RC4 may need some changes in Solr Analyzers using CharStream & others
[ https://issues.apache.org/jira/browse/SOLR-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi resolved SOLR-1423. -- Resolution: Fixed Committed revision 816502. Thanks, Uwe! > Lucene 2.9 RC4 may need some changes in Solr Analyzers using CharStream & > others > > > Key: SOLR-1423 > URL: https://issues.apache.org/jira/browse/SOLR-1423 > Project: Solr > Issue Type: Task > Components: Analysis >Affects Versions: 1.4 >Reporter: Uwe Schindler >Assignee: Koji Sekiguchi > Fix For: 1.4 > > Attachments: SOLR-1423-FieldType.patch, > SOLR-1423-fix-empty-tokens.patch, SOLR-1423-fix-empty-tokens.patch, > SOLR-1423-with-empty-tokens.patch, SOLR-1423.patch, SOLR-1423.patch, > SOLR-1423.patch > > > Because of some backwards compatibility problems (LUCENE-1906) we changed the > CharStream/CharFilter API a little bit. Tokenizer now only has a input field > of type java.io.Reader (as before the CharStream code). To correct offsets, > it is now needed to call the Tokenizer.correctOffset(int) method, which > delegates to the CharStream (if input is subclass of CharStream), else > returns an uncorrected offset. Normally it is enough to change all occurences > of input.correctOffset() to this.correctOffset() in Tokenizers. It should > also be checked, if custom Tokenizers in Solr do correct their offsets. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1423) Lucene 2.9 RC4 may need some changes in Solr Analyzers using CharStream & others
[ https://issues.apache.org/jira/browse/SOLR-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756923#action_12756923 ] Koji Sekiguchi commented on SOLR-1423: -- The patch looks good! Will commit shortly. > Lucene 2.9 RC4 may need some changes in Solr Analyzers using CharStream & > others > > > Key: SOLR-1423 > URL: https://issues.apache.org/jira/browse/SOLR-1423 > Project: Solr > Issue Type: Task > Components: Analysis >Affects Versions: 1.4 >Reporter: Uwe Schindler >Assignee: Koji Sekiguchi > Fix For: 1.4 > > Attachments: SOLR-1423-FieldType.patch, > SOLR-1423-fix-empty-tokens.patch, SOLR-1423-fix-empty-tokens.patch, > SOLR-1423-with-empty-tokens.patch, SOLR-1423.patch, SOLR-1423.patch, > SOLR-1423.patch > > > Because of some backwards compatibility problems (LUCENE-1906) we changed the > CharStream/CharFilter API a little bit. Tokenizer now only has a input field > of type java.io.Reader (as before the CharStream code). To correct offsets, > it is now needed to call the Tokenizer.correctOffset(int) method, which > delegates to the CharStream (if input is subclass of CharStream), else > returns an uncorrected offset. Normally it is enough to change all occurences > of input.correctOffset() to this.correctOffset() in Tokenizers. It should > also be checked, if custom Tokenizers in Solr do correct their offsets. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1423) Lucene 2.9 RC4 may need some changes in Solr Analyzers using CharStream & others
[ https://issues.apache.org/jira/browse/SOLR-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated SOLR-1423: - Attachment: SOLR-1423.patch The patch is Uwe's one, with the split()/group() methods replaced. bq. Why does the PatternTokenizer not have the methods newToken and so on in its own class Yeah, I'd realized it immediately after posting the patch, but I was going to be out. And thank you for adapting it for the new TokenStream API. bq. I searched for setOffset() in Solr source code and found one additional occurence of it without offset correcting in FieldType.java. This patch fixes this. Good catch, Uwe! I slipped over it. I think the empty tokens are a bug and should be omitted in this patch. bq. A second thing: Lucene has a new BaseTokenStreamTest class for checking tokens without Token instances (which would no loger work, when Lucene 3.0 switches to Attributes only). Maybe you should update these test and use assertAnalyzesTo from the new base class instead. Very nice! Can you open a separate ticket? > Lucene 2.9 RC4 may need some changes in Solr Analyzers using CharStream & > others > > > Key: SOLR-1423 > URL: https://issues.apache.org/jira/browse/SOLR-1423 > Project: Solr > Issue Type: Task > Components: Analysis >Affects Versions: 1.4 >Reporter: Uwe Schindler >Assignee: Koji Sekiguchi > Fix For: 1.4 > > Attachments: SOLR-1423-FieldType.patch, SOLR-1423.patch, > SOLR-1423.patch, SOLR-1423.patch > > > Because of some backwards compatibility problems (LUCENE-1906) we changed the > CharStream/CharFilter API a little bit. Tokenizer now only has an input field > of type java.io.Reader (as before the CharStream code). To correct offsets, > it is now necessary to call the Tokenizer.correctOffset(int) method, which > delegates to the CharStream (if input is a subclass of CharStream), else > returns an uncorrected offset. Normally it is enough to change all occurrences > of input.correctOffset() to this.correctOffset() in Tokenizers. It should > also be checked whether custom Tokenizers in Solr correct their offsets. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1423) Lucene 2.9 RC4 may need some changes in Solr Analyzers using CharStream & others
[ https://issues.apache.org/jira/browse/SOLR-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated SOLR-1423: - Attachment: SOLR-1423.patch I thought I would call tokenizer.correctOffset() in the newToken() method, but I couldn't because the method is protected. In this patch, I converted the anonymous Tokenizer class to PatternTokenizer, and PatternTokenizer has the following: {code} +public int correct( int currentOffset ){ + return correctOffset( currentOffset ); +} {code} > Lucene 2.9 RC4 may need some changes in Solr Analyzers using CharStream & > others > > > Key: SOLR-1423 > URL: https://issues.apache.org/jira/browse/SOLR-1423 > Project: Solr > Issue Type: Task > Components: Analysis >Affects Versions: 1.4 >Reporter: Uwe Schindler >Assignee: Koji Sekiguchi > Fix For: 1.4 > > Attachments: SOLR-1423.patch > > > Because of some backwards compatibility problems (LUCENE-1906) we changed the > CharStream/CharFilter API a little bit. Tokenizer now only has an input field > of type java.io.Reader (as before the CharStream code). To correct offsets, > it is now necessary to call the Tokenizer.correctOffset(int) method, which > delegates to the CharStream (if input is a subclass of CharStream), else > returns an uncorrected offset. Normally it is enough to change all occurrences > of input.correctOffset() to this.correctOffset() in Tokenizers. It should > also be checked whether custom Tokenizers in Solr correct their offsets. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
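The offset-correction idea discussed above can be illustrated with a small sketch. This is not Lucene's actual code: it is a minimal, hypothetical model of the correction table a CharStream maintains, which Tokenizer.correctOffset() consults to map offsets in the filtered text back to offsets in the original input.

```python
# Hypothetical sketch of CharStream-style offset correction; not Lucene code.
# A char filter that deletes or shrinks text records, for positions in the
# *filtered* text, how far offsets must shift to land in the original input.

class OffsetCorrector:
    def __init__(self):
        # (filtered_position, cumulative_shift) pairs, kept in ascending order
        self._corrections = []

    def add_correction(self, filtered_pos, cumulative_shift):
        self._corrections.append((filtered_pos, cumulative_shift))

    def correct_offset(self, offset):
        # apply the last correction recorded at or before this offset
        shift = 0
        for pos, s in self._corrections:
            if pos <= offset:
                shift = s
            else:
                break
        return offset + shift

# Example: a mapping rule "NULL" => "" turns "NULLabc" into "abc",
# so every offset in the filtered text shifts by +4 in the original.
corrector = OffsetCorrector()
corrector.add_correction(0, 4)
assert corrector.correct_offset(0) == 4   # 'a' was at original offset 4
assert corrector.correct_offset(2) == 6   # 'c' was at original offset 6
```

A tokenizer built on this model would wrap each token's start and end through correct_offset() before storing them, which is exactly the step a highlighter depends on to avoid out-of-range offsets.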
[jira] Updated: (SOLR-1423) Lucene 2.9 RC4 may need some changes in Solr Analyzers using CharStream & others
[ https://issues.apache.org/jira/browse/SOLR-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated SOLR-1423: - Affects Version/s: 1.4 Fix Version/s: 1.4 Assignee: Koji Sekiguchi I'd like to check it before the 1.4 release. I'll look into it once RC4 is checked into Solr. > Lucene 2.9 RC4 may need some changes in Solr Analyzers using CharStream & > others > > > Key: SOLR-1423 > URL: https://issues.apache.org/jira/browse/SOLR-1423 > Project: Solr > Issue Type: Task > Components: Analysis >Affects Versions: 1.4 >Reporter: Uwe Schindler >Assignee: Koji Sekiguchi > Fix For: 1.4 > > > Because of some backwards compatibility problems (LUCENE-1906) we changed the > CharStream/CharFilter API a little bit. Tokenizer now only has an input field > of type java.io.Reader (as before the CharStream code). To correct offsets, > it is now necessary to call the Tokenizer.correctOffset(int) method, which > delegates to the CharStream (if input is a subclass of CharStream), else > returns an uncorrected offset. Normally it is enough to change all occurrences > of input.correctOffset() to this.correctOffset() in Tokenizers. It should > also be checked whether custom Tokenizers in Solr correct their offsets. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1404) Random failures with highlighting
[ https://issues.apache.org/jira/browse/SOLR-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753686#action_12753686 ] Koji Sekiguchi commented on SOLR-1404: -- bq. A better fix, perhaps, would be implementing reset(CharStream input) in CharTokenizer in Lucene. Will LUCENE-1906 fix it (in an alternate way)? > Random failures with highlighting > - > > Key: SOLR-1404 > URL: https://issues.apache.org/jira/browse/SOLR-1404 > Project: Solr > Issue Type: Bug > Components: Analysis, highlighter >Affects Versions: 1.4 >Reporter: Anders Melchiorsen > Fix For: 1.4 > > Attachments: SOLR-1404.patch > > > With a recent Solr nightly, we started getting errors when highlighting. > I have not been able to reduce our real setup to a minimal one that is > failing, but the same error seems to pop up with the configuration below. > Note that the QUERY will mostly fail, but it will work sometimes. Notably, > after running "java -jar start.jar", the QUERY will work the first time, but > then start failing for a while. Seems that something is not being reset > properly. > The example uses the deprecated HTMLStripWhitespaceTokenizerFactory but the > problem apparently also exists with other tokenizers; I was just unable to > create a minimal example with other configurations. 
> SCHEMA > > > > > > > > > > > > > > > id > > INDEX > URL=http://localhost:8983/solr/update > curl $URL --data-binary '1 name="test">test' -H 'Content-type:text/xml; > charset=utf-8' > curl $URL --data-binary '' -H 'Content-type:text/xml; charset=utf-8' > QUERY > curl 'http://localhost:8983/solr/select/?hl.fl=test&hl=true&q=id:1' > ERROR > org.apache.lucene.search.highlight.InvalidTokenOffsetsException: Token test > exceeds length of provided text sized 4 > org.apache.solr.common.SolrException: > org.apache.lucene.search.highlight.InvalidTokenOffsetsException: Token test > exceeds length of provided text sized 4 > at > org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:328) > at > org.apache.solr.handler.component.HighlightComponent.process(HighlightComponent.java:89) > at > org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195) > at > org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131) > at org.apache.solr.core.SolrCore.execute(SolrCore.java:1299) > at > org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338) > at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241) > at > org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089) > at > org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365) > at > org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) > at > org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181) > at > org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712) > at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405) > at > org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211) > at > org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114) > at > 
org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139) > at org.mortbay.jetty.Server.handle(Server.java:285) > at > org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502) > at > org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821) > at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513) > at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208) > at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378) > at > org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226) > at > org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442) > Caused by: org.apache.lucene.search.highlight.InvalidTokenOffsetsException: > Token test exceeds length of provided text sized 4 > at > org.apache.lucene.search.highlight.Highlighter.getBestTextFragments(Highlighter.java:254) > at > org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:321) > ... 23 more -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
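The failure mode above (token offsets exceeding the length of the current text, and only after the first request) is consistent with tokenizer state surviving reuse, which is what the proposed reset(CharStream) fix targets. A minimal sketch in plain Java, with no Lucene dependency and invented class names, of how a running offset that is not cleared in reset() produces exactly this symptom:

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch (no Lucene; names invented) of a reusable tokenizer whose
// running offset must be cleared in reset(). If a caller reuses the instance
// without calling reset(), token offsets from the previous input leak into
// the next one and can exceed the new text's length, as in the report above.
class ReusableWhitespaceTokenizer {
    private int charsConsumed = 0; // running offset across inputs

    /** Clear per-stream state before reuse; skipping this reproduces the bug. */
    void reset() {
        charsConsumed = 0;
    }

    /** Returns [start, end) offsets of whitespace-separated tokens. */
    List<int[]> tokenize(String text) {
        List<int[]> offsets = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            while (i < text.length() && Character.isWhitespace(text.charAt(i))) i++;
            int start = i;
            while (i < text.length() && !Character.isWhitespace(text.charAt(i))) i++;
            if (i > start) offsets.add(new int[] { charsConsumed + start, charsConsumed + i });
        }
        charsConsumed += text.length();
        return offsets;
    }
}
```

Analyzing the 4-character value "test" twice without an intervening reset() yields an end offset of 8 on the second pass, matching the "Token test exceeds length of provided text sized 4" error.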
[jira] Resolved: (SOLR-1398) PatternTokenizerFactory ignores offset corrections
[ https://issues.apache.org/jira/browse/SOLR-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi resolved SOLR-1398. -- Resolution: Fixed Committed revision 811753. > PatternTokenizerFactory ignores offset corrections > -- > > Key: SOLR-1398 > URL: https://issues.apache.org/jira/browse/SOLR-1398 > Project: Solr > Issue Type: Bug > Components: Analysis >Affects Versions: 1.4 >Reporter: Anders Melchiorsen >Assignee: Koji Sekiguchi >Priority: Minor > Fix For: 1.4 > > Attachments: SOLR-1398.patch, SOLR-1398.patch > > > I have an analyzer with a MappingCharFilterFactory followed by a > PatternTokenizerFactory. This causes wrong offsets, and thus wrong highlights. > Replacing the tokenizer with WhitespaceTokenizerFactory gives correct > offsets, so I expect the problem to be with PatternTokenizerFactory.
[jira] Updated: (SOLR-1398) PatternTokenizerFactory ignores offset corrections
[ https://issues.apache.org/jira/browse/SOLR-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated SOLR-1398: - Attachment: SOLR-1398.patch A new patch with a test case. Will commit shortly.
[jira] Commented: (SOLR-1398) PatternTokenizerFactory ignores offset corrections
[ https://issues.apache.org/jira/browse/SOLR-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12750369#action_12750369 ] Koji Sekiguchi commented on SOLR-1398: -- Anders, thank you for testing the patch and reporting the result. Yes, I think the error is a separate issue. Can you show the procedure (schema.xml, indexed data and request parameters) to reproduce the error? I tried to index "G&uuml;nther G&uuml;nther is here" and search "Günther", but I got a highlighted result successfully.
[jira] Updated: (SOLR-1398) PatternTokenizerFactory ignores offset corrections
[ https://issues.apache.org/jira/browse/SOLR-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated SOLR-1398: - Priority: Minor (was: Major) Fix Version/s: 1.4 Assignee: Koji Sekiguchi
[jira] Updated: (SOLR-1398) PatternTokenizerFactory ignores offset corrections
[ https://issues.apache.org/jira/browse/SOLR-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated SOLR-1398: - Attachment: SOLR-1398.patch Anders, can you apply the patch and see the highlighted result?
[jira] Commented: (SOLR-1398) PatternTokenizerFactory ignores offset corrections
[ https://issues.apache.org/jira/browse/SOLR-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749493#action_12749493 ] Koji Sekiguchi commented on SOLR-1398: -- Anders, thank you for reporting the problem. Can you show a concrete case so I can reproduce the problem?
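For context on the mechanics behind this thread: a CharFilter such as the one MappingCharFilterFactory produces rewrites the character stream before tokenization, so a tokenizer that reports offsets into the rewritten stream must map them back through the filter's offset correction, or highlights land on the wrong characters. A rough plain-Java sketch (illustrative names, not the real Lucene API) of that correction for an entity-replacing filter:

```java
import java.util.TreeMap;

// Plain-Java sketch (names invented) of offset correction. A filter that
// rewrites the 6-character entity "&uuml;" to the single character "ü"
// shrinks the text, so offsets into the filtered text must be shifted back
// to positions in the original input before they are stored for highlighting.
class OffsetCorrectingFilter {
    final String filtered;
    // filtered-text offset -> cumulative characters removed before that offset
    private final TreeMap<Integer, Integer> corrections = new TreeMap<>();

    OffsetCorrectingFilter(String original, String entity, String replacement) {
        StringBuilder out = new StringBuilder();
        int removed = 0, i = 0;
        corrections.put(0, 0);
        while (i < original.length()) {
            if (original.startsWith(entity, i)) {
                out.append(replacement);
                removed += entity.length() - replacement.length();
                i += entity.length();
                corrections.put(out.length(), removed);
            } else {
                out.append(original.charAt(i++));
            }
        }
        filtered = out.toString();
    }

    /** Map an offset in the filtered text back to the original input. */
    int correctOffset(int filteredOffset) {
        return filteredOffset + corrections.floorEntry(filteredOffset).getValue();
    }
}
```

For the input "G&uuml;nther" the filtered text is "Günther" (7 chars); the end of the token at filtered offset 7 corrects back to offset 12 in the original, which is what a highlighter needs. A tokenizer that skips correctOffset() reports 7 and produces misplaced highlights.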
[jira] Resolved: (SOLR-1370) call CharFilters in FieldAnalysisRequestHandler
[ https://issues.apache.org/jira/browse/SOLR-1370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi resolved SOLR-1370. -- Resolution: Fixed Thanks Erik! Committed revision 805880. > call CharFilters in FieldAnalysisRequestHandler > --- > > Key: SOLR-1370 > URL: https://issues.apache.org/jira/browse/SOLR-1370 > Project: Solr > Issue Type: Bug > Components: Analysis >Affects Versions: 1.4 >Reporter: Koji Sekiguchi >Assignee: Koji Sekiguchi >Priority: Minor > Fix For: 1.4 > > Attachments: SOLR-1370.patch > > > Currently, FieldAnalysisRequestHandler doesn't call CharFilters even if > CharFilters are defined for the fields.
[jira] Updated: (SOLR-1370) call CharFilters in FieldAnalysisRequestHandler
[ https://issues.apache.org/jira/browse/SOLR-1370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated SOLR-1370: - Fix Version/s: 1.4
[jira] Assigned: (SOLR-1370) call CharFilters in FieldAnalysisRequestHandler
[ https://issues.apache.org/jira/browse/SOLR-1370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi reassigned SOLR-1370: Assignee: Koji Sekiguchi
[jira] Updated: (SOLR-1370) call CharFilters in FieldAnalysisRequestHandler
[ https://issues.apache.org/jira/browse/SOLR-1370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated SOLR-1370: - Attachment: SOLR-1370.patch The fix and test code.
[jira] Created: (SOLR-1370) call CharFilters in FieldAnalysisRequestHandler
call CharFilters in FieldAnalysisRequestHandler --- Key: SOLR-1370 URL: https://issues.apache.org/jira/browse/SOLR-1370 Project: Solr Issue Type: Bug Components: Analysis Affects Versions: 1.4 Reporter: Koji Sekiguchi Priority: Minor Currently, FieldAnalysisRequestHandler doesn't call CharFilters even if CharFilters are defined for the fields.
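The fix this issue asks for amounts to running the raw field value through the configured CharFilters before the tokenizer sees it, so the analysis output the handler displays matches what indexing actually produces. A minimal plain-Java sketch of that ordering (illustrative names, not the real Solr API):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

// Plain-Java sketch (names invented, not the real Solr API). CharFilters
// transform the raw field value first; only then does the tokenizer run on
// the filtered text. An analysis handler that skips the CharFilter step
// would show tokens that differ from what is actually indexed.
class AnalysisChain {
    private final List<UnaryOperator<String>> charFilters = new ArrayList<>();

    AnalysisChain addCharFilter(UnaryOperator<String> filter) {
        charFilters.add(filter);
        return this;
    }

    /** Apply every CharFilter in order, then tokenize on whitespace. */
    List<String> analyze(String raw) {
        String text = raw;
        for (UnaryOperator<String> f : charFilters) {
            text = f.apply(text); // the step the patch adds to the handler
        }
        return Arrays.asList(text.trim().split("\\s+"));
    }
}
```

With a hypothetical entity-mapping CharFilter registered, analyzing "G&uuml;nther is here" yields the token "Günther", not "G&uuml;nther", which is what the indexed terms actually look like.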
[jira] Resolved: (SOLR-1347) StatsComponent throws error for single-valued numeric fields.
[ https://issues.apache.org/jira/browse/SOLR-1347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi resolved SOLR-1347. -- Resolution: Invalid > StatsComponent throws error for single-valued numeric fields. > - > > Key: SOLR-1347 > URL: https://issues.apache.org/jira/browse/SOLR-1347 > Project: Solr > Issue Type: Bug > Components: search >Affects Versions: 1.4 > Environment: MAC OSX >Reporter: sumit biyani > > Hi, > StatsComponent is throwing an incorrect error when running the query below on the sample data. > http://localhost:8983/solr/select?q=*:*&rows=0&indent=true&stats=on&stats.field=price > HTTP ERROR: 400 > Stats are valid for single valued numeric values. not: > price[float{class=org.apache.solr.schema.TrieFloatField,analyzer=org.apache.solr.analysis.TokenizerChain,args={precisionStep=0, > positionIncrementGap=0, omitNorms=true}}] > Here, price is a single-valued float field. > > I also tried changing the type to "pfloat", but it gave an error in the parseDouble > method. > This was run using the Solr nightly build @ 07-Aug. > Please check and let me know if I am missing something here. > Thanks & Regards, > Sumit.
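For what the rejected stats.field request was asking for: over a single-valued numeric field each document contributes exactly one value, so the per-field aggregates a stats request returns (min, max, sum, mean) are well defined. A toy plain-Java sketch with invented sample prices, not the actual StatsComponent code:

```java
// Toy illustration (names and numbers invented) of the aggregates a stats
// request computes over a single-valued numeric field such as "price".
public class StatsSketch {
    /** Returns { min, max, sum, mean } for one value per document. */
    static double[] stats(double[] values) {
        double min = Double.POSITIVE_INFINITY, max = Double.NEGATIVE_INFINITY, sum = 0;
        for (double v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
            sum += v;
        }
        return new double[] { min, max, sum, sum / values.length };
    }
}
```

A multi-valued or non-numeric field has no single value per document, which is why the component rejects such fields up front.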