[jira] Updated: (SOLR-1826) highlighting breaks when using WordDelimiterFilter and setting termOffsets=true
[ https://issues.apache.org/jira/browse/SOLR-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Oestreicher updated SOLR-1826: - Attachment: SOLR-1826.txt updated the patch because I borked the indentation > highlighting breaks when using WordDelimiterFilter and setting > termOffsets=true > --- > > Key: SOLR-1826 > URL: https://issues.apache.org/jira/browse/SOLR-1826 > Project: Solr > Issue Type: Bug > Components: highlighter >Affects Versions: 1.4 >Reporter: Stefan Oestreicher > Attachments: SOLR-1826.txt, SOLR-1826.txt, SOLR-1826.txt > > > When using the WordDelimiterFilter and setting termOffsets to true the > highlighting breaks in certain cases. This did not happen in the 1.3 release. > For example, if I index the term "PowerShot.com" and search for {{pow*}} the > highlighting snippet contains {{PowerPowerShot.com}}. > I will attach a patch which adds tests to the highlighter unittest to > demonstrate the issue. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1826) highlighting breaks when using WordDelimiterFilter and setting termOffsets=true
[ https://issues.apache.org/jira/browse/SOLR-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Oestreicher updated SOLR-1826: - Attachment: SOLR-1826.txt I just realised that the field type definition in my patch is unnecessary. I removed it and set the termOffsets attribute directly for the field. > highlighting breaks when using WordDelimiterFilter and setting > termOffsets=true > --- > > Key: SOLR-1826 > URL: https://issues.apache.org/jira/browse/SOLR-1826 > Project: Solr > Issue Type: Bug > Components: highlighter >Affects Versions: 1.4 >Reporter: Stefan Oestreicher > Attachments: SOLR-1826.txt, SOLR-1826.txt > > > When using the WordDelimiterFilter and setting termOffsets to true the > highlighting breaks in certain cases. This did not happen in the 1.3 release. > For example, if I index the term "PowerShot.com" and search for {{pow*}} the > highlighting snippet contains {{PowerPowerShot.com}}. > I will attach a patch which adds tests to the highlighter unittest to > demonstrate the issue. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1826) highlighting breaks when using WordDelimiterFilter and setting termOffsets=true
[ https://issues.apache.org/jira/browse/SOLR-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Oestreicher updated SOLR-1826: - Attachment: SOLR-1826.txt attached patch demonstrates the problem > highlighting breaks when using WordDelimiterFilter and setting > termOffsets=true > --- > > Key: SOLR-1826 > URL: https://issues.apache.org/jira/browse/SOLR-1826 > Project: Solr > Issue Type: Bug > Components: highlighter >Affects Versions: 1.4 >Reporter: Stefan Oestreicher > Attachments: SOLR-1826.txt > > > When using the WordDelimiterFilter and setting termOffsets to true the > highlighting breaks in certain cases. This did not happen in the 1.3 release. > For example, if I index the term "PowerShot.com" and search for {{pow*}} the > highlighting snippet contains {{PowerPowerShot.com}}. > I will attach a patch which adds tests to the highlighter unittest to > demonstrate the issue. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (SOLR-1826) highlighting breaks when using WordDelimiterFilter and setting termOffsets=true
highlighting breaks when using WordDelimiterFilter and setting termOffsets=true --- Key: SOLR-1826 URL: https://issues.apache.org/jira/browse/SOLR-1826 Project: Solr Issue Type: Bug Components: highlighter Affects Versions: 1.4 Reporter: Stefan Oestreicher Attachments: SOLR-1826.txt When using the WordDelimiterFilter and setting termOffsets to true the highlighting breaks in certain cases. This did not happen in the 1.3 release. For example, if I index the term "PowerShot.com" and search for {{pow*}} the highlighting snippet contains {{PowerPowerShot.com}}. I will attach a patch which adds tests to the highlighter unittest to demonstrate the issue. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-806) improve m2-deploy tasks authentication support
[ https://issues.apache.org/jira/browse/SOLR-806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12638248#action_12638248 ] Stefan Oestreicher commented on SOLR-806: - Well ... I don't :) When I opened this issue I wasn't aware that snapshots are available. I guess this issue can be closed then. Sorry. > improve m2-deploy tasks authentication support > -- > > Key: SOLR-806 > URL: https://issues.apache.org/jira/browse/SOLR-806 > Project: Solr > Issue Type: Improvement >Affects Versions: 1.3 >Reporter: Stefan Oestreicher >Priority: Trivial > > The m2-deploy task uses the authentication element with the username and > privateKey attribute to set the user credentials. Unfortunately the > privateKey attribute is only applicable for ssh connections. > Quote from http://maven.apache.org/ant-tasks.html: > bq. It accepts the attributes username, password, and for SSH based > repositories privateKey and passphrase. > Therefore authentication fails for non-ssh connections. I worked around that > by using the "password" attribute instead of privateKey. However I'd prefer > not having to modify the build file. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (SOLR-806) improve m2-deploy tasks authentication support
improve m2-deploy tasks authentication support -- Key: SOLR-806 URL: https://issues.apache.org/jira/browse/SOLR-806 Project: Solr Issue Type: Improvement Affects Versions: 1.3 Reporter: Stefan Oestreicher Priority: Trivial The m2-deploy task uses the authentication element with the username and privateKey attribute to set the user credentials. Unfortunately the privateKey attribute is only applicable for ssh connections. Quote from http://maven.apache.org/ant-tasks.html: bq. It accepts the attributes username, password, and for SSH based repositories privateKey and passphrase. Therefore authentication fails for non-ssh connections. I worked around that by using the "password" attribute instead of privateKey. However I'd prefer not having to modify the build file. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-751) WordDelimiterFilter doesn't adjust startOffset
[ https://issues.apache.org/jira/browse/SOLR-751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Oestreicher updated SOLR-751: Attachment: SOLR-751.patch I found the problem and updated the patch with more tests and a fix. It would be great if this could make it into 1.3, since highlighting breaks because of it. > WordDelimiterFilter doesn't adjust startOffset > -- > > Key: SOLR-751 > URL: https://issues.apache.org/jira/browse/SOLR-751 > Project: Solr > Issue Type: Bug >Affects Versions: 1.3, 1.4 >Reporter: Stefan Oestreicher > Attachments: SOLR-751.patch, SOLR-751.patch > > > If the first character of a token gets stripped the startOffset of that token > is not adjusted. With the last character it behaves as expected. I'll attach > a patch for the TestWordDelimiterFilter testcase which reproduces that issue > shortly.
[jira] Created: (SOLR-751) WordDelimiterFilter doesn't adjust startOffset
WordDelimiterFilter doesn't adjust startOffset -- Key: SOLR-751 URL: https://issues.apache.org/jira/browse/SOLR-751 Project: Solr Issue Type: Bug Affects Versions: 1.3, 1.4 Reporter: Stefan Oestreicher Attachments: SOLR-751.patch If the first character of a token gets stripped the startOffset of that token is not adjusted. With the last character it behaves as expected. I'll attach a patch for the TestWordDelimiterFilter testcase which reproduces that issue shortly. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
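The behaviour the report expects can be illustrated with a small, self-contained sketch. This is not the WordDelimiterFilter code and not the attached patch: `OffsetTrimSketch`, `Token`, and `trim` are hypothetical names, and treating every character that is not a letter or digit as a delimiter is an assumption made only to keep the example short. The point is that stripping leading characters has to move startOffset forward, just as stripping trailing characters moves endOffset back.

```java
final class OffsetTrimSketch {

    /** Hypothetical token: text plus [startOffset, endOffset) into the original input. */
    static final class Token {
        final String text;
        final int startOffset;
        final int endOffset;

        Token(String text, int startOffset, int endOffset) {
            this.text = text;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
        }
    }

    /** Strips leading and trailing delimiters and adjusts BOTH offsets accordingly. */
    static Token trim(Token t) {
        int len = t.text.length();

        // Count delimiter characters at the start of the token.
        int lead = 0;
        while (lead < len && !Character.isLetterOrDigit(t.text.charAt(lead))) {
            lead++;
        }

        // Count delimiter characters at the end, without crossing the leading run.
        int trail = 0;
        while (trail < len - lead && !Character.isLetterOrDigit(t.text.charAt(len - 1 - trail))) {
            trail++;
        }

        return new Token(
                t.text.substring(lead, len - trail),
                t.startOffset + lead,   // the adjustment the report says is missing
                t.endOffset - trail);   // the adjustment the report says already works
    }

    public static void main(String[] args) {
        Token t = trim(new Token("-foo-", 5, 10));
        System.out.println(t.text + " [" + t.startOffset + "," + t.endOffset + ")");
    }
}
```

With the input token "-foo-" at offsets 5 to 10, the trimmed token is "foo" at offsets 6 to 9. A highlighter that relies on these offsets would otherwise mark one character too many on the left.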
[jira] Updated: (SOLR-751) WordDelimiterFilter doesn't adjust startOffset
[ https://issues.apache.org/jira/browse/SOLR-751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Oestreicher updated SOLR-751: Attachment: SOLR-751.patch > WordDelimiterFilter doesn't adjust startOffset > -- > > Key: SOLR-751 > URL: https://issues.apache.org/jira/browse/SOLR-751 > Project: Solr > Issue Type: Bug >Affects Versions: 1.3, 1.4 >Reporter: Stefan Oestreicher > Attachments: SOLR-751.patch > > > If the first character of a token gets stripped the startOffset of that token > is not adjusted. With the last character it behaves as expected. I'll attach > a patch for the TestWordDelimiterFilter testcase which reproduces that issue > shortly. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-704) improve NumberFormatTransformers handling of invalid input
[ https://issues.apache.org/jira/browse/SOLR-704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Oestreicher updated SOLR-704: Attachment: SOLR-704.patch I updated the patch according to the latest changes in trunk. This is a diff against trunk 686186. I also added another test to demonstrate that "124,,789" or "123,45" parse without failure (with and without the fix). > improve NumberFormatTransformers handling of invalid input > -- > > Key: SOLR-704 > URL: https://issues.apache.org/jira/browse/SOLR-704 > Project: Solr > Issue Type: Improvement > Components: contrib - DataImportHandler >Reporter: Stefan Oestreicher >Priority: Minor > Attachments: SOLR-704.patch, SOLR-704.patch > > > As described in > http://www.ibm.com/developerworks/java/library/j-numberformat/ instances of > the NumberFormat class don't throw ParseExceptions in all cases of invalid > input. This affects the > org.apache.solr.handler.dataimport.NumberFormatTransformer class which relies > on NumberFormat. I will attach a patch to demonstrate and fix this issue > shortly. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-704) improve NumberFormatTransformers handling of invalid input
[ https://issues.apache.org/jira/browse/SOLR-704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Oestreicher updated SOLR-704: Attachment: SOLR-704.patch This is a diff against r686094 > improve NumberFormatTransformers handling of invalid input > -- > > Key: SOLR-704 > URL: https://issues.apache.org/jira/browse/SOLR-704 > Project: Solr > Issue Type: Improvement > Components: contrib - DataImportHandler >Reporter: Stefan Oestreicher >Priority: Minor > Attachments: SOLR-704.patch > > > As described in > http://www.ibm.com/developerworks/java/library/j-numberformat/ instances of > the NumberFormat class don't throw ParseExceptions in all cases of invalid > input. This affects the > org.apache.solr.handler.dataimport.NumberFormatTransformer class which relies > on NumberFormat. I will attach a patch to demonstrate and fix this issue > shortly. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (SOLR-704) improve NumberFormatTransformers handling of invalid input
improve NumberFormatTransformers handling of invalid input -- Key: SOLR-704 URL: https://issues.apache.org/jira/browse/SOLR-704 Project: Solr Issue Type: Improvement Components: contrib - DataImportHandler Reporter: Stefan Oestreicher Priority: Minor As described in http://www.ibm.com/developerworks/java/library/j-numberformat/ instances of the NumberFormat class don't throw ParseExceptions in all cases of invalid input. This affects the org.apache.solr.handler.dataimport.NumberFormatTransformer class which relies on NumberFormat. I will attach a patch to demonstrate and fix this issue shortly. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
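A minimal sketch of the problem described above, and of one common way around it. It uses only `java.text.NumberFormat` and `java.text.ParsePosition` from the JDK; the class name `StrictNumberParseSketch` and the method `parseStrict` are hypothetical, and the approach may differ from what the attached patch does. `NumberFormat.parse(String)` stops at the first character it cannot use and returns what it has parsed so far, so trailing garbage goes unnoticed unless the parse position is checked against the input length.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.text.ParsePosition;
import java.util.Locale;

final class StrictNumberParseSketch {

    /** Parses the whole input or throws; partial matches are rejected. */
    static Number parseStrict(String input, Locale locale) throws ParseException {
        NumberFormat format = NumberFormat.getInstance(locale);
        ParsePosition pos = new ParsePosition(0);

        // This overload never throws; it returns null if nothing could be parsed.
        Number result = format.parse(input, pos);

        if (result == null) {
            throw new ParseException("Unparseable number: \"" + input + "\"",
                    Math.max(0, pos.getErrorIndex()));
        }
        if (pos.getIndex() != input.length()) {
            // Something was parsed, but unconsumed characters remain.
            throw new ParseException("Trailing characters in: \"" + input + "\"",
                    pos.getIndex());
        }
        return result;
    }

    public static void main(String[] args) throws ParseException {
        // Lenient parsing silently accepts the leading digits.
        System.out.println(NumberFormat.getInstance(Locale.US).parse("123abc"));

        try {
            parseStrict("123abc", Locale.US);
        } catch (ParseException e) {
            System.out.println("rejected at index " + e.getErrorOffset());
        }
    }
}
```

Note that this check only catches unconsumed input. Strings that the formatter consumes completely are still accepted.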
[jira] Updated: (SOLR-701) DateField implicitly uses default locale to format milliseconds
[ https://issues.apache.org/jira/browse/SOLR-701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Oestreicher updated SOLR-701: Description: The ISO8601CanonicalDateFormat class which is defined inside the DateField class doesn't specify the US locale when creating the formatter used for milliseconds. Because of that the following unit tests fail when run with a locale other than English: org.apache.solr.schema.LegacyDateFieldTest org.apache.solr.schema.DateFieldTest The attached patch resolves this issue. was: The following unit tests fail if they're run with a locale using "," as decimal separator (e.g. German): org.apache.solr.schema.LegacyDateFieldTest org.apache.solr.schema.DateFieldTest org.apache.solr.handler.dataimport.TestNumberFormatTransformer The DateField tests fail because the expected decimal format uses "." as separator but the actual result is formatted using the "," separator. This happens because the ISO8601CanonicalDateFormat class which is defined inside the DateField class doesn't specify the US locale when creating the formatter used for milliseconds. Patch: schema.DateField-locale.patch.txt see SOLR-700 for a resolution to the TestNumberFormatTransformer failure. Affects Version/s: (was: 1.3) Summary: DateField implicitly uses default locale to format milliseconds (was: test failures because system locale is used) clarify subject and description > DateField implicitly uses default locale to format milliseconds > --- > > Key: SOLR-701 > URL: https://issues.apache.org/jira/browse/SOLR-701 > Project: Solr > Issue Type: Bug >Reporter: Stefan Oestreicher >Priority: Minor > Attachments: schema.DateField-locale.patch.txt > > > The ISO8601CanonicalDateFormat class which is defined inside the DateField > class doesn't specify the US locale when creating the formatter used for > milliseconds. 
Because of that the following unit tests fail when run with > a locale other than English: > org.apache.solr.schema.LegacyDateFieldTest > org.apache.solr.schema.DateFieldTest > The attached patch resolves this issue.
[jira] Commented: (SOLR-700) NumberFormatTransformer should have configurable locales
[ https://issues.apache.org/jira/browse/SOLR-700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12622709#action_12622709 ] Stefan Oestreicher commented on SOLR-700: - bq. We don't need to be paranoid about this [...] Ok. bq. Reading through the docs, another area I'm becoming concerned about is partial parsing of the string I see. Interestingly this behaviour of accepting invalid strings seems to only apply to the generic formatter. I wrote some tests and one of them uses a percent formatter (I also tested currency) and that throws a ParseException. Fixing that seems to be even easier than shown in the example :) I already have a patch ready but I guess I should open another issue for this (seems pretty separate to me)? > NumberFormatTransformer should have configurable locales > > > Key: SOLR-700 > URL: https://issues.apache.org/jira/browse/SOLR-700 > Project: Solr > Issue Type: Improvement > Components: contrib - DataImportHandler >Affects Versions: 1.3 >Reporter: Stefan Oestreicher >Assignee: Shalin Shekhar Mangar >Priority: Minor > Fix For: 1.3 > > Attachments: > handler.dataimport.NumberFormatTransformer-locale.patch..txt, > handler.dataimport.NumberFormatTransformer-locale.patch..txt, > handler.dataimport.NumberFormatTransformer-locale.patch..txt, > handler.dataimport.NumberFormatTransformer-locale.patch..txt, > handler.dataimport.TestNumberFormatTransformer-locale.patch.txt, > schema.DateField-locale.patch.txt, SOLR-700.patch > > > NumberFormatTransformer uses a NumberFormatter which relies on the system > locale. This makes it impossible to use NumberFormatTransformer with data > whose locale does not match the system locale. > TestNumberFormatTransformer fails on some locales for similar reasons because > the grouping symbol differs in different locales. > This issue adds a locale attribute for NumberFormatTransformer which allows > the user to specify the locale which should be used for formatting. 
The > locale must be specified as language-country e.g. en-US > {code:xml} > > {code}
[jira] Commented: (SOLR-700) NumberFormatTransformer should have configurable locales
[ https://issues.apache.org/jira/browse/SOLR-700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12622603#action_12622603 ] Stefan Oestreicher commented on SOLR-700: - Nice. I also stumbled upon the fact that the java docs don't say anything about invalid locales. I quickly tested that and interestingly the DateFormatSymbols instance returned "." as decimal and "," as grouping separator. I would've expected the default locale (which is de-AT in my case) to be used but obviously it did not. In any case the getISO3Country method of the Locale class (among others) throws an exception if it's invoked on an invalid locale. Maybe it would be best to check all available locales explicitly instead of relying on the regex, possibly by constructing a static HashMap of them?! > NumberFormatTransformer should have configurable locales > > > Key: SOLR-700 > URL: https://issues.apache.org/jira/browse/SOLR-700 > Project: Solr > Issue Type: Improvement > Components: contrib - DataImportHandler >Affects Versions: 1.3 >Reporter: Stefan Oestreicher >Assignee: Shalin Shekhar Mangar >Priority: Minor > Fix For: 1.3 > > Attachments: > handler.dataimport.NumberFormatTransformer-locale.patch..txt, > handler.dataimport.NumberFormatTransformer-locale.patch..txt, > handler.dataimport.NumberFormatTransformer-locale.patch..txt, > handler.dataimport.NumberFormatTransformer-locale.patch..txt, > handler.dataimport.TestNumberFormatTransformer-locale.patch.txt, > schema.DateField-locale.patch.txt, SOLR-700.patch > > > NumberFormatTransformer uses a NumberFormatter which relies on the system > locale. This makes it impossible to use NumberFormatTransformer with data > whose locale does not match the system locale. > TestNumberFormatTransformer fails on some locales for similar reasons because > the grouping symbol differs in different locales. 
> This issue adds a locale attribute for NumberFormatTransformer which allows > the user to specify the locale which should be used for formatting. The > locale must be specified as language-country e.g. en-US > {code:xml} > > {code}
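The idea floated in the comment above, checking a user-supplied locale against the set of available locales instead of relying on a regex alone, could look roughly like the sketch below. It is not the code from any of the attached patches: `LocaleAttributeSketch` and `parse` are hypothetical names, a `HashSet` is used where the comment mentions a static HashMap, and the `en-US` style input format is taken from the issue description. Only JDK classes are used.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.IllformedLocaleException;
import java.util.Locale;
import java.util.Set;

final class LocaleAttributeSketch {

    // Built once; Locale.getAvailableLocales() returns the locales installed in this JVM.
    private static final Set<Locale> AVAILABLE =
            new HashSet<>(Arrays.asList(Locale.getAvailableLocales()));

    /** Turns a value such as "en-US" into a Locale, rejecting unknown locales. */
    static Locale parse(String value) {
        String[] parts = value.split("-");
        if (parts.length != 2) {
            throw new IllegalArgumentException(
                    "Expected language-country, e.g. en-US, but got: " + value);
        }

        Locale locale;
        try {
            locale = new Locale.Builder()
                    .setLanguage(parts[0])
                    .setRegion(parts[1])
                    .build();
        } catch (IllformedLocaleException e) {
            // Syntactically invalid language or region code.
            throw new IllegalArgumentException("Malformed locale: " + value, e);
        }

        if (!AVAILABLE.contains(locale)) {
            // Well-formed, but not a locale this JVM knows about.
            throw new IllegalArgumentException("Unsupported locale: " + value);
        }
        return locale;
    }

    public static void main(String[] args) {
        System.out.println(parse("en-US"));
        try {
            parse("xx-YY");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Failing fast here means a misspelled locale is reported to the user instead of silently falling back to some other formatting rules.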
[jira] Commented: (SOLR-700) test failures because of wrong/unexpected locale
[ https://issues.apache.org/jira/browse/SOLR-700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12622564#action_12622564 ] Stefan Oestreicher commented on SOLR-700: - done: SOLR-701. > test failures because of wrong/unexpected locale > > > Key: SOLR-700 > URL: https://issues.apache.org/jira/browse/SOLR-700 > Project: Solr > Issue Type: Bug > Components: contrib - DataImportHandler >Affects Versions: 1.3 >Reporter: Stefan Oestreicher >Priority: Minor > Attachments: > handler.dataimport.NumberFormatTransformer-locale.patch..txt, > handler.dataimport.NumberFormatTransformer-locale.patch..txt, > handler.dataimport.NumberFormatTransformer-locale.patch..txt, > handler.dataimport.NumberFormatTransformer-locale.patch..txt, > handler.dataimport.TestNumberFormatTransformer-locale.patch.txt, > schema.DateField-locale.patch.txt > > > The following unit tests fail if they're run with a german locale: > org.apache.solr.schema.LegacyDateFieldTest > org.apache.solr.schema.DateFieldTest > org.apache.solr.handler.dataimport.TestNumberFormatTransformer > The DateField tests fail because the expected decimal format uses "." as > separator but the actual result is formatted using the "," separator. > This happens because the ISO8601CanonicalDateFormat class which is defined > inside the DateField class doesn't specify the US locale when creating the > formatter used for milliseconds. > Patch: schema.DateField-locale.patch.txt > TestNumberFormatTransformer failed because > handler.dataimport.NumberFormatTransformer uses a NumberFormatter which > relies on the system locale. But I think in this case it's intentional, so I > modified the test case to use the grouping separator of the system locale. > Patch: handler.dataimport.TestNumberFormatTransformer-locale.patch.txt -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-701) test failures because system locale is used
[ https://issues.apache.org/jira/browse/SOLR-701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Oestreicher updated SOLR-701: Attachment: schema.DateField-locale.patch.txt > test failures because system locale is used > --- > > Key: SOLR-701 > URL: https://issues.apache.org/jira/browse/SOLR-701 > Project: Solr > Issue Type: Bug >Affects Versions: 1.3 >Reporter: Stefan Oestreicher >Priority: Minor > Attachments: schema.DateField-locale.patch.txt > > > The following unit tests fail if they're run with a locale using "," as > decimal separator (e.g. german): > org.apache.solr.schema.LegacyDateFieldTest > org.apache.solr.schema.DateFieldTest > org.apache.solr.handler.dataimport.TestNumberFormatTransformer > The DateField tests fail because the expected decimal format uses "." as > separator but the actual result is formatted using the "," separator. > This happens because the ISO8601CanonicalDateFormat class which is defined > inside the DateField class doesn't specify the US locale when creating the > formatter used for milliseconds. > Patch: schema.DateField-locale.patch.txt > see SOLR-700 for a resolution to the TestNumberFormatTransformer failure. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (SOLR-701) test failures because system locale is used
test failures because system locale is used --- Key: SOLR-701 URL: https://issues.apache.org/jira/browse/SOLR-701 Project: Solr Issue Type: Bug Affects Versions: 1.3 Reporter: Stefan Oestreicher Priority: Minor Attachments: schema.DateField-locale.patch.txt The following unit tests fail if they're run with a locale using "," as decimal separator (e.g. german): org.apache.solr.schema.LegacyDateFieldTest org.apache.solr.schema.DateFieldTest org.apache.solr.handler.dataimport.TestNumberFormatTransformer The DateField tests fail because the expected decimal format uses "." as separator but the actual result is formatted using the "," separator. This happens because the ISO8601CanonicalDateFormat class which is defined inside the DateField class doesn't specify the US locale when creating the formatter used for milliseconds. Patch: schema.DateField-locale.patch.txt see SOLR-700 for a resolution to the TestNumberFormatTransformer failure. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
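The locale dependence described above is easy to reproduce outside Solr. The sketch below is not the DateField code: `MillisFormatLocaleSketch`, `formatFraction`, and the pattern `0.000` are hypothetical and chosen only for illustration. It shows that a `DecimalFormat` built with German symbols emits "," as the decimal separator, while pinning the symbols to `Locale.US` gives the "." the tests expect regardless of the system default.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

final class MillisFormatLocaleSketch {

    /** Formats a fraction with three decimals using the symbols of the given locale. */
    static String formatFraction(double fraction, Locale locale) {
        // Passing explicit symbols avoids the implicit use of Locale.getDefault().
        DecimalFormat format =
                new DecimalFormat("0.000", DecimalFormatSymbols.getInstance(locale));
        return format.format(fraction);
    }

    public static void main(String[] args) {
        System.out.println(formatFraction(0.5, Locale.US));      // 0.500
        System.out.println(formatFraction(0.5, Locale.GERMANY)); // 0,500
    }
}
```

A formatter created with `new DecimalFormat("0.000")` alone behaves like the second call on a machine whose default locale is German, which matches the test failures reported here.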
[jira] Updated: (SOLR-700) test failures because of wrong/unexpected locale
[ https://issues.apache.org/jira/browse/SOLR-700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Oestreicher updated SOLR-700: Attachment: handler.dataimport.NumberFormatTransformer-locale.patch..txt setting the locale to the default locale just to have it changed afterwards if the parameter exists is not very nice I guess, so I changed that as well. Furthermore the user should probably be notified if he supplies an invalid locale, but I'm unsure how to proceed in this case, so I leave this for now as it is. > test failures because of wrong/unexpected locale > > > Key: SOLR-700 > URL: https://issues.apache.org/jira/browse/SOLR-700 > Project: Solr > Issue Type: Bug > Components: contrib - DataImportHandler >Affects Versions: 1.3 >Reporter: Stefan Oestreicher >Priority: Minor > Attachments: > handler.dataimport.NumberFormatTransformer-locale.patch..txt, > handler.dataimport.NumberFormatTransformer-locale.patch..txt, > handler.dataimport.NumberFormatTransformer-locale.patch..txt, > handler.dataimport.NumberFormatTransformer-locale.patch..txt, > handler.dataimport.TestNumberFormatTransformer-locale.patch.txt, > schema.DateField-locale.patch.txt > > > The following unit tests fail if they're run with a german locale: > org.apache.solr.schema.LegacyDateFieldTest > org.apache.solr.schema.DateFieldTest > org.apache.solr.handler.dataimport.TestNumberFormatTransformer > The DateField tests fail because the expected decimal format uses "." as > separator but the actual result is formatted using the "," separator. > This happens because the ISO8601CanonicalDateFormat class which is defined > inside the DateField class doesn't specify the US locale when creating the > formatter used for milliseconds. > Patch: schema.DateField-locale.patch.txt > TestNumberFormatTransformer failed because > handler.dataimport.NumberFormatTransformer uses a NumberFormatter which > relies on the system locale. 
But I think in this case it's intentional, so I > modified the test case to use the grouping separator of the system locale. > Patch: handler.dataimport.TestNumberFormatTransformer-locale.patch.txt
[jira] Updated: (SOLR-700) test failures because of wrong/unexpected locale
[ https://issues.apache.org/jira/browse/SOLR-700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Oestreicher updated SOLR-700: Attachment: handler.dataimport.NumberFormatTransformer-locale.patch..txt minor update using the LOCALE constant instead of literal String. > test failures because of wrong/unexpected locale > > > Key: SOLR-700 > URL: https://issues.apache.org/jira/browse/SOLR-700 > Project: Solr > Issue Type: Bug > Components: contrib - DataImportHandler >Affects Versions: 1.3 >Reporter: Stefan Oestreicher >Priority: Minor > Attachments: > handler.dataimport.NumberFormatTransformer-locale.patch..txt, > handler.dataimport.NumberFormatTransformer-locale.patch..txt, > handler.dataimport.NumberFormatTransformer-locale.patch..txt, > handler.dataimport.TestNumberFormatTransformer-locale.patch.txt, > schema.DateField-locale.patch.txt > > > The following unit tests fail if they're run with a german locale: > org.apache.solr.schema.LegacyDateFieldTest > org.apache.solr.schema.DateFieldTest > org.apache.solr.handler.dataimport.TestNumberFormatTransformer > The DateField tests fail because the expected decimal format uses "." as > separator but the actual result is formatted using the "," separator. > This happens because the ISO8601CanonicalDateFormat class which is defined > inside the DateField class doesn't specify the US locale when creating the > formatter used for milliseconds. > Patch: schema.DateField-locale.patch.txt > TestNumberFormatTransformer failed because > handler.dataimport.NumberFormatTransformer uses a NumberFormatter which > relies on the system locale. But I think in this case it's intentional, so I > modified the test case to use the grouping separator of the system locale. > Patch: handler.dataimport.TestNumberFormatTransformer-locale.patch.txt -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-700) test failures because of wrong/unexpected locale
[ https://issues.apache.org/jira/browse/SOLR-700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Oestreicher updated SOLR-700: Attachment: handler.dataimport.NumberFormatTransformer-locale.patch..txt I see, that's certainly more flexible. I modified the patch. > test failures because of wrong/unexpected locale > > > Key: SOLR-700 > URL: https://issues.apache.org/jira/browse/SOLR-700 > Project: Solr > Issue Type: Bug > Components: contrib - DataImportHandler >Affects Versions: 1.3 >Reporter: Stefan Oestreicher >Priority: Minor > Attachments: > handler.dataimport.NumberFormatTransformer-locale.patch..txt, > handler.dataimport.NumberFormatTransformer-locale.patch..txt, > handler.dataimport.TestNumberFormatTransformer-locale.patch.txt, > schema.DateField-locale.patch.txt > > > The following unit tests fail if they're run with a german locale: > org.apache.solr.schema.LegacyDateFieldTest > org.apache.solr.schema.DateFieldTest > org.apache.solr.handler.dataimport.TestNumberFormatTransformer > The DateField tests fail because the expected decimal format uses "." as > separator but the actual result is formatted using the "," separator. > This happens because the ISO8601CanonicalDateFormat class which is defined > inside the DateField class doesn't specify the US locale when creating the > formatter used for milliseconds. > Patch: schema.DateField-locale.patch.txt > TestNumberFormatTransformer failed because > handler.dataimport.NumberFormatTransformer uses a NumberFormatter which > relies on the system locale. But I think in this case it's intentional, so I > modified the test case to use the grouping separator of the system locale. > Patch: handler.dataimport.TestNumberFormatTransformer-locale.patch.txt -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-700) test failures because of wrong/unexpected locale
[ https://issues.apache.org/jira/browse/SOLR-700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Oestreicher updated SOLR-700: Attachment: handler.dataimport.NumberFormatTransformer-locale.patch..txt Based on Shalin's suggestion to add an attribute for the locale, I prepared another patch implementing those changes. > test failures because of wrong/unexpected locale > > > Key: SOLR-700 > URL: https://issues.apache.org/jira/browse/SOLR-700 > Project: Solr > Issue Type: Bug > Components: contrib - DataImportHandler >Affects Versions: 1.3 >Reporter: Stefan Oestreicher >Priority: Minor > Attachments: > handler.dataimport.NumberFormatTransformer-locale.patch..txt, > handler.dataimport.TestNumberFormatTransformer-locale.patch.txt, > schema.DateField-locale.patch.txt > > > The following unit tests fail if they're run with a German locale: > org.apache.solr.schema.LegacyDateFieldTest > org.apache.solr.schema.DateFieldTest > org.apache.solr.handler.dataimport.TestNumberFormatTransformer > The DateField tests fail because the expected decimal format uses "." as > separator but the actual result is formatted using the "," separator. > This happens because the ISO8601CanonicalDateFormat class which is defined > inside the DateField class doesn't specify the US locale when creating the > formatter used for milliseconds. > Patch: schema.DateField-locale.patch.txt > TestNumberFormatTransformer failed because > handler.dataimport.NumberFormatTransformer uses a NumberFormatter which > relies on the system locale. But I think in this case it's intentional, so I > modified the test case to use the grouping separator of the system locale. > Patch: handler.dataimport.TestNumberFormatTransformer-locale.patch.txt
[jira] Updated: (SOLR-700) test failures because of wrong/unexpected locale
[ https://issues.apache.org/jira/browse/SOLR-700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Oestreicher updated SOLR-700: Attachment: schema.DateField-locale.patch.txt > test failures because of wrong/unexpected locale > > > Key: SOLR-700 > URL: https://issues.apache.org/jira/browse/SOLR-700 > Project: Solr > Issue Type: Bug > Components: contrib - DataImportHandler >Affects Versions: 1.3 >Reporter: Stefan Oestreicher >Priority: Minor > Attachments: > handler.dataimport.TestNumberFormatTransformer-locale.patch.txt, > schema.DateField-locale.patch.txt > > > The following unit tests fail if they're run with a German locale: > org.apache.solr.schema.LegacyDateFieldTest > org.apache.solr.schema.DateFieldTest > org.apache.solr.handler.dataimport.TestNumberFormatTransformer > The DateField tests fail because the expected decimal format uses "." as > separator but the actual result is formatted using the "," separator. > This happens because the ISO8601CanonicalDateFormat class which is defined > inside the DateField class doesn't specify the US locale when creating the > formatter used for milliseconds. > Patch: schema.DateField-locale.patch.txt > TestNumberFormatTransformer failed because > handler.dataimport.NumberFormatTransformer uses a NumberFormatter which > relies on the system locale. But I think in this case it's intentional, so I > modified the test case to use the grouping separator of the system locale. > Patch: handler.dataimport.TestNumberFormatTransformer-locale.patch.txt -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-700) test failures because of wrong/unexpected locale
[ https://issues.apache.org/jira/browse/SOLR-700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Oestreicher updated SOLR-700: Attachment: handler.dataimport.TestNumberFormatTransformer-locale.patch.txt > test failures because of wrong/unexpected locale > > > Key: SOLR-700 > URL: https://issues.apache.org/jira/browse/SOLR-700 > Project: Solr > Issue Type: Bug > Components: contrib - DataImportHandler >Affects Versions: 1.3 >Reporter: Stefan Oestreicher >Priority: Minor > Attachments: > handler.dataimport.TestNumberFormatTransformer-locale.patch.txt, > schema.DateField-locale.patch.txt > > > The following unit tests fail if they're run with a German locale: > org.apache.solr.schema.LegacyDateFieldTest > org.apache.solr.schema.DateFieldTest > org.apache.solr.handler.dataimport.TestNumberFormatTransformer > The DateField tests fail because the expected decimal format uses "." as > separator but the actual result is formatted using the "," separator. > This happens because the ISO8601CanonicalDateFormat class which is defined > inside the DateField class doesn't specify the US locale when creating the > formatter used for milliseconds. > Patch: schema.DateField-locale.patch.txt > TestNumberFormatTransformer failed because > handler.dataimport.NumberFormatTransformer uses a NumberFormatter which > relies on the system locale. But I think in this case it's intentional, so I > modified the test case to use the grouping separator of the system locale. > Patch: handler.dataimport.TestNumberFormatTransformer-locale.patch.txt -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (SOLR-700) test failures because of wrong/unexpected locale
test failures because of wrong/unexpected locale Key: SOLR-700 URL: https://issues.apache.org/jira/browse/SOLR-700 Project: Solr Issue Type: Bug Components: contrib - DataImportHandler Affects Versions: 1.3 Reporter: Stefan Oestreicher Priority: Minor Attachments: handler.dataimport.TestNumberFormatTransformer-locale.patch.txt, schema.DateField-locale.patch.txt The following unit tests fail if they're run with a German locale: org.apache.solr.schema.LegacyDateFieldTest org.apache.solr.schema.DateFieldTest org.apache.solr.handler.dataimport.TestNumberFormatTransformer The DateField tests fail because the expected decimal format uses "." as separator but the actual result is formatted using the "," separator. This happens because the ISO8601CanonicalDateFormat class which is defined inside the DateField class doesn't specify the US locale when creating the formatter used for milliseconds. Patch: schema.DateField-locale.patch.txt TestNumberFormatTransformer failed because handler.dataimport.NumberFormatTransformer uses a NumberFormatter which relies on the system locale. But I think in this case it's intentional, so I modified the test case to use the grouping separator of the system locale. Patch: handler.dataimport.TestNumberFormatTransformer-locale.patch.txt -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
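The DateField part of the report can be reproduced outside Solr with a few lines. This is only a sketch under stated assumptions: the class `MillisFormatDemo`, its two methods, and the pattern "0.000" are hypothetical and are not taken from DateField or from the attached patch. It contrasts a `DecimalFormat` built with the default locale against one pinned to US symbols.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

class MillisFormatDemo {
    // Uses the symbols of the JVM default locale, so the separator varies.
    static String formatDefault(double fraction) {
        return new DecimalFormat("0.000").format(fraction);
    }

    // Pins the symbols to the US locale, so the separator is always ".".
    static String formatUs(double fraction) {
        return new DecimalFormat("0.000", new DecimalFormatSymbols(Locale.US)).format(fraction);
    }

    public static void main(String[] args) {
        // Simulate running the tests on a machine with a German locale.
        Locale.setDefault(Locale.GERMANY);
        System.out.println(formatDefault(0.5)); // prints 0,500
        System.out.println(formatUs(0.5));      // prints 0.500
    }
}
```

A test that compares against a hard-coded "." only passes with the second variant when the default locale uses "," as decimal separator.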
[jira] Issue Comment Edited: (SOLR-606) spellcheck.colate doesn't handle multiple tokens properly
[ https://issues.apache.org/jira/browse/SOLR-606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12622508#action_12622508 ] dlx edited comment on SOLR-606 at 8/14/08 3:40 AM: -- I recently ran into this exact issue and I found the problem. The collation is created by replacing the misspelled tokens with the suggestions using a StringBuilder: {noformat} for (Iterator<Map.Entry<Token, String>> bestIter = best.entrySet().iterator(); bestIter.hasNext();) { Map.Entry<Token, String> entry = bestIter.next(); Token tok = entry.getKey(); collation.replace(tok.startOffset(), tok.endOffset(), entry.getValue()); } {noformat} As you can see it's just replacing the relevant tokens in the original query. However, if the length of a suggestion doesn't equal the length of the original token, all offsets used after that replacement are no longer valid thus randomly yielding incorrect results. I fixed that by keeping track of that difference and adding it to the token offsets. For this to work I had to change the HashMap to a LinkedHashMap since this solution depends on the iteration order of the Tokens to correspond to their occurrence in the string. I attached a patch reflecting those changes: handler.component.SpellCheckComponent-collate-patch.txt was (Author: dlx): I recently ran into this exact issue and I found the problem. The collation is created by replacing the misspelled tokens with the suggestions using a StringBuilder: {noformat} for (Iterator<Map.Entry<Token, String>> bestIter = best.entrySet().iterator(); bestIter.hasNext();) { Map.Entry<Token, String> entry = bestIter.next(); Token tok = entry.getKey(); collation.replace(tok.startOffset(), tok.endOffset(), entry.getValue()); } {noformat} As you can see it's just replacing the relevant tokens in the original query. However, if the length of a suggestion doesn't equal the length of the original token, all offsets used after that replacement are no longer valid thus randomly yielding incorrect results. 
I fixed that by keeping track of that difference and adding it to the token offsets. For this to work I had to change the HashMap to a LinkedHashMap since this solution depends on the iteration order of the Tokens to correspond to their occurrence in the string. > spellcheck.colate doesn't handle multiple tokens properly > - > > Key: SOLR-606 > URL: https://issues.apache.org/jira/browse/SOLR-606 > Project: Solr > Issue Type: Bug > Components: spellchecker >Affects Versions: 1.3 > Environment: tomcat >Reporter: Geoffrey Young >Assignee: Grant Ingersoll >Priority: Minor > Attachments: handler.component.SpellCheckComponent-collate-patch.txt, > SOLR-606.patch > > > originally posted as part of SOLR-572: > > https://issues.apache.org/jira/browse/SOLR-572?focusedCommentId=12608487#action_12608487 > the new spellcheck.collate feature seems to exhibit some strange behaviors > when handed a query with multiple tokens. > {noformat} > { > "responseHeader":{ > "params":{ > "q":"redbull air show"}}, > "spellcheck":{ >"suggestions":[ > "redbull",[ >"suggestion",["redbelly"]], > "show",[ >"suggestion",["shot"]], > "collation","redbelly airshotw"]}} > {noformat} > in this case, note the fields are incorrectly concatenated (no space between > tokens, left over 'w' from input string) > {noformat} > { > "responseHeader":{ > "params":{ > "q":"redbull air show", > "spellcheck.q":"redbull air show"}}, > "spellcheck":{ > "suggestions":[ > "redbull air show",[ >"suggestion",["redbull singers"]], > "collation","redbull singersredbull air show"]}} > {noformat} > this is slightly different - the suggestions are still concatenated without a > space, but the collation is way off. > --Geoff -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
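The offset bookkeeping described in the comment above can be sketched roughly as follows. This is an illustration of the idea, not the attached patch: `Span` is a hypothetical stand-in for Lucene's `Token` (only the start and end offsets are modelled), and `collate` is a made-up method name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class CollationOffsetDemo {
    // Hypothetical stand-in for a token: offsets into the original query.
    static final class Span {
        final int start;
        final int end;
        Span(int start, int end) { this.start = start; this.end = end; }
    }

    // Replaces each span with its suggestion. The map must iterate in the
    // order the spans occur in the query, hence a LinkedHashMap at the call site.
    static String collate(String query, Map<Span, String> best) {
        StringBuilder collation = new StringBuilder(query);
        int offset = 0; // accumulated length difference of earlier replacements
        for (Map.Entry<Span, String> entry : best.entrySet()) {
            Span tok = entry.getKey();
            String suggestion = entry.getValue();
            collation.replace(tok.start + offset, tok.end + offset, suggestion);
            offset += suggestion.length() - (tok.end - tok.start);
        }
        return collation.toString();
    }

    public static void main(String[] args) {
        Map<Span, String> best = new LinkedHashMap<>();
        best.put(new Span(0, 7), "redbelly"); // "redbull"
        best.put(new Span(12, 16), "shot");   // "show"
        System.out.println(collate("redbull air show", best)); // prints redbelly air shot
    }
}
```

Leaving out the `offset` correction on the same input shifts the second replacement one character to the left and produces "redbelly airshotw", which is the collation shown in the issue description.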
[jira] Updated: (SOLR-606) spellcheck.colate doesn't handle multiple tokens properly
[ https://issues.apache.org/jira/browse/SOLR-606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Oestreicher updated SOLR-606: Attachment: handler.component.SpellCheckComponent-collate-patch.txt I recently ran into this exact issue and I found the problem. The collation is created by replacing the misspelled tokens with the suggestions using a StringBuilder: {noformat} for (Iterator<Map.Entry<Token, String>> bestIter = best.entrySet().iterator(); bestIter.hasNext();) { Map.Entry<Token, String> entry = bestIter.next(); Token tok = entry.getKey(); collation.replace(tok.startOffset(), tok.endOffset(), entry.getValue()); } {noformat} As you can see it's just replacing the relevant tokens in the original query. However, if the length of a suggestion doesn't equal the length of the original token, all offsets used after that replacement are no longer valid thus randomly yielding incorrect results. I fixed that by keeping track of that difference and adding it to the token offsets. For this to work I had to change the HashMap to a LinkedHashMap since this solution depends on the iteration order of the Tokens to correspond to their occurrence in the string. > spellcheck.colate doesn't handle multiple tokens properly > - > > Key: SOLR-606 > URL: https://issues.apache.org/jira/browse/SOLR-606 > Project: Solr > Issue Type: Bug > Components: spellchecker >Affects Versions: 1.3 > Environment: tomcat >Reporter: Geoffrey Young >Assignee: Grant Ingersoll >Priority: Minor > Attachments: handler.component.SpellCheckComponent-collate-patch.txt, > SOLR-606.patch > > > originally posted as part of SOLR-572: > > https://issues.apache.org/jira/browse/SOLR-572?focusedCommentId=12608487#action_12608487 > the new spellcheck.collate feature seems to exhibit some strange behaviors > when handed a query with multiple tokens. 
> {noformat} > { > "responseHeader":{ > "params":{ > "q":"redbull air show"}}, > "spellcheck":{ >"suggestions":[ > "redbull",[ >"suggestion",["redbelly"]], > "show",[ >"suggestion",["shot"]], > "collation","redbelly airshotw"]}} > {noformat} > in this case, note the fields are incorrectly concatenated (no space between > tokens, left over 'w' from input string) > {noformat} > { > "responseHeader":{ > "params":{ > "q":"redbull air show", > "spellcheck.q":"redbull air show"}}, > "spellcheck":{ > "suggestions":[ > "redbull air show",[ >"suggestion",["redbull singers"]], > "collation","redbull singersredbull air show"]}} > {noformat} > this is slightly different - the suggestions are still concatenated without a > space, but the collation is way off. > --Geoff -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Issue Comment Edited: (SOLR-694) error message is misleading when there is no queryconverter defined
[ https://issues.apache.org/jira/browse/SOLR-694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12622159#action_12622159 ] dlx edited comment on SOLR-694 at 8/13/08 3:35 AM: -- here's a patch {noformat} ### Eclipse Workspace Patch 1.0 #P solr Index: src/java/org/apache/solr/handler/component/SpellCheckComponent.java === --- src/java/org/apache/solr/handler/component/SpellCheckComponent.java (revision 685498) +++ src/java/org/apache/solr/handler/component/SpellCheckComponent.java (working copy) @@ -54,6 +54,7 @@ import org.apache.solr.spelling.IndexBasedSpellChecker; import org.apache.solr.spelling.QueryConverter; import org.apache.solr.spelling.SolrSpellChecker; +import org.apache.solr.spelling.SpellingQueryConverter; import org.apache.solr.spelling.SpellingResult; import org.apache.solr.util.RefCounted; import org.apache.solr.util.plugin.NamedListPluginLoader; @@ -290,6 +291,13 @@ new NamedListPluginLoader("[solrconfig.xml] " + xpath, queryConverters); loader.load(solrConfig.getResourceLoader(), nodes); + + //ensure that there is at least one query converter defined + if (queryConverters.size() == 0) { + LOG.warning("no queryConverter defined, using default converter"); + queryConverters.put("queryConverter", new SpellingQueryConverter()); + } + //there should only be one if (queryConverters.size() == 1) { queryConverter = queryConverters.values().iterator().next(); {noformat} was (Author: dlx): here's a patch ### Eclipse Workspace Patch 1.0 #P solr Index: src/java/org/apache/solr/handler/component/SpellCheckComponent.java === --- src/java/org/apache/solr/handler/component/SpellCheckComponent.java (revision 685498) +++ src/java/org/apache/solr/handler/component/SpellCheckComponent.java (working copy) @@ -54,6 +54,7 @@ import org.apache.solr.spelling.IndexBasedSpellChecker; import org.apache.solr.spelling.QueryConverter; import org.apache.solr.spelling.SolrSpellChecker; +import org.apache.solr.spelling.SpellingQueryConverter; 
import org.apache.solr.spelling.SpellingResult; import org.apache.solr.util.RefCounted; import org.apache.solr.util.plugin.NamedListPluginLoader; @@ -290,6 +291,13 @@ new NamedListPluginLoader("[solrconfig.xml] " + xpath, queryConverters); loader.load(solrConfig.getResourceLoader(), nodes); + + //ensure that there is at least one query converter defined + if (queryConverters.size() == 0) { + LOG.warning("no queryConverter defined, using default converter"); + queryConverters.put("queryConverter", new SpellingQueryConverter()); + } + //there should only be one if (queryConverters.size() == 1) { queryConverter = queryConverters.values().iterator().next(); > error message is misleading when there is no queryconverter defined > --- > > Key: SOLR-694 > URL: https://issues.apache.org/jira/browse/SOLR-694 > Project: Solr > Issue Type: Improvement > Components: spellchecker >Affects Versions: 1.3 >Reporter: Stefan Oestreicher >Priority: Trivial > > the relevant code (org.apache.solr.handler.component.SpellCheckComponent) > only checks if the size of the queryConverters map equals 1 and throws a > RuntimeException with the message "One and only one queryConverter may be > defined" if it does not. > But if there are no query converters defined this message is quite misleading. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-694) error message is misleading when there is no queryconverter defined
[ https://issues.apache.org/jira/browse/SOLR-694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12622159#action_12622159 ] Stefan Oestreicher commented on SOLR-694: - here's a patch ### Eclipse Workspace Patch 1.0 #P solr Index: src/java/org/apache/solr/handler/component/SpellCheckComponent.java === --- src/java/org/apache/solr/handler/component/SpellCheckComponent.java (revision 685498) +++ src/java/org/apache/solr/handler/component/SpellCheckComponent.java (working copy) @@ -54,6 +54,7 @@ import org.apache.solr.spelling.IndexBasedSpellChecker; import org.apache.solr.spelling.QueryConverter; import org.apache.solr.spelling.SolrSpellChecker; +import org.apache.solr.spelling.SpellingQueryConverter; import org.apache.solr.spelling.SpellingResult; import org.apache.solr.util.RefCounted; import org.apache.solr.util.plugin.NamedListPluginLoader; @@ -290,6 +291,13 @@ new NamedListPluginLoader("[solrconfig.xml] " + xpath, queryConverters); loader.load(solrConfig.getResourceLoader(), nodes); + + //ensure that there is at least one query converter defined + if (queryConverters.size() == 0) { + LOG.warning("no queryConverter defined, using default converter"); + queryConverters.put("queryConverter", new SpellingQueryConverter()); + } + //there should only be one if (queryConverters.size() == 1) { queryConverter = queryConverters.values().iterator().next(); > error message is misleading when there is no queryconverter defined > --- > > Key: SOLR-694 > URL: https://issues.apache.org/jira/browse/SOLR-694 > Project: Solr > Issue Type: Improvement > Components: spellchecker >Affects Versions: 1.3 >Reporter: Stefan Oestreicher >Priority: Trivial > > the relevant code (org.apache.solr.handler.component.SpellCheckComponent) > only checks if the size of the queryConverters map equals 1 and throws a > RuntimeException with the message "One and only one queryConverter may be > defined" if it does not. 
> But if there are no query converters defined this message is quite misleading. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (SOLR-694) error message is misleading when there is no queryconverter defined
error message is misleading when there is no queryconverter defined --- Key: SOLR-694 URL: https://issues.apache.org/jira/browse/SOLR-694 Project: Solr Issue Type: Improvement Components: spellchecker Affects Versions: 1.3 Reporter: Stefan Oestreicher Priority: Trivial the relevant code (org.apache.solr.handler.component.SpellCheckComponent) only checks if the size of the queryConverters map equals 1 and throws a RuntimeException with the message "One and only one queryConverter may be defined" if it does not. But if there are no query converters defined this message is quite misleading. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
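One way to avoid the misleading message is to treat "none defined" and "more than one defined" as separate cases. The sketch below follows the approach of the patch posted in the comments (fall back to a default when nothing is configured), but it is not that patch: `ConverterSelectionDemo`, `selectConverter`, the use of `IllegalStateException`, and the extended message text are assumptions made for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ConverterSelectionDemo {
    // Hypothetical helper: picks the single configured converter, falls back
    // to a default when none is configured, and fails only when several are.
    static <T> T selectConverter(Map<String, T> converters, T fallback) {
        if (converters.isEmpty()) {
            // Nothing configured: use the default instead of failing.
            return fallback;
        }
        if (converters.size() > 1) {
            // Only this case deserves the "one and only one" message.
            throw new IllegalStateException(
                "One and only one queryConverter may be defined, found " + converters.size());
        }
        return converters.values().iterator().next();
    }

    public static void main(String[] args) {
        Map<String, String> configured = new LinkedHashMap<>();
        System.out.println(selectConverter(configured, "default")); // prints default
        configured.put("queryConverter", "custom");
        System.out.println(selectConverter(configured, "default")); // prints custom
    }
}
```

If falling back silently is not wanted, the empty case could instead throw with its own message stating that no queryConverter is defined.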