[jira] Resolved: (SOLR-1010) Relative instanceDir is evaluated relative to current working directory
[ https://issues.apache.org/jira/browse/SOLR-1010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar resolved SOLR-1010. - Resolution: Invalid > Relative instanceDir is evaluated relative to current working directory > --- > > Key: SOLR-1010 > URL: https://issues.apache.org/jira/browse/SOLR-1010 > Project: Solr > Issue Type: Bug >Affects Versions: 1.3 >Reporter: Shalin Shekhar Mangar >Assignee: Shalin Shekhar Mangar >Priority: Minor > Fix For: 1.4 > > > If a relative instanceDir is provided in solr.xml, it should be evaluated > relative to solr.home instead of the current working directory. > I guess people work around this bug right now by using absolute paths for > instanceDir. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
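The behaviour the SOLR-1010 report asks for can be sketched in a few lines. This is only an illustration of "relative to solr.home instead of the current working directory": the class name, method name and example paths are made up, it uses the modern java.nio.file API rather than anything in Solr, and since the issue was closed as Invalid it does not describe a change that was actually made.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Solr: resolves an instanceDir value
// from solr.xml against solr.home rather than the JVM's working directory.
final class InstanceDirSketch {

    static Path resolveInstanceDir(Path solrHome, String instanceDir) {
        Path dir = Paths.get(instanceDir);
        // Absolute paths are used as given; relative ones hang off solr.home.
        Path resolved = dir.isAbsolute() ? dir : solrHome.resolve(dir);
        return resolved.normalize();
    }

    public static void main(String[] args) {
        Path solrHome = Paths.get("/opt/solr/home"); // example value only
        System.out.println(resolveInstanceDir(solrHome, "core0"));
        System.out.println(resolveInstanceDir(solrHome, "../shared/core1"));
    }
}
```

The point of the sketch is that the working directory never enters the calculation, so the result does not depend on where the servlet container was started.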
[jira] Resolved: (SOLR-1029) Standardize Evaluator parameter handling
[ https://issues.apache.org/jira/browse/SOLR-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar resolved SOLR-1029. - Resolution: Fixed Committed revision 747291. Thanks Noble! > Standardize Evaluator parameter handling > > > Key: SOLR-1029 > URL: https://issues.apache.org/jira/browse/SOLR-1029 > Project: Solr > Issue Type: Improvement > Components: contrib - DataImportHandler >Affects Versions: 1.3 >Reporter: Shalin Shekhar Mangar >Assignee: Shalin Shekhar Mangar >Priority: Minor > Fix For: 1.4 > > Attachments: SOLR-1029.patch, SOLR-1029.patch > > > The evaluator parameter syntax is not well defined and inconsistent in some > places. This issue aims to standardize on the following: > # String parameters should be enclosed in a single quote > # Variables can be written as-is > # Numbers (integers or floats) can be written as-is > We will enforce this among the existing evaluators and provide helper > functions for this parsing which future implementations can use. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
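The three rules in the SOLR-1029 description (single-quoted strings, bare variables, bare numbers) can be illustrated with a small tokenizer. This is a sketch under assumptions, not the helper functions committed with the issue: the class, enum and method names are invented here, and the number pattern (optional sign, digits, optional fraction) is one reasonable reading of "integers or floats".

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative parser for the three parameter rules listed in SOLR-1029.
// Class and method names are made up; this is not the DIH helper itself.
final class EvaluatorParamSketch {

    enum Kind { STRING, NUMBER, VARIABLE }

    static final class Param {
        final Kind kind;
        final String text;

        Param(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }

        @Override
        public String toString() {
            return kind + ":" + text;
        }
    }

    // Splits on commas that are outside single quotes, then classifies each token.
    static List<Param> parse(String expression) {
        List<Param> params = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (char c : expression.toCharArray()) {
            if (c == '\'') {
                inQuotes = !inQuotes;
            }
            if (c == ',' && !inQuotes) {
                params.add(classify(current.toString()));
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        if (current.toString().trim().length() > 0) {
            params.add(classify(current.toString()));
        }
        return params;
    }

    static Param classify(String raw) {
        String token = raw.trim();
        if (token.length() >= 2 && token.startsWith("'") && token.endsWith("'")) {
            // Rule 1: string parameters are enclosed in single quotes.
            return new Param(Kind.STRING, token.substring(1, token.length() - 1));
        }
        if (token.matches("[+-]?\\d+(\\.\\d+)?")) {
            // Rule 3: numbers (integers or floats) are written as-is.
            return new Param(Kind.NUMBER, token);
        }
        // Rule 2: anything else is treated as a variable name.
        return new Param(Kind.VARIABLE, token);
    }

    public static void main(String[] args) {
        System.out.println(parse("'yyyy-MM-dd', item.created, 1.5"));
    }
}
```

With unambiguous quoting, a comma inside a quoted string no longer splits the parameter, which is one kind of inconsistency a standardized syntax avoids.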
[jira] Commented: (SOLR-1029) Standardize Evaluator parameter handling
[ https://issues.apache.org/jira/browse/SOLR-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676207#action_12676207 ] Shalin Shekhar Mangar commented on SOLR-1029: - I added a note on this for users upgrading from 1.3 > Standardize Evaluator parameter handling > > > Key: SOLR-1029 > URL: https://issues.apache.org/jira/browse/SOLR-1029 > Project: Solr > Issue Type: Improvement > Components: contrib - DataImportHandler >Affects Versions: 1.3 >Reporter: Shalin Shekhar Mangar >Assignee: Shalin Shekhar Mangar >Priority: Minor > Fix For: 1.4 > > Attachments: SOLR-1029.patch, SOLR-1029.patch > > > The evaluator parameter syntax is not well defined and inconsistent in some > places. This issue aims to standardize on the following: > # String parameters should be enclosed in a single quote > # Variables can be written as-is > # Numbers (integers or floats) can be written as-is > We will enforce this among the existing evaluators and provide helper > functions for this parsing which future implementations can use. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1029) Standardize Evaluator parameter handling
[ https://issues.apache.org/jira/browse/SOLR-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar updated SOLR-1029: Attachment: SOLR-1029.patch # Made the behavior backwards-compatible with old syntax. # Added warning if old syntax is used. I'll commit this shortly. > Standardize Evaluator parameter handling > > > Key: SOLR-1029 > URL: https://issues.apache.org/jira/browse/SOLR-1029 > Project: Solr > Issue Type: Improvement > Components: contrib - DataImportHandler >Affects Versions: 1.3 >Reporter: Shalin Shekhar Mangar >Assignee: Shalin Shekhar Mangar >Priority: Minor > Fix For: 1.4 > > Attachments: SOLR-1029.patch, SOLR-1029.patch > > > The evaluator parameter syntax is not well defined and inconsistent in some > places. This issue aims to standardize on the following: > # String parameters should be enclosed in a single quote > # Variables can be written as-is > # Numbers (integers or floats) can be written as-is > We will enforce this among the existing evaluators and provide helper > functions for this parsing which future implementations can use. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (SOLR-1035) catch RuntimeException instead of NFE in FieldType line 187
[ https://issues.apache.org/jira/browse/SOLR-1035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar resolved SOLR-1035. - Resolution: Fixed Fix Version/s: 1.4 Assignee: Shalin Shekhar Mangar Committed revision 747284. Thanks David! > catch RuntimeException instead of NFE in FieldType line 187 > --- > > Key: SOLR-1035 > URL: https://issues.apache.org/jira/browse/SOLR-1035 > Project: Solr > Issue Type: Improvement >Affects Versions: 1.4 >Reporter: David Smiley >Assignee: Shalin Shekhar Mangar >Priority: Trivial > Fix For: 1.4 > > Original Estimate: 0.08h > Remaining Estimate: 0.08h > > This is a super-easy request for a Solr committer to do. Look at line 187 of > FieldType. It catches NumberFormatException (a subtype of RuntimeException) > when calling toInternal() in order to throw a SolrException that has useful > information -- namely the field being set and its value. The > caught exception type should be broadened to RuntimeException, > since problems other than NumberFormatException might occur. An example is > the SolrException thrown by DateField when parsing fails.
[jira] Created: (SOLR-1035) catch RuntimeException instead of NFE in FieldType line 187
catch RuntimeException instead of NFE in FieldType line 187 --- Key: SOLR-1035 URL: https://issues.apache.org/jira/browse/SOLR-1035 Project: Solr Issue Type: Improvement Affects Versions: 1.4 Reporter: David Smiley Priority: Trivial This is a super-easy request for a Solr committer to do. Look at line 187 of FieldType. It catches NumberFormatException (a subtype of RuntimeException) when calling toInternal() in order to throw a SolrException that has useful information -- namely the field being set and its value. The caught exception type should be broadened to RuntimeException, since problems other than NumberFormatException might occur. An example is the SolrException thrown by DateField when parsing fails.
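The change requested in SOLR-1035 amounts to widening one catch clause. The sketch below shows the pattern in isolation; it does not reproduce FieldType itself. FieldValueException stands in for SolrException, InternalConverter stands in for the toInternal() call, and the message wording is invented for the example.

```java
// Stand-ins for illustration only: FieldValueException plays the role of
// SolrException and InternalConverter the role of FieldType.toInternal().
final class FieldErrorWrappingSketch {

    static final class FieldValueException extends RuntimeException {
        FieldValueException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    interface InternalConverter {
        String toInternal(String externalValue);
    }

    static String convert(String fieldName, String externalValue, InternalConverter converter) {
        try {
            return converter.toInternal(externalValue);
        } catch (RuntimeException e) {
            // Catching RuntimeException rather than NumberFormatException means a
            // failure from any converter is reported with the field and value.
            throw new FieldValueException(
                "Error while creating field '" + fieldName + "' from value '" + externalValue + "'", e);
        }
    }

    public static void main(String[] args) {
        InternalConverter intField = v -> String.valueOf(Integer.parseInt(v));
        InternalConverter dateField = v -> {
            throw new IllegalArgumentException("Invalid date: " + v); // not an NFE
        };
        for (InternalConverter c : new InternalConverter[] { intField, dateField }) {
            try {
                convert("someField", "oops", c);
            } catch (FieldValueException e) {
                System.out.println(e.getMessage() + " (cause: " + e.getCause().getClass().getSimpleName() + ")");
            }
        }
    }
}
```

With a catch limited to NumberFormatException, the second converter's failure would propagate without the field name or value attached, which is the gap the report describes.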
Re: Improvement Request: exception setting field exception
Even if it is a lightweight task (unless it is only a comment that is to be changed), it is wise to open a JIRA issue. On Tue, Feb 24, 2009 at 10:30 AM, David Smiley @MITRE.org wrote: > > This is a super-easy request for a Solr committer to do. Look at line 187 of > FieldType. It catches NumberFormatException (a subtype of RuntimeException) > when calling toInternal() in order to throw a SolrException that has useful > information -- namely the field being set and its value. The > caught exception type should be broadened to RuntimeException, > since problems other than NumberFormatException might occur. An example is > the SolrException thrown by DateField when parsing fails. > > I was about to create a JIRA issue for this but thought it might be more > light-weight to ask on this list. If a committer would rather I submit to > JIRA anyway then let me know. > > ~ David Smiley > -- > View this message in context: > http://www.nabble.com/Improvement-Request%3A-exception-setting-field-exception-tp22175880p22175880.html > Sent from the Solr - Dev mailing list archive at Nabble.com. > > -- --Noble Paul
Improvement Request: exception setting field exception
This is a super-easy request for a Solr committer to do. Look at line 187 of FieldType. It catches NumberFormatException (a subtype of RuntimeException) when calling toInternal() in order to throw a SolrException that has useful information -- namely the field being set and its value. The caught exception type should be broadened to RuntimeException, since problems other than NumberFormatException might occur. An example is the SolrException thrown by DateField when parsing fails. I was about to create a JIRA issue for this but thought it might be more light-weight to ask on this list. If a committer would rather I submit to JIRA anyway then let me know. ~ David Smiley -- View this message in context: http://www.nabble.com/Improvement-Request%3A-exception-setting-field-exception-tp22175880p22175880.html Sent from the Solr - Dev mailing list archive at Nabble.com.
[jira] Updated: (SOLR-1033) DIH transformers cannot reuse output from previous transformations
[ https://issues.apache.org/jira/browse/SOLR-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-1033: - Attachment: SOLR-1033.patch the complete patch. XPathEntityprocessor needed some rework > DIH transformers cannot reuse output from previous transformations > -- > > Key: SOLR-1033 > URL: https://issues.apache.org/jira/browse/SOLR-1033 > Project: Solr > Issue Type: Improvement > Components: contrib - DataImportHandler >Affects Versions: 1.4 > Environment: All operating systems and software platforms >Reporter: Fergus McMenemie > Fix For: 1.4 > > Attachments: SOLR-1033.patch, SOLR-1033.patch, SOLR-1033.patch, > SOLR-1033.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > It can be very useful to reuse the output from a DIH template in other > templates and or regex transformers. Currently this cannot be done. The > resolver is initialized at the start of the transformer run with what ever > values exist for a column name at that instant. As the transformer executes > it may define new values for column names. My change is intended to update > the hash used by the resolver after each successful transformation. > This only applies to the template and regex transformers. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
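The problem described in SOLR-1033 is easiest to see with two chained templates. The sketch below is a stand-alone model, not DIH code: the class, the `${name}` resolver and the `refresh` flag are all invented for illustration, and real DIH column references carry an entity prefix that is omitted here. With `refresh` set to false the resolver sees only a snapshot taken before the first transformation; with true it sees values written by earlier steps, which is the behaviour the issue asks for.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy model of chained template transformations; not DataImportHandler code.
final class TemplateChainSketch {

    private static final Pattern VARIABLE = Pattern.compile("\\$\\{([^}]+)\\}");

    // Replaces each ${name} with the matching value, or "" when it is unknown.
    static String resolve(String template, Map<String, Object> values) {
        Matcher m = VARIABLE.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            Object value = values.get(m.group(1));
            m.appendReplacement(out, Matcher.quoteReplacement(value == null ? "" : value.toString()));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Applies the templates in order. When refresh is false, every template is
    // resolved against a snapshot of the row taken before any template ran.
    static Map<String, Object> applyTemplates(Map<String, Object> row,
                                              LinkedHashMap<String, String> templates,
                                              boolean refresh) {
        Map<String, Object> result = new LinkedHashMap<>(row);
        Map<String, Object> snapshot = new HashMap<>(row);
        for (Map.Entry<String, String> t : templates.entrySet()) {
            Map<String, Object> visible = refresh ? result : snapshot;
            result.put(t.getKey(), resolve(t.getValue(), visible));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("first", "Ada");
        row.put("last", "Lovelace");
        LinkedHashMap<String, String> templates = new LinkedHashMap<>();
        templates.put("full", "${first} ${last}");
        templates.put("greeting", "Hello ${full}");
        System.out.println(applyTemplates(row, templates, false).get("greeting"));
        System.out.println(applyTemplates(row, templates, true).get("greeting"));
    }
}
```

In the snapshot case the second template cannot see `full`, so its placeholder resolves to an empty string; in the refreshed case it picks up the value produced by the first template.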
[jira] Commented: (SOLR-1033) DIH transformers cannot reuse output from previous transformations
[ https://issues.apache.org/jira/browse/SOLR-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676165#action_12676165 ] Noble Paul commented on SOLR-1033: -- bq. You say "you are constructing the regex replacements themselves with templates" by which you mean using the ${XXX} syntax and not the output from a templatetransformer? When I said 'template' I meant any string with ${xxx} content. The 'template' attribute is the only value TemplateTransformer is interested in. Any attribute value in DIH is potentially a template. Some are honoured and some are not. I hope we can make it work consistently across all of them. > DIH transformers cannot reuse output from previous transformations > -- > > Key: SOLR-1033 > URL: https://issues.apache.org/jira/browse/SOLR-1033 > Project: Solr > Issue Type: Improvement > Components: contrib - DataImportHandler >Affects Versions: 1.4 > Environment: All operating systems and software platforms >Reporter: Fergus McMenemie > Fix For: 1.4 > > Attachments: SOLR-1033.patch, SOLR-1033.patch, SOLR-1033.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > It can be very useful to reuse the output from a DIH template in other > templates and or regex transformers. Currently this cannot be done. The > resolver is initialized at the start of the transformer run with what ever > values exist for a column name at that instant. As the transformer executes > it may define new values for column names. My change is intended to update > the hash used by the resolver after each successful transformation. > This only applies to the template and regex transformers.
[jira] Resolved: (SOLR-1034) ClientUtils.escapeQuery should escape ;
[ https://issues.apache.org/jira/browse/SOLR-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan McKinley resolved SOLR-1034. - Resolution: Fixed Committed revision 747260. thanks brian > ClientUtils.escapeQuery should escape ; > --- > > Key: SOLR-1034 > URL: https://issues.apache.org/jira/browse/SOLR-1034 > Project: Solr > Issue Type: Bug > Components: clients - java >Affects Versions: 1.4 > Environment: all >Reporter: Brian Whitman >Assignee: Ryan McKinley >Priority: Minor > Fix For: 1.4 > > Attachments: SOLR-ESCAPE.patch > > > The ClientUtils escapeQueryChars does not escape a ; symbol. This causes some > unexpected lexical errors when parsing query strings with ; in them. > See > http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200902.mbox/%3cdbd9700a0902231722n5db79dc0u7363603f930be...@mail.gmail.com%3e -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (SOLR-1034) ClientUtils.escapeQuery should escape ;
[ https://issues.apache.org/jira/browse/SOLR-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan McKinley reassigned SOLR-1034: --- Assignee: Ryan McKinley > ClientUtils.escapeQuery should escape ; > --- > > Key: SOLR-1034 > URL: https://issues.apache.org/jira/browse/SOLR-1034 > Project: Solr > Issue Type: Bug > Components: clients - java >Affects Versions: 1.4 > Environment: all >Reporter: Brian Whitman >Assignee: Ryan McKinley >Priority: Minor > Fix For: 1.4 > > Attachments: SOLR-ESCAPE.patch > > > The ClientUtils escapeQueryChars does not escape a ; symbol. This causes some > unexpected lexical errors when parsing query strings with ; in them. > See > http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200902.mbox/%3cdbd9700a0902231722n5db79dc0u7363603f930be...@mail.gmail.com%3e -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1034) ClientUtils.escapeQuery should escape ;
[ https://issues.apache.org/jira/browse/SOLR-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Whitman updated SOLR-1034: Attachment: SOLR-ESCAPE.patch Patch to escape ; > ClientUtils.escapeQuery should escape ; > --- > > Key: SOLR-1034 > URL: https://issues.apache.org/jira/browse/SOLR-1034 > Project: Solr > Issue Type: Bug > Components: clients - java >Affects Versions: 1.4 > Environment: all >Reporter: Brian Whitman >Priority: Minor > Fix For: 1.4 > > Attachments: SOLR-ESCAPE.patch > > > The ClientUtils escapeQueryChars does not escape a ; symbol. This causes some > unexpected lexical errors when parsing query strings with ; in them. > See > http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200902.mbox/%3cdbd9700a0902231722n5db79dc0u7363603f930be...@mail.gmail.com%3e -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (SOLR-1034) ClientUtils.escapeQuery should escape ;
ClientUtils.escapeQuery should escape ; --- Key: SOLR-1034 URL: https://issues.apache.org/jira/browse/SOLR-1034 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 1.4 Environment: all Reporter: Brian Whitman Priority: Minor Fix For: 1.4 The ClientUtils escapeQueryChars does not escape a ; symbol. This causes some unexpected lexical errors when parsing query strings with ; in them. See http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200902.mbox/%3cdbd9700a0902231722n5db79dc0u7363603f930be...@mail.gmail.com%3e -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
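For readers unfamiliar with the kind of escaping SOLR-1034 is about, here is a minimal stand-alone version. It is not the ClientUtils source and not the attached patch: the class name and the exact set of special characters are assumptions made for the example, with `;` included to show the requested behaviour.

```java
// Illustrative only: a hand-rolled escaper, not ClientUtils.escapeQueryChars.
final class QueryEscapeSketch {

    // Assumed set of characters with special meaning to the query parser,
    // with ';' included as the report requests.
    private static final String SPECIAL = "\\+-&|!(){}[]^\"~*?:;";

    static String escape(String input) {
        StringBuilder sb = new StringBuilder(input.length() + 8);
        for (char c : input.toCharArray()) {
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\'); // prefix each special character with a backslash
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("AT&T; quarterly (2009)"));
    }
}
```

A caller would pass user-supplied text through such a function before embedding it in a query string, so that a stray `;` is treated as literal text rather than as syntax.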
[jira] Commented: (SOLR-844) A SolrServer impl to front-end multiple urls
[ https://issues.apache.org/jira/browse/SOLR-844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676136#action_12676136 ] Hoss Man commented on SOLR-844: --- I still think the javadocs should be beefed up a bit... at a minimum the info from "When to use this ?", "How does the Load Balancing happen ?", and "How does it know if a server has come back up ?" sections of the wiki page Noble made should be in the class level javadocs ... some of the "How does it know..." info made it into the javadocs for setAliveCheckInterval, but considering how important that method is there should be a ref to it in the class docs -- and there *definitely* needs to be some explicit mention of the "The ping is done not in a separate thread, it is done in a thread which made a normal request." fact, i would *never* have guessed that looking at the public docs in the class. And as I mentioned before: if there are concerns that this class will be misused (and it certainly seems like there are) then it really needs to contain javadocs explaining when it *doesn't* make sense to use it and some alternative suggestions (if nothing else, a link to wikipedia: [http://en.wikipedia.org/wiki/Load_balancing_(computing)]) > A SolrServer impl to front-end multiple urls > > > Key: SOLR-844 > URL: https://issues.apache.org/jira/browse/SOLR-844 > Project: Solr > Issue Type: New Feature > Components: clients - java >Affects Versions: 1.3 >Reporter: Noble Paul >Assignee: Shalin Shekhar Mangar > Fix For: 1.4 > > Attachments: SOLR-844.patch, SOLR-844.patch, SOLR-844.patch, > SOLR-844.patch, SOLR-844.patch > > > Currently a {{CommonsHttpSolrServer}} can talk to only one server. This > demands that the user have a LoadBalancer or do the roundrobin on their own. > We must have a {{LBHttpSolrServer}} which must automatically do a > Loadbalancing between multiple hosts. 
This can be backed by the > {{CommonsHttpSolrServer}} > This can have the following other features > * Automatic failover > * Optionally take in a file/URL containing the URLs of servers so that > the server list can be automatically updated by periodically loading the > config > * Support for adding/removing servers at runtime > * Pluggable Loadbalancing mechanism. (round-robin, weighted round-robin, > random etc) > * Pluggable Failover mechanisms
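The round-robin and failover ideas in the feature list above can be modelled without any HTTP at all. The class below is a hypothetical sketch, not LBHttpSolrServer: the names are invented, servers are plain strings, and the caller is assumed to report failures and recoveries itself (the real class, per the comment above, checks liveness from a thread that made a normal request).

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy round-robin selector with a dead-server list; not the SolrJ class.
final class RoundRobinSketch {

    private final List<String> servers;
    private final Set<String> dead = new HashSet<>();
    private int next = 0;

    RoundRobinSketch(List<String> servers) {
        this.servers = new ArrayList<>(servers);
    }

    // Returns the next live server in rotation, skipping any marked dead.
    synchronized String pick() {
        for (int i = 0; i < servers.size(); i++) {
            String candidate = servers.get(next);
            next = (next + 1) % servers.size();
            if (!dead.contains(candidate)) {
                return candidate;
            }
        }
        throw new IllegalStateException("no live servers");
    }

    synchronized void markDead(String server) {
        dead.add(server);
    }

    synchronized void markAlive(String server) {
        dead.remove(server);
    }

    public static void main(String[] args) {
        List<String> urls = new ArrayList<>();
        urls.add("http://host1/solr"); // example URLs only
        urls.add("http://host2/solr");
        RoundRobinSketch lb = new RoundRobinSketch(urls);
        System.out.println(lb.pick());
        lb.markDead("http://host2/solr");
        System.out.println(lb.pick());
    }
}
```

A pluggable strategy, as the description proposes, would replace the body of `pick()` (for example with weighted or random selection) while keeping the dead-server bookkeeping.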
[jira] Commented: (SOLR-1019) ConstantScorePrefixQuery.java - Absolute FP comparison on getBoost()
[ https://issues.apache.org/jira/browse/SOLR-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676120#action_12676120 ] Hoss Man commented on SOLR-1019: bq. The bottom line is that absolute FP comparisons can result in unpredictable behaviors, depending on the underlying implementation something that we want to avoid. The behavior of equality operators on floats is fairly explicit and unambiguous in JLS s15.21 -- if a JVM implementation produces unexpected behavior on these comparisons that's a bug in the JVM implementation. It's not ConstantScorePrefixQuery's job (or the job of any other Query class) to second guess whether the caller considers a boost value of X+epsilon to be "close enough" to a boost value of X to consider them the same query. (FWIW: I agree that equality tests on float/double values tend to indicate ... i just disagree with the assessment that there actually is a mistake in these specific cases) > ConstantScorePrefixQuery.java - Absolute FP comparison on getBoost() > - > > Key: SOLR-1019 > URL: https://issues.apache.org/jira/browse/SOLR-1019 > Project: Solr > Issue Type: Bug > Components: clients - java >Affects Versions: 1.3 > Environment: Java 6, Tomcat 6 >Reporter: Kay Kay > Attachments: SOLR-1019.patch > > > Absolute FP comparisons are inherently unsafe. Comparisons of FP done under > an epsilon resolution difference. > affects ConstantScorePrefixQuery#toString() and #equals() -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Issue Comment Edited: (SOLR-1033) DIH transformers cannot reuse output from previous transformations
[ https://issues.apache.org/jira/browse/SOLR-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675969#action_12675969 ] fergus edited comment on SOLR-1033 at 2/23/09 3:06 PM: - Your comment about modifying the system "to have the current entities rows be available always to all transformers" is good and will produce the fastest, most efficient code. But I need to be sure we are not using the term "template" twice in different ways. You say "you are constructing the regex replacements themselves with templates" by which you mean using the ${XXX} syntax and not the output from a templatetransformer? Anyway I have backed out my patch and applied yours. Everything seems fine, but I am still testing. Thanks very much. was (Author: fergus): Your comment about modifying the system "to have the current entities rows be available always to all transformers" is good and will produce the fastest, most efficient code. But I need to be sure we are not using the term "template" twice in different ways. You say "you are constructing the regex replacements themselves with templates" by which you mean using the ${XXX} syntax and not the output from a templatetransformer? Also, is your patch a replacement for mine? > DIH transformers cannot reuse output from previous transformations > -- > > Key: SOLR-1033 > URL: https://issues.apache.org/jira/browse/SOLR-1033 > Project: Solr > Issue Type: Improvement > Components: contrib - DataImportHandler >Affects Versions: 1.4 > Environment: All operating systems and software platforms >Reporter: Fergus McMenemie > Fix For: 1.4 > > Attachments: SOLR-1033.patch, SOLR-1033.patch, SOLR-1033.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > It can be very useful to reuse the output from a DIH template in other > templates and or regex transformers. Currently this cannot be done. The > resolver is initialized at the start of the transformer run with what ever > values exist for a column name at that instant. As the transformer executes > it may define new values for column names. My change is intended to update > the hash used by the resolver after each successful transformation. > This only applies to the template and regex transformers.
[jira] Commented: (SOLR-799) Add support for hash based exact/near duplicate document handling
[ https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676041#action_12676041 ] Hoss Man commented on SOLR-799: ---

The separation of concerns between schema.xml and solrconfig.xml has always been...

* schema.xml: what is the data, what is its nature, what are its intrinsic properties?
* solrconfig.xml: what can people do with your data, how can they use it?

fields, fieldTypes, analyzers, copyFields go in the schema.xml because they are (in theory) intrinsic to the nature of your data regardless of where a given document comes from:

* documents should only have one author
* categoryName should always be tokenized in a particular way
* prices need to sort numerically, not lexicographically
* any text indexed in the shortSummary field should also be indexed in the searchableAbstract field
* etc...

request handlers that dictate how people can use the data are specified in solrconfig.xml -- when searching data, request handlers (which may leverage search components) dictate what a user is allowed to get/see; when modifying an index, request handlers (which may leverage update processors) dictate what data is allowed to come from various sources and in what formats.

In short: as far as document indexing goes, the options configured in solrconfig.xml specify how to "build up" a Document object from user input, while the options in schema.xml specify how to "tear it down" into its individual terms and values for indexing.

With the near duplicate detection code, it is the schema's job to say which fields can exist in the input documents, including a signature field -- but it is the solrconfig's job to decide how to compute that signature field ... after all: the computation might be different depending on the source of the data (ie: different processor chains could be configured for different request handlers)

> Add support for hash based exact/near duplicate document handling > - > > Key: SOLR-799 > URL: https://issues.apache.org/jira/browse/SOLR-799 > Project: Solr > Issue Type: New Feature > Components: update >Reporter: Mark Miller >Assignee: Yonik Seeley >Priority: Minor > Fix For: 1.4 > > Attachments: SOLR-799.patch, SOLR-799.patch, SOLR-799.patch, > SOLR-799.patch > > > Hash based duplicate document detection is efficient and allows for blocking > as well as field collapsing. Let's put it into Solr. > http://wiki.apache.org/solr/Deduplication
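To make "compute that signature field" concrete, here is a small stand-alone example of an exact-duplicate signature: a hash over a chosen subset of a document's fields. It is a sketch under assumptions, not the SOLR-799 patch: the class and method names are invented, the document is modelled as a plain map, and MD5 is used only because every Java platform ships it. Near-duplicate detection would need a fuzzier hash than this.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative exact-match signature over selected fields; not Solr code.
final class SignatureSketch {

    // Hashes the named fields in the given order; other fields are ignored.
    static String signature(Map<String, String> doc, List<String> signatureFields) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            for (String field : signatureFields) {
                String value = doc.get(field);
                if (value != null) {
                    md.update(field.getBytes(StandardCharsets.UTF_8));
                    md.update((byte) 0); // separator so "ab"+"c" differs from "a"+"bc"
                    md.update(value.getBytes(StandardCharsets.UTF_8));
                    md.update((byte) 0);
                }
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> doc = new HashMap<>();
        doc.put("id", "1");
        doc.put("title", "Hello");
        doc.put("body", "Same text");
        System.out.println(signature(doc, Arrays.asList("title", "body")));
    }
}
```

Which fields feed the hash is exactly the kind of choice the comment above assigns to solrconfig.xml: two request handlers could pass different field lists for data from different sources, while the schema only needs to declare that a signature field exists.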
[jira] Commented: (SOLR-1033) DIH transformers cannot reuse output from previous transformations
[ https://issues.apache.org/jira/browse/SOLR-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675969#action_12675969 ] Fergus McMenemie commented on SOLR-1033: Your comment about modifying the system "to have the current entities rows be available always to all transformers" is good and will produce the fastest, most efficient code. But I need to be sure we are not using the term "template" twice in different ways. You say "you are constructing the regex replacements themselves with templates" by which you mean using the ${XXX} syntax and not the output from a templatetransformer? Also, is your patch a replacement for mine? > DIH transformers cannot reuse output from previous transformations > -- > > Key: SOLR-1033 > URL: https://issues.apache.org/jira/browse/SOLR-1033 > Project: Solr > Issue Type: Improvement > Components: contrib - DataImportHandler >Affects Versions: 1.4 > Environment: All operating systems and software platforms >Reporter: Fergus McMenemie > Fix For: 1.4 > > Attachments: SOLR-1033.patch, SOLR-1033.patch, SOLR-1033.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > It can be very useful to reuse the output from a DIH template in other > templates and or regex transformers. Currently this cannot be done. The > resolver is initialized at the start of the transformer run with what ever > values exist for a column name at that instant. As the transformer executes > it may define new values for column names. My change is intended to update > the hash used by the resolver after each successful transformation. > This only applies to the template and regex transformers.
[jira] Commented: (SOLR-1019) ConstantScorePrefixQuery.java - Absolute FP comparison on getBoost()
[ https://issues.apache.org/jira/browse/SOLR-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675963#action_12675963 ] Yonik Seeley commented on SOLR-1019: Absolute comparisons aren't good for the result of floating point calculations, but things like boosts are pretty much always set from constants. For the very rare case that these constants are calculated, it should be up to the code doing the calculation to round or normalize if that is desired. In any case, the failure mode in this extremely unlikely event is still graceful - everything still works but the cache hit ratio goes down. Another point: from the Lucene point of view, a boost of 0.0 and a boost of .001 are *very* different - lucene screens out scores <=0.0f (although Solr does not). > ConstantScorePrefixQuery.java - Absolute FP comparison on getBoost() > - > > Key: SOLR-1019 > URL: https://issues.apache.org/jira/browse/SOLR-1019 > Project: Solr > Issue Type: Bug > Components: clients - java >Affects Versions: 1.3 > Environment: Java 6, Tomcat 6 >Reporter: Kay Kay > Attachments: SOLR-1019.patch > > > Absolute FP comparisons are inherently unsafe. Comparisons of FP done under > an epsilon resolution difference. > affects ConstantScorePrefixQuery#toString() and #equals() -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
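The distinction drawn in the comment above, between values set from constants and values that come out of a calculation, can be checked directly. The snippet is a generic Java illustration with invented names, not code from ConstantScorePrefixQuery; the rounding helper is one possible way for calculating code to "round or normalize" before handing a value over.

```java
// Generic illustration of exact floating-point comparison; not Solr code.
final class BoostComparisonSketch {

    // Exact comparison, as used when two queries are checked for equality.
    static boolean sameBoost(float a, float b) {
        return a == b;
    }

    // One way for calculating code to normalize a value: round to 3 decimals.
    static double roundTo3(double value) {
        return Math.round(value * 1000.0) / 1000.0;
    }

    public static void main(String[] args) {
        // Boosts that originate from the same constant compare equal.
        System.out.println(sameBoost(1.5f, Float.parseFloat("1.5")));

        // A calculated value may differ from the constant it "should" equal.
        double calculated = 0.1 + 0.2;
        System.out.println(calculated == 0.3);

        // Normalizing in the calculating code restores exact equality.
        System.out.println(roundTo3(calculated) == 0.3);

        // Small differences stay visible, as the comment above wants for 0.0 vs .001.
        System.out.println(sameBoost(0.0f, 0.001f));
    }
}
```

If a calculated boost does slip through un-normalized, the consequence described above is only that two logically identical queries fail to compare equal, so a cached result is missed rather than a wrong result returned.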
[jira] Updated: (SOLR-1033) DIH transformers cannot reuse output from previous transformations
[ https://issues.apache.org/jira/browse/SOLR-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-1033: - Attachment: SOLR-1033.patch This should help all other transformers implicitly support templating > DIH transformers cannot reuse output from previous transformations > -- > > Key: SOLR-1033 > URL: https://issues.apache.org/jira/browse/SOLR-1033 > Project: Solr > Issue Type: Improvement > Components: contrib - DataImportHandler >Affects Versions: 1.4 > Environment: All operating systems and software platforms >Reporter: Fergus McMenemie > Fix For: 1.4 > > Attachments: SOLR-1033.patch, SOLR-1033.patch, SOLR-1033.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > It can be very useful to reuse the output from a DIH template in other > templates and or regex transformers. Currently this cannot be done. The > resolver is initialized at the start of the transformer run with what ever > values exist for a column name at that instant. As the transformer executes > it may define new values for column names. My change is intended to update > the hash used by the resolver after each successful transformation. > This only applies to the template and regex transformers. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Issue Comment Edited: (SOLR-1033) DIH transformers cannot reuse output from previous transformations
[ https://issues.apache.org/jira/browse/SOLR-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675953#action_12675953 ] noble.paul edited comment on SOLR-1033 at 2/23/09 7:54 AM: --- OK, I see your point. You are constructing the regex replacements themselves with templates. I missed that. I am wondering if the system can be modified to have the current entities rows be available always to all transformers. It can be done as a simple change in EntityProcessorBase#applyTransformers. was (Author: noble.paul): OK, I see your point. You are constructing the regex replacements themselves with templates. I missed that. > DIH transformers cannot reuse output from previous transformations > -- > > Key: SOLR-1033 > URL: https://issues.apache.org/jira/browse/SOLR-1033 > Project: Solr > Issue Type: Improvement > Components: contrib - DataImportHandler >Affects Versions: 1.4 > Environment: All operating systems and software platforms >Reporter: Fergus McMenemie > Fix For: 1.4 > > Attachments: SOLR-1033.patch, SOLR-1033.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > It can be very useful to reuse the output from a DIH template in other > templates and or regex transformers. Currently this cannot be done. The > resolver is initialized at the start of the transformer run with what ever > values exist for a column name at that instant. As the transformer executes > it may define new values for column names. My change is intended to update > the hash used by the resolver after each successful transformation. > This only applies to the template and regex transformers.
[jira] Commented: (SOLR-1033) DIH transformers cannot reuse output from previous transformations
[ https://issues.apache.org/jira/browse/SOLR-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675953#action_12675953 ] Noble Paul commented on SOLR-1033: -- OK, I see your point: you are constructing the regex replacements themselves with templates. I missed that. > DIH transformers cannot reuse output from previous transformations > -- > > Key: SOLR-1033 > URL: https://issues.apache.org/jira/browse/SOLR-1033 > Project: Solr > Issue Type: Improvement > Components: contrib - DataImportHandler >Affects Versions: 1.4 > Environment: All operating systems and software platforms >Reporter: Fergus McMenemie > Fix For: 1.4 > > Attachments: SOLR-1033.patch, SOLR-1033.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > It can be very useful to reuse the output from a DIH template in other > templates and or regex transformers. Currently this cannot be done. The > resolver is initialized at the start of the transformer run with what ever > values exist for a column name at that instant. As the transformer executes > it may define new values for column names. My change is intended to update > the hash used by the resolver after each successful transformation. > This only applies to the template and regex transformers. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1019) ConstantScorePrefixQuery.java - Absolute FP comparison on getBoost()
[ https://issues.apache.org/jira/browse/SOLR-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675945#action_12675945 ] Kay Kay commented on SOLR-1019: --- I believe either of the scenarios in which the fix might be invalid would be related to the value of epsilon. If we keep the value of epsilon small enough (Double.MIN_VALUE) we might get around the issue. The bottom line is that absolute FP comparisons can result in unpredictable behavior depending on the underlying implementation, which is something we want to avoid. > ConstantScorePrefixQuery.java - Absolute FP comparison on getBoost() > - > > Key: SOLR-1019 > URL: https://issues.apache.org/jira/browse/SOLR-1019 > Project: Solr > Issue Type: Bug > Components: clients - java >Affects Versions: 1.3 > Environment: Java 6, Tomcat 6 >Reporter: Kay Kay > Attachments: SOLR-1019.patch > > > Absolute FP comparisons are inherently unsafe. FP comparisons should be done > within an epsilon tolerance. > Affects ConstantScorePrefixQuery#toString() and #equals() -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
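To make the epsilon comparison discussed in SOLR-1019 concrete, here is a minimal sketch in Java. It is not the attached patch: the class name `BoostCompare`, the helper `nearlyEqual`, and the tolerance `EPSILON` are all hypothetical and chosen only for illustration. One caveat worth keeping in mind when picking the tolerance: `Double.MIN_VALUE` is roughly 4.9e-324, smaller than the gap between any two distinct `float` values, so using it as epsilon would likely behave the same as an exact comparison.

```java
final class BoostCompare {
    // Hypothetical tolerance, picked for illustration; a real fix would
    // need a value justified by how boosts are actually produced.
    static final float EPSILON = 1e-6f;

    // Returns true when a and b differ by at most eps.
    // NaN never compares equal, matching the usual IEEE 754 expectation.
    static boolean nearlyEqual(float a, float b, float eps) {
        if (a == b) {
            return true; // exact match, including equal infinities
        }
        return Math.abs(a - b) <= eps;
    }

    public static void main(String[] args) {
        // Accumulate a boost through repeated additions, then compare it
        // against the value a caller might expect.
        float boost = 0.0f;
        for (int i = 0; i < 10; i++) {
            boost += 0.1f;
        }
        System.out.println("exact comparison:   " + (boost == 1.0f));
        System.out.println("epsilon comparison: " + nearlyEqual(boost, 1.0f, EPSILON));
    }
}
```

An absolute tolerance like this only suits values of similar magnitude, such as boosts near 1.0; a relative tolerance would be the usual alternative for wider ranges.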
[jira] Commented: (SOLR-1033) DIH transformers cannot reuse output from previous transformations
[ https://issues.apache.org/jira/browse/SOLR-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675902#action_12675902 ] Fergus McMenemie commented on SOLR-1033: Not sure I am following what you say. If I number the different steps in my example entity as follows:- {code} 1 2 3 4 5 {code} We see that column 5 involves a regex which in turn involves columns 3 and 2. Column 3 is itself a regex. We therefore have the output from one regex being used within another regex. So as far as I can see we need the fix made to both the TemplateTransformer and the RegexTransformer. > DIH transformers cannot reuse output from previous transformations > -- > > Key: SOLR-1033 > URL: https://issues.apache.org/jira/browse/SOLR-1033 > Project: Solr > Issue Type: Improvement > Components: contrib - DataImportHandler >Affects Versions: 1.4 > Environment: All operating systems and software platforms >Reporter: Fergus McMenemie > Fix For: 1.4 > > Attachments: SOLR-1033.patch, SOLR-1033.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > It can be very useful to reuse the output from a DIH template in other > templates and or regex transformers. Currently this cannot be done. The > resolver is initialized at the start of the transformer run with what ever > values exist for a column name at that instant. As the transformer executes > it may define new values for column names. My change is intended to update > the hash used by the resolver after each successful transformation. > This only applies to the template and regex transformers. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
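The chaining described in SOLR-1033 (a later column reusing the output of an earlier one) can be sketched as follows. This is a simplified stand-in, not DataImportHandler code: the class `ChainedTemplates`, the methods `resolve` and `applyTemplates`, the column names, and the `images/` path are all assumptions made for illustration. The point it shows is that each result is written back into the row before the next template is resolved, so the resolver always sees the current values.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ChainedTemplates {
    // Matches placeholders of the form ${e.columnName}.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{e\\.(\\w+)\\}");

    // Replaces every placeholder with the current value from the row;
    // columns that are not present resolve to the empty string.
    static String resolve(String template, Map<String, Object> row) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            Object value = row.get(m.group(1));
            String text = (value == null) ? "" : value.toString();
            m.appendReplacement(out, Matcher.quoteReplacement(text));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Applies templates in declaration order. Writing each result back
    // into the row is what lets later templates reuse earlier output.
    static void applyTemplates(Map<String, String> templates, Map<String, Object> row) {
        for (Map.Entry<String, String> t : templates.entrySet()) {
            row.put(t.getKey(), resolve(t.getValue(), row));
        }
    }

    public static void main(String[] args) {
        Map<String, String> templates = new LinkedHashMap<>();
        templates.put("thumb", "s${e.vurl}.jpg");          // step 1
        templates.put("thumbPath", "images/${e.thumb}");   // step 2 reuses step 1

        Map<String, Object> row = new HashMap<>();
        row.put("vurl", "img001");

        applyTemplates(templates, row);
        System.out.println(row.get("thumbPath"));
    }
}
```

If the resolver were instead built once from a snapshot of the row taken before the loop, `${e.thumb}` would resolve to nothing in the second step, which is the behavior the issue reports.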
Re: SpellCheckComponent build from search
On Mon, Feb 23, 2009 at 10:47 AM, Shalin Shekhar Mangar < shalinman...@gmail.com> wrote: > You may only need to add a parameter for the query and override the init > and build method in IndexBasedSpellChecker. Along with a custom Dictionary implementation. -- Regards, Shalin Shekhar Mangar.
[jira] Commented: (SOLR-739) Add support for OmitTf
[ https://issues.apache.org/jira/browse/SOLR-739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675882#action_12675882 ] Mark Miller commented on SOLR-739: -- Good point, will do. - Mark > Add support for OmitTf > -- > > Key: SOLR-739 > URL: https://issues.apache.org/jira/browse/SOLR-739 > Project: Solr > Issue Type: New Feature >Reporter: Mark Miller >Priority: Minor > Fix For: 1.4 > > Attachments: SOLR-739.patch, SOLR-739.patch > > > Allow setting omitTf in the field schema. Default to true for all but text > fields. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-844) A SolrServer impl to front-end multiple urls
[ https://issues.apache.org/jira/browse/SOLR-844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675864#action_12675864 ] Shalin Shekhar Mangar commented on SOLR-844: So what's the verdict on this one? Seems like most concerns were taken care of. Anything else we should do to get this committed? > A SolrServer impl to front-end multiple urls > > > Key: SOLR-844 > URL: https://issues.apache.org/jira/browse/SOLR-844 > Project: Solr > Issue Type: New Feature > Components: clients - java >Affects Versions: 1.3 >Reporter: Noble Paul >Assignee: Shalin Shekhar Mangar > Fix For: 1.4 > > Attachments: SOLR-844.patch, SOLR-844.patch, SOLR-844.patch, > SOLR-844.patch, SOLR-844.patch > > > Currently a {{CommonsHttpSolrServer}} can talk to only one server. This > demands that the user have a load balancer or do the round-robin on their own. > We must have a {{LBHttpSolrServer}} which automatically does > load balancing between multiple hosts. This can be backed by the > {{CommonsHttpSolrServer}} > This can have the following other features > * Automatic failover > * Optionally take in a file/URL containing the URLs of servers so that > the server list can be automatically updated by periodically loading the > config > * Support for adding/removing servers during runtime > * Pluggable load-balancing mechanism (round-robin, weighted round-robin, > random etc.) > * Pluggable failover mechanisms -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
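The round-robin and failover features listed in SOLR-844 could look roughly like the sketch below. It is not the proposed {{LBHttpSolrServer}} and does not use any SolrJ API: the class `RoundRobinServers` and its methods `add`, `markDown`, `markUp`, and `next` are hypothetical names, and server health is tracked by hand rather than by real pings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

final class RoundRobinServers {
    private final List<String> urls = new CopyOnWriteArrayList<>();
    private final Set<String> down = ConcurrentHashMap.newKeySet();
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinServers(List<String> initial) {
        urls.addAll(initial);
    }

    // Servers can be added while requests are in flight.
    void add(String url) {
        urls.add(url);
    }

    // Failover bookkeeping: a caller marks a server down after a failed
    // request and up again once it responds.
    void markDown(String url) {
        down.add(url);
    }

    void markUp(String url) {
        down.remove(url);
    }

    // Returns the next live server in rotation, skipping servers that are
    // marked down. Throws if every server is down.
    String next() {
        List<String> snapshot = new ArrayList<>(urls);
        int n = snapshot.size();
        for (int i = 0; i < n; i++) {
            int index = Math.floorMod(counter.getAndIncrement(), n);
            String candidate = snapshot.get(index);
            if (!down.contains(candidate)) {
                return candidate;
            }
        }
        throw new IllegalStateException("no live servers");
    }

    public static void main(String[] args) {
        List<String> initial = new ArrayList<>();
        initial.add("http://host1:8983/solr"); // hypothetical URLs
        initial.add("http://host2:8983/solr");
        RoundRobinServers servers = new RoundRobinServers(initial);
        servers.markDown("http://host2:8983/solr");
        System.out.println(servers.next());
        System.out.println(servers.next());
    }
}
```

A pluggable strategy, as the issue suggests, would presumably replace the body of `next()` with an interface so that weighted or random selection can be swapped in.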
[jira] Commented: (SOLR-1033) DIH transformers cannot reuse output from previous transformations
[ https://issues.apache.org/jira/browse/SOLR-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675860#action_12675860 ] Noble Paul commented on SOLR-1033: -- If I am not wrong, the output of one transformation in RegexTransformer is available in the next transformation, because the value is added to the same row object. So it should work once the TemplateTransformer is fixed. > DIH transformers cannot reuse output from previous transformations > -- > > Key: SOLR-1033 > URL: https://issues.apache.org/jira/browse/SOLR-1033 > Project: Solr > Issue Type: Improvement > Components: contrib - DataImportHandler >Affects Versions: 1.4 > Environment: All operating systems and software platforms >Reporter: Fergus McMenemie > Fix For: 1.4 > > Attachments: SOLR-1033.patch, SOLR-1033.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > It can be very useful to reuse the output from a DIH template in other > templates and or regex transformers. Currently this cannot be done. The > resolver is initialized at the start of the transformer run with what ever > values exist for a column name at that instant. As the transformer executes > it may define new values for column names. My change is intended to update > the hash used by the resolver after each successful transformation. > This only applies to the template and regex transformers. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-973) CommonsHttpSolrServer must write the xml directly to the server
[ https://issues.apache.org/jira/browse/SOLR-973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-973: Attachment: SOLR-973.patch The XML is written directly to the server, and the same applies to the binary update. > CommonsHttpSolrServer must write the xml directly to the server > > > Key: SOLR-973 > URL: https://issues.apache.org/jira/browse/SOLR-973 > Project: Solr > Issue Type: Improvement > Components: clients - java >Reporter: Noble Paul >Priority: Minor > Attachments: SOLR-973.patch > > > CommonsHttpSolrServer creates the whole xml in memory before it writes it to > the server. It effectively consumes 3 times the string size in memory before > writing it to the server. > It can be directly written to the server and we can get faster upload. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
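The idea in SOLR-973, writing the update XML straight to the output stream instead of first building one large string, can be illustrated with the sketch below. It is not the attached patch and does not touch CommonsHttpSolrServer: the class `StreamingXmlWriter`, the methods `writeAdd` and `escape`, and the use of plain maps as documents are assumptions for illustration. A `ByteArrayOutputStream` stands in for the HTTP request body.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.List;
import java.util.Map;

final class StreamingXmlWriter {
    // Writes an <add> message document by document. Only the writer's
    // buffer is held in memory, never the full XML string.
    static void writeAdd(List<Map<String, String>> docs, OutputStream out) throws IOException {
        Writer w = new BufferedWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8));
        w.write("<add>");
        for (Map<String, String> doc : docs) {
            w.write("<doc>");
            for (Map.Entry<String, String> field : doc.entrySet()) {
                w.write("<field name=\"");
                w.write(escape(field.getKey()));
                w.write("\">");
                w.write(escape(field.getValue()));
                w.write("</field>");
            }
            w.write("</doc>");
        }
        w.write("</add>");
        w.flush(); // the caller owns the stream and decides when to close it
    }

    // Minimal XML escaping for text and attribute values.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream(); // stand-in for the request body
        writeAdd(Collections.singletonList(Collections.singletonMap("id", "1 & 2")), out);
        System.out.println(new String(out.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

With a real HTTP client the stream would be the request entity's output stream, so each document leaves the process shortly after it is serialized.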
[jira] Issue Comment Edited: (SOLR-1033) DIH transformers cannot reuse output from previous transformations
[ https://issues.apache.org/jira/browse/SOLR-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675856#action_12675856 ] fergus edited comment on SOLR-1033 at 2/23/09 3:33 AM: - OK here goes. My document contains references to embedded imagery. For each image there is the image itself along with a thumbnail and caption. The source document contains:- I have a search application that searches only the captions associated with a given image. It would be nice to populate Solr fields with the correct relative path to each image and thumbnail at index time. The problem arises in that although the thumbnail is: s${e.vurl}.jpg The name of the image itself varies depending on the first letter of the image type imageType! It could be one of 'picture' 'graphic' 'lineDrawing' or 'map'. ie:- p${e.vurl}.jpg g${e.vurl}.jpg l${e.vurl}.jpg m${e.vurl}.jpg My patch would allow the following sort of thing to be added to a data-config. I feel this considerably increases its power and usefulness. {code} {code} was (Author: fergus): OK here goes. My document contains references to embedded imagery. For each image there is the image itself along with a thumbnail and caption. The source document contains:- I have a search application that searches only the captions associated with a given image. It would be nice to populate Solr fields with the correct relative path to each image and thumbnail at index time. The problem arises in that although the thumbnail is: s${e.vurl}.jpg The name of the image itself varies depending on the first letter of the image type imageType! It could be one of 'picture' 'graphic' 'lineDrawing' or 'map'. ie:- p${e.vurl}.jpg g${e.vurl}.jpg l${e.vurl}.jpg m${e.vurl}.jpg My patch would allow the following sort of thing to be added to a data-config. I feel this considerably increases its power and usefulness. 
{{code}} {{code}} > DIH transformers cannot reuse output from previous transformations > -- > > Key: SOLR-1033 > URL: https://issues.apache.org/jira/browse/SOLR-1033 > Project: Solr > Issue Type: Improvement > Components: contrib - DataImportHandler >Affects Versions: 1.4 > Environment: All operating systems and software platforms >Reporter: Fergus McMenemie > Fix For: 1.4 > > Attachments: SOLR-1033.patch, SOLR-1033.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > It can be very useful to reuse the output from a DIH template in other > templates and or regex transformers. Currently this cannot be done. The > resolver is initialized at the start of the transformer run with what ever > values exist for a column name at that instant. As the transformer executes > it may define new values for column names. My change is intended to update > the hash used by the resolver after each successful transformation. > This only applies to the template and regex transformers. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1033) DIH transformers cannot reuse output from previous transformations
[ https://issues.apache.org/jira/browse/SOLR-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675856#action_12675856 ] Fergus McMenemie commented on SOLR-1033: OK here goes. My document contains references to embedded imagery. For each image there is the image itself along with a thumbnail and caption. The source document contains:- I have a search application that searches only the captions associated with a given image. It would be nice to populate Solr fields with the correct relative path to each image and thumbnail at index time. The problem arises in that although the thumbnail is: s${e.vurl}.jpg The name of the image itself varies depending on the first letter of the image type imageType! It could be one of 'picture' 'graphic' 'lineDrawing' or 'map'. ie:- p${e.vurl}.jpg g${e.vurl}.jpg l${e.vurl}.jpg m${e.vurl}.jpg My patch would allow the following sort of thing to be added to a data-config. I feel this considerably increases its power and usefulness. {{code}} {{code}} > DIH transformers cannot reuse output from previous transformations > -- > > Key: SOLR-1033 > URL: https://issues.apache.org/jira/browse/SOLR-1033 > Project: Solr > Issue Type: Improvement > Components: contrib - DataImportHandler >Affects Versions: 1.4 > Environment: All operating systems and software platforms >Reporter: Fergus McMenemie > Fix For: 1.4 > > Attachments: SOLR-1033.patch, SOLR-1033.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > It can be very useful to reuse the output from a DIH template in other > templates and or regex transformers. Currently this cannot be done. The > resolver is initialized at the start of the transformer run with what ever > values exist for a column name at that instant. As the transformer executes > it may define new values for column names. My change is intended to update > the hash used by the resolver after each successful transformation. 
> This only applies to the template and regex transformers. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1033) DIH transformers cannot reuse output from previous transformations
[ https://issues.apache.org/jira/browse/SOLR-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675808#action_12675808 ] Noble Paul commented on SOLR-1033: -- bq.Sure. However I need a little help. What is it I need to do? A simple usecase with an example which demonstrates the feature . The TemplateTransformer example you provided was self explanatory. If you can give a similar one that is more than sufficient. > DIH transformers cannot reuse output from previous transformations > -- > > Key: SOLR-1033 > URL: https://issues.apache.org/jira/browse/SOLR-1033 > Project: Solr > Issue Type: Improvement > Components: contrib - DataImportHandler >Affects Versions: 1.4 > Environment: All operating systems and software platforms >Reporter: Fergus McMenemie > Fix For: 1.4 > > Attachments: SOLR-1033.patch, SOLR-1033.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > It can be very useful to reuse the output from a DIH template in other > templates and or regex transformers. Currently this cannot be done. The > resolver is initialized at the start of the transformer run with what ever > values exist for a column name at that instant. As the transformer executes > it may define new values for column names. My change is intended to update > the hash used by the resolver after each successful transformation. > This only applies to the template and regex transformers. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.