date:20090223

[jira] Resolved: (SOLR-1010) Relative instanceDir is evaluated relative to current working directory

2009-02-23 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-1010.
-

Resolution: Invalid

> Relative instanceDir is evaluated relative to current working directory
> ---
>
> Key: SOLR-1010
> URL: https://issues.apache.org/jira/browse/SOLR-1010
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 1.3
>Reporter: Shalin Shekhar Mangar
>Assignee: Shalin Shekhar Mangar
>Priority: Minor
> Fix For: 1.4
>
>
> If a relative instanceDir is provided in solr.xml, it should be evaluated 
> relative to solr.home instead of the current working directory.
> I guess people work around this bug right now by using absolute paths for 
> instanceDir.
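
The behaviour the report asks for can be sketched as below. This is only an illustration, not Solr's actual loading code: the class and method names are hypothetical, and the issue itself was resolved as Invalid.

```java
import java.io.File;

// Hypothetical helper, not part of Solr: shows the resolution rule the report describes.
final class InstanceDirResolver {
    // An absolute instanceDir is used as-is; a relative one is resolved
    // against solrHome rather than the JVM's current working directory.
    static File resolve(File solrHome, String instanceDir) {
        File dir = new File(instanceDir);
        return dir.isAbsolute() ? dir : new File(solrHome, instanceDir);
    }
}
```

By contrast, calling new File(instanceDir) on its own and then opening it resolves a relative path against the working directory, which is the behaviour the report complains about.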

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Resolved: (SOLR-1029) Standardize Evaluator parameter handling

2009-02-23 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-1029.
-

Resolution: Fixed

Committed revision 747291.

Thanks Noble!

> Standardize Evaluator parameter handling
> 
>
> Key: SOLR-1029
> URL: https://issues.apache.org/jira/browse/SOLR-1029
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - DataImportHandler
>Affects Versions: 1.3
>Reporter: Shalin Shekhar Mangar
>Assignee: Shalin Shekhar Mangar
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-1029.patch, SOLR-1029.patch
>
>
> The evaluator parameter syntax is not well defined and inconsistent in some 
> places. This issue aims to standardize on the following:
> # String parameters should be enclosed in a single quote
> # Variables can be written as-is
> # Numbers (integers or floats) can be written as-is
> We will enforce this among the existing evaluators and provide helper 
> functions for this parsing which future implementations can use.
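
The three rules above can be sketched as a small parser. This is a minimal illustration under my own assumptions, not the helper functions from the attached patch: the class name, method names, and the comma as argument separator are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of evaluator argument parsing; not the code from SOLR-1029.patch.
final class EvaluatorArgs {
    // Splits an argument list on commas that are not inside single quotes.
    static List<String> split(String expr) {
        List<String> out = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean inQuote = false;
        for (char c : expr.toCharArray()) {
            if (c == '\'') inQuote = !inQuote;
            if (c == ',' && !inQuote) {
                out.add(cur.toString().trim());
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        out.add(cur.toString().trim());
        return out;
    }

    // Classifies one token: single-quoted text is a STRING, anything that
    // parses as a number is a NUMBER, everything else is a VARIABLE.
    // Caveat: Double.parseDouble also accepts forms such as "NaN" or "1d".
    static String kind(String token) {
        if (token.length() >= 2 && token.startsWith("'") && token.endsWith("'")) {
            return "STRING";
        }
        try {
            Double.parseDouble(token);
            return "NUMBER";
        } catch (NumberFormatException e) {
            return "VARIABLE";
        }
    }
}
```

With this sketch, the argument list item.id, 'a, b', 1.5 splits into three tokens classified as a variable, a string, and a number.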


[jira] Commented: (SOLR-1029) Standardize Evaluator parameter handling

2009-02-23 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676207#action_12676207
 ] 

Shalin Shekhar Mangar commented on SOLR-1029:
-

I added a note on this for users upgrading from 1.3

> Standardize Evaluator parameter handling
> 
>
> Key: SOLR-1029
> URL: https://issues.apache.org/jira/browse/SOLR-1029
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - DataImportHandler
>Affects Versions: 1.3
>Reporter: Shalin Shekhar Mangar
>Assignee: Shalin Shekhar Mangar
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-1029.patch, SOLR-1029.patch
>
>
> The evaluator parameter syntax is not well defined and inconsistent in some 
> places. This issue aims to standardize on the following:
> # String parameters should be enclosed in a single quote
> # Variables can be written as-is
> # Numbers (integers or floats) can be written as-is
> We will enforce this among the existing evaluators and provide helper 
> functions for this parsing which future implementations can use.


[jira] Updated: (SOLR-1029) Standardize Evaluator parameter handling

2009-02-23 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-1029:


Attachment: SOLR-1029.patch

# Made the behavior backwards-compatible with old syntax.
# Added warning if old syntax is used.

I'll commit this shortly.

> Standardize Evaluator parameter handling
> 
>
> Key: SOLR-1029
> URL: https://issues.apache.org/jira/browse/SOLR-1029
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - DataImportHandler
>Affects Versions: 1.3
>Reporter: Shalin Shekhar Mangar
>Assignee: Shalin Shekhar Mangar
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-1029.patch, SOLR-1029.patch
>
>
> The evaluator parameter syntax is not well defined and inconsistent in some 
> places. This issue aims to standardize on the following:
> # String parameters should be enclosed in a single quote
> # Variables can be written as-is
> # Numbers (integers or floats) can be written as-is
> We will enforce this among the existing evaluators and provide helper 
> functions for this parsing which future implementations can use.


[jira] Resolved: (SOLR-1035) catch RuntimeException instead of NFE in FieldType line 187

2009-02-23 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-1035.
-

   Resolution: Fixed
Fix Version/s: 1.4
 Assignee: Shalin Shekhar Mangar

Committed revision 747284.

Thanks David!

> catch RuntimeException instead of NFE in FieldType line 187
> ---
>
> Key: SOLR-1035
> URL: https://issues.apache.org/jira/browse/SOLR-1035
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.4
>Reporter: David Smiley
>Assignee: Shalin Shekhar Mangar
>Priority: Trivial
> Fix For: 1.4
>
>   Original Estimate: 0.08h
>  Remaining Estimate: 0.08h
>
> This is a super-easy request for a Solr committer to do.  Look at line 187 of 
> FieldType.  It catches NumberFormatException (a subtype of RuntimeException) 
> when calling toInternal() in order to throw a SolrException that has useful 
> information -- namely the field being set and its value.  The caught 
> exception should be declared more broadly, as RuntimeException, since 
> problems other than NumberFormatException might occur.  An example is the 
> SolrException thrown by DateField when parsing fails. 


[jira] Created: (SOLR-1035) catch RuntimeException instead of NFE in FieldType line 187

2009-02-23 Thread David Smiley (JIRA)

catch RuntimeException instead of NFE in FieldType line 187
---

 Key: SOLR-1035
 URL: https://issues.apache.org/jira/browse/SOLR-1035
 Project: Solr
  Issue Type: Improvement
Affects Versions: 1.4
Reporter: David Smiley
Priority: Trivial


This is a super-easy request for a Solr committer to do.  Look at line 187 of 
FieldType.  It catches NumberFormatException (a subtype of RuntimeException) 
when calling toInternal() in order to throw a SolrException that has useful 
information -- namely the field being set and its value.  The caught 
exception should be declared more broadly, as RuntimeException, since 
problems other than NumberFormatException might occur.  An example is the 
SolrException thrown by DateField when parsing fails. 
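
The requested change can be sketched as follows. This is an illustration only, not the FieldType source: FieldException stands in for SolrException, and the Converter interface and message text are hypothetical.

```java
// Hypothetical stand-in types; the real code lives in Solr's FieldType.
final class FieldValueConverter {
    // Stand-in for SolrException so the sketch compiles on its own.
    static final class FieldException extends RuntimeException {
        FieldException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Stand-in for a field type's toInternal() conversion.
    interface Converter {
        String toInternal(String external);
    }

    static String convert(String field, String value, Converter converter) {
        try {
            return converter.toInternal(value);
        } catch (RuntimeException e) {
            // Catching RuntimeException rather than only NumberFormatException
            // means failures such as a date parse error also get the field
            // name and value attached.
            throw new FieldException(
                "Error while creating field '" + field + "' from value '" + value + "'", e);
        }
    }
}
```

The original exception is kept as the cause, so no diagnostic information is lost by widening the catch.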


Re: Improvement Request: exception setting field exception

2009-02-23 Thread Noble Paul നോബിള്‍ नोब्ळ्

Even if it is a lightweight task (unless it is a comment that is to be
changed) it is wise to open a JIRA issue.


On Tue, Feb 24, 2009 at 10:30 AM, David Smiley @MITRE.org
 wrote:
>
> This is a super-easy request for a Solr committer to do.  Look at line 187 of
> FieldType.  It catches NumberFormatException (a subtype of RuntimeException)
> when calling toInternal() in order to throw a SolrException that has useful
> information -- namely the field being set and its value.  The caught
> exception should be declared more broadly, as RuntimeException, since
> problems other than NumberFormatException might occur.  An example is the
> SolrException thrown by DateField when parsing fails.
>
> I was about to create a JIRA issue for this but thought it might be more
> lightweight to ask on this list.  If a committer would rather I submit to
> JIRA anyway then let me know.
>
> ~ David Smiley
> --
> View this message in context: 
> http://www.nabble.com/Improvement-Request%3A-exception-setting-field-exception-tp22175880p22175880.html
> Sent from the Solr - Dev mailing list archive at Nabble.com.
>
>



-- 
--Noble Paul

Improvement Request: exception setting field exception

2009-02-23 Thread David Smiley @MITRE.org


This is a super-easy request for a Solr committer to do.  Look at line 187 of
FieldType.  It catches NumberFormatException (a subtype of RuntimeException)
when calling toInternal() in order to throw a SolrException that has useful
information -- namely the field being set and its value.  The caught
exception should be declared more broadly, as RuntimeException, since
problems other than NumberFormatException might occur.  An example is the
SolrException thrown by DateField when parsing fails.

I was about to create a JIRA issue for this but thought it might be more
lightweight to ask on this list.  If a committer would rather I submit to
JIRA anyway then let me know.

~ David Smiley
-- 
View this message in context: 
http://www.nabble.com/Improvement-Request%3A-exception-setting-field-exception-tp22175880p22175880.html
Sent from the Solr - Dev mailing list archive at Nabble.com.

[jira] Updated: (SOLR-1033) DIH transformers cannot reuse output from previous transformations

2009-02-23 Thread Noble Paul (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-1033:
-

Attachment: SOLR-1033.patch

The complete patch. XPathEntityProcessor needed some rework.

> DIH transformers cannot reuse output from previous transformations
> --
>
> Key: SOLR-1033
> URL: https://issues.apache.org/jira/browse/SOLR-1033
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - DataImportHandler
>Affects Versions: 1.4
> Environment: All operating systems and software platforms
>Reporter: Fergus McMenemie
> Fix For: 1.4
>
> Attachments: SOLR-1033.patch, SOLR-1033.patch, SOLR-1033.patch, 
> SOLR-1033.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> It can be very useful to reuse the output from a DIH template in other 
> templates and/or regex transformers. Currently this cannot be done. The 
> resolver is initialized at the start of the transformer run with whatever 
> values exist for a column name at that instant. As the transformer executes 
> it may define new values for column names. My change is intended to update 
> the hash used by the resolver after each successful transformation.
> This only applies to the template and regex transformers.
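
The idea in the description, resolving each template against a row that already contains the output of earlier templates, can be sketched as below. This is a simplified illustration, not DataImportHandler code: the class name, the flat String-to-String row, and the treatment of unknown variables as empty strings are my assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; the real logic lives in DIH's transformers and resolver.
final class TemplateChain {
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

    // Replaces every ${name} in the template with the row's current value
    // (an empty string if the name is unknown, which is an assumption here).
    static String resolve(String template, Map<String, String> row) {
        Matcher m = VAR.matcher(template);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String value = row.getOrDefault(m.group(1), "");
            m.appendReplacement(sb, Matcher.quoteReplacement(value));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    // Applies templates in order, writing each result back into the row
    // before the next template is resolved, so later templates can reuse it.
    static Map<String, String> apply(Map<String, String> row, Map<String, String> templates) {
        Map<String, String> out = new LinkedHashMap<>(row);
        for (Map.Entry<String, String> t : templates.entrySet()) {
            out.put(t.getKey(), resolve(t.getValue(), out));
        }
        return out;
    }
}
```

If the row were snapshotted once before the loop instead of being updated inside it, a template referring to an earlier template's column would see no value, which is the limitation the issue describes.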


[jira] Commented: (SOLR-1033) DIH transformers cannot reuse output from previous transformations

2009-02-23 Thread Noble Paul (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676165#action_12676165
 ] 

Noble Paul commented on SOLR-1033:
--

bq.You say "you are constructing the regex replacements themselves with 
templates" by which you mean using the ${XXX} syntax and not the output from a 
templatetransformer?

When I said 'template', I meant any string with ${xxx} content. The 'template' 
attribute is the only value TemplateTransformer is interested in.

Any attribute value in DIH is potentially a template. Some are honoured and 
some are not. I hope we can make it work consistently across all of them.



> DIH transformers cannot reuse output from previous transformations
> --
>
> Key: SOLR-1033
> URL: https://issues.apache.org/jira/browse/SOLR-1033
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - DataImportHandler
>Affects Versions: 1.4
> Environment: All operating systems and software platforms
>Reporter: Fergus McMenemie
> Fix For: 1.4
>
> Attachments: SOLR-1033.patch, SOLR-1033.patch, SOLR-1033.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> It can be very useful to reuse the output from a DIH template in other 
> templates and/or regex transformers. Currently this cannot be done. The 
> resolver is initialized at the start of the transformer run with whatever 
> values exist for a column name at that instant. As the transformer executes 
> it may define new values for column names. My change is intended to update 
> the hash used by the resolver after each successful transformation.
> This only applies to the template and regex transformers.


[jira] Resolved: (SOLR-1034) ClientUtils.escapeQuery should escape ;

2009-02-23 Thread Ryan McKinley (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McKinley resolved SOLR-1034.
-

Resolution: Fixed

Committed revision 747260.

thanks brian

> ClientUtils.escapeQuery should escape ;
> ---
>
> Key: SOLR-1034
> URL: https://issues.apache.org/jira/browse/SOLR-1034
> Project: Solr
>  Issue Type: Bug
>  Components: clients - java
>Affects Versions: 1.4
> Environment: all
>Reporter: Brian Whitman
>Assignee: Ryan McKinley
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-ESCAPE.patch
>
>
> The ClientUtils escapeQueryChars does not escape a ; symbol. This causes some 
> unexpected lexical errors when parsing query strings with ; in them.
> See
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200902.mbox/%3cdbd9700a0902231722n5db79dc0u7363603f930be...@mail.gmail.com%3e


[jira] Assigned: (SOLR-1034) ClientUtils.escapeQuery should escape ;

2009-02-23 Thread Ryan McKinley (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McKinley reassigned SOLR-1034:
---

Assignee: Ryan McKinley

> ClientUtils.escapeQuery should escape ;
> ---
>
> Key: SOLR-1034
> URL: https://issues.apache.org/jira/browse/SOLR-1034
> Project: Solr
>  Issue Type: Bug
>  Components: clients - java
>Affects Versions: 1.4
> Environment: all
>Reporter: Brian Whitman
>Assignee: Ryan McKinley
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-ESCAPE.patch
>
>
> The ClientUtils escapeQueryChars does not escape a ; symbol. This causes some 
> unexpected lexical errors when parsing query strings with ; in them.
> See
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200902.mbox/%3cdbd9700a0902231722n5db79dc0u7363603f930be...@mail.gmail.com%3e


[jira] Updated: (SOLR-1034) ClientUtils.escapeQuery should escape ;

2009-02-23 Thread Brian Whitman (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brian Whitman updated SOLR-1034:


Attachment: SOLR-ESCAPE.patch

Patch to escape ;

> ClientUtils.escapeQuery should escape ;
> ---
>
> Key: SOLR-1034
> URL: https://issues.apache.org/jira/browse/SOLR-1034
> Project: Solr
>  Issue Type: Bug
>  Components: clients - java
>Affects Versions: 1.4
> Environment: all
>Reporter: Brian Whitman
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-ESCAPE.patch
>
>
> The ClientUtils escapeQueryChars does not escape a ; symbol. This causes some 
> unexpected lexical errors when parsing query strings with ; in them.
> See
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200902.mbox/%3cdbd9700a0902231722n5db79dc0u7363603f930be...@mail.gmail.com%3e


[jira] Created: (SOLR-1034) ClientUtils.escapeQuery should escape ;

2009-02-23 Thread Brian Whitman (JIRA)

ClientUtils.escapeQuery should escape ;
---

 Key: SOLR-1034
 URL: https://issues.apache.org/jira/browse/SOLR-1034
 Project: Solr
  Issue Type: Bug
  Components: clients - java
Affects Versions: 1.4
 Environment: all
Reporter: Brian Whitman
Priority: Minor
 Fix For: 1.4


The ClientUtils escapeQueryChars does not escape a ; symbol. This causes some 
unexpected lexical errors when parsing query strings with ; in them.

See

http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200902.mbox/%3cdbd9700a0902231722n5db79dc0u7363603f930be...@mail.gmail.com%3e
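
A sketch of an escaping routine that includes ; is shown below. It is an illustration only, not the contents of SOLR-ESCAPE.patch: the class name is hypothetical, and the set of special characters is my assumption, modelled on the Lucene query syntax with ; added.

```java
// Hypothetical sketch; not the actual ClientUtils implementation.
final class QueryEscaper {
    // Characters assumed to be special to the query parser, plus ';'.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&;";

    // Prefixes every special character and every whitespace character
    // with a backslash so the parser treats it literally.
    static String escapeQueryChars(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (SPECIAL.indexOf(c) >= 0 || Character.isWhitespace(c)) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }
}
```

With this sketch, a value such as a;b is sent as a\;b, so the ; is no longer interpreted by the query parser.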





[jira] Commented: (SOLR-844) A SolrServer impl to front-end multiple urls

2009-02-23 Thread Hoss Man (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676136#action_12676136
 ] 

Hoss Man commented on SOLR-844:
---

I still think the javadocs should be beefed up a bit... at a minimum the info 
from "When to use this ?", "How does the Load Balancing happen ?", and "How 
does it know if a server has come back up ?" sections of the wiki page Noble 
made should be in the class level javadocs ... some of the "How does it 
know..." info made it into the javadocs for setAliveCheckInterval, but 
considering how important that method is there should be a ref to it in the 
class docs -- and there *definitely* needs to be some explicit mention of the 
"The ping is done not in a separate thread, it is done in a thread which made a 
normal request." fact, i would *never* have guessed that looking at the public 
docs in the class.

And as I mentioned before: if there are concerns that this class will be 
misused (and it certainly seems like there are) then it really needs to contain 
javadocs explaining when it *doesn't* make sense to use it and some alternative 
suggestions (if nothing else, a link to wikipedia: 
[http://en.wikipedia.org/wiki/Load_balancing_(computing)])


> A SolrServer impl to front-end multiple urls
> 
>
> Key: SOLR-844
> URL: https://issues.apache.org/jira/browse/SOLR-844
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Affects Versions: 1.3
>Reporter: Noble Paul
>Assignee: Shalin Shekhar Mangar
> Fix For: 1.4
>
> Attachments: SOLR-844.patch, SOLR-844.patch, SOLR-844.patch, 
> SOLR-844.patch, SOLR-844.patch
>
>
> Currently a {{CommonsHttpSolrServer}} can talk to only one server. This 
> demands that the user have a LoadBalancer or do the roundrobin on their own. 
> We must have a {{LBHttpSolrServer}} which must automatically do a 
> Loadbalancing between multiple hosts. This can be backed by the 
> {{CommonsHttpSolrServer}}
> This can have the following other features
> * Automatic failover
> * Optionally take in a file/URL containing the URLs of servers so that 
> the server list can be automatically updated by periodically loading the 
> config
> * Support for adding/removing servers during runtime
> * Pluggable Loadbalancing mechanism. (round-robin, weighted round-robin, 
> random etc)
> * Pluggable Failover mechanisms
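
The round-robin and automatic-failover items above can be sketched as follows. This is a minimal illustration, not LBHttpSolrServer: the class name and the Function-based request callback are hypothetical, and it omits the alive-check of dead servers discussed in the comment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch of client-side round-robin with failover.
final class RoundRobinClient {
    private final List<String> urls;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinClient(List<String> urls) {
        this.urls = new ArrayList<>(urls);
    }

    // Starts at the next server in rotation and, if the request fails,
    // tries each remaining server once before giving up.
    <T> T request(Function<String, T> send) {
        RuntimeException last = null;
        int start = next.getAndIncrement();
        for (int i = 0; i < urls.size(); i++) {
            String url = urls.get(Math.floorMod(start + i, urls.size()));
            try {
                return send.apply(url);
            } catch (RuntimeException e) {
                last = e; // remember the failure and fail over to the next URL
            }
        }
        throw new IllegalStateException("No live servers", last);
    }
}
```

A pluggable strategy, as the description proposes, would replace the fixed rotation in request() with an interface that picks the starting server.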


[jira] Commented: (SOLR-1019) ConstantScorePrefixQuery.java - Absolute FP comparison on getBoost()

2009-02-23 Thread Hoss Man (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676120#action_12676120
 ] 

Hoss Man commented on SOLR-1019:


bq. The bottom line is that absolute FP comparisons can result in unpredictable 
behaviors, depending on the underlying implementation something that we want to 
avoid.

The behavior of equality operators on floats is fairly explicit and unambiguous 
in JLS s15.21 -- if a JVM implementation produces unexpected behavior on these 
comparisons that's a bug in the JVM implementation.

It's not ConstantScorePrefixQuery's job (or the job of any other Query class) 
to second guess whether the caller considers a boost value of X+epsilon to be 
"close enough" to a boost value of X to consider them the same query.

(FWIW: I agree that equality tests on float/double values tend to indicate  ... 
i just disagree with the assessment that there actually is a mistake in these 
specific cases)
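
The position argued above, that a query class should compare boosts exactly and leave any tolerance to the caller, can be sketched like this. BoostedQuery is a hypothetical stand-in, not the ConstantScorePrefixQuery source, and comparing via Float.floatToIntBits is my assumption about a reasonable exact comparison.

```java
// Hypothetical stand-in for a query class with a float boost.
final class BoostedQuery {
    private final String prefix;
    private final float boost;

    BoostedQuery(String prefix, float boost) {
        this.prefix = prefix;
        this.boost = boost;
    }

    // Exact comparison: two queries are equal only if their boosts have
    // identical bit patterns. No epsilon is applied.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof BoostedQuery)) return false;
        BoostedQuery other = (BoostedQuery) o;
        return prefix.equals(other.prefix)
            && Float.floatToIntBits(boost) == Float.floatToIntBits(other.boost);
    }

    // Uses the same bit pattern as equals(), so equal queries hash alike.
    @Override
    public int hashCode() {
        return 31 * prefix.hashCode() + Float.floatToIntBits(boost);
    }
}
```

An epsilon-based equals() would not be transitive and could not be matched by a consistent hashCode(), which matters when queries are used as cache keys.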

> ConstantScorePrefixQuery.java - Absolute FP comparison on getBoost() 
> -
>
> Key: SOLR-1019
> URL: https://issues.apache.org/jira/browse/SOLR-1019
> Project: Solr
>  Issue Type: Bug
>  Components: clients - java
>Affects Versions: 1.3
> Environment: Java 6, Tomcat 6
>Reporter: Kay Kay
> Attachments: SOLR-1019.patch
>
>
> Absolute FP comparisons are inherently unsafe.  FP values should be compared 
> within an epsilon tolerance. 
> Affects ConstantScorePrefixQuery#toString() and #equals() 


[jira] Issue Comment Edited: (SOLR-1033) DIH transformers cannot reuse output from previous transformations

2009-02-23 Thread Fergus McMenemie (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675969#action_12675969
 ] 

fergus edited comment on SOLR-1033 at 2/23/09 3:06 PM:
-

Your comment about modifying the system "to have the current entities rows be 
available always to all transformers" is good and will produce the fastest most 
efficient code. 

But I need to be sure we are not using the term "template" twice in different 
ways. You say "you are constructing the regex replacements themselves with 
templates" by which you mean using the ${XXX} syntax and not the output from a 
templatetransformer?

Anyway I have backed out my patch and applied yours. Everything seems fine, but 
I am still testing.

Thanks very much.

  was (Author: fergus):
Your comment about modifying the system "to have the current entities rows 
be available always to all transformers" is good and will produce the fastest 
most efficient code. 

But I need to be sure we are not using the term "template" twice in different 
ways. You say "you are constructing the regex replacements themselves with 
templates" by which you mean using the ${XXX} syntax and not the output from a 
templatetransformer?

Also, is your patch a replacement for mine?
  
> DIH transformers cannot reuse output from previous transformations
> --
>
> Key: SOLR-1033
> URL: https://issues.apache.org/jira/browse/SOLR-1033
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - DataImportHandler
>Affects Versions: 1.4
> Environment: All operating systems and software platforms
>Reporter: Fergus McMenemie
> Fix For: 1.4
>
> Attachments: SOLR-1033.patch, SOLR-1033.patch, SOLR-1033.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> It can be very useful to reuse the output from a DIH template in other 
> templates and/or regex transformers. Currently this cannot be done. The 
> resolver is initialized at the start of the transformer run with whatever 
> values exist for a column name at that instant. As the transformer executes 
> it may define new values for column names. My change is intended to update 
> the hash used by the resolver after each successful transformation.
> This only applies to the template and regex transformers.


[jira] Commented: (SOLR-799) Add support for hash based exact/near duplicate document handling

2009-02-23 Thread Hoss Man (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676041#action_12676041
 ] 

Hoss Man commented on SOLR-799:
---

The separation of concerns between schema.xml and solrconfig.xml has always 
been...

 * schema.xml: what is the data, what is its nature, what are its intrinsic 
properties?
 * solrconfig.xml: what can people do with your data, how can they use it?

fields, fieldTypes, analyzers, copyFields go in the schema.xml because they are 
(in theory) intrinsic to the nature of your data regardless of where a given 
document comes from: 
 * documents should only have one author
 * categoryName should always be tokenized in a particular way
 * prices need to sort numerically, not lexicographically
 * any text indexed in the shortSummary field should also be indexed in the 
searchableAbstract field
 * etc...

Request handlers that dictate how people can use the data are specified in 
solrconfig.xml -- when searching data, request handlers (which may leverage 
search components) dictate what a user is allowed to get/see; when modifying an 
index, request handlers (which may leverage update processors) dictate what data 
is allowed to come from various sources and in what formats.

In short: as far as document indexing goes, the options configured in 
solrconfig.xml specify how to "build up" a Document object from user input, 
while the options in schema.xml specify how to "tear it down" into its 
individual terms and values for indexing.

With the near duplicate detection code, it is the schema's job to say which 
fields can exist in the input documents, including a signature field --  but it 
is the solrconfig's job to decide how to compute that signature field ... after 
all: the computation might be different depending on the source of the data 
(ie: different processor chains could be configured for different request 
handlers)
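
To make the last point concrete, here is a sketch of a signature computed from a configurable list of fields, so that two processor chains configured with different field lists produce different signatures for the same document. It is an illustration only, not the SOLR-799 patch: the class name, the flat String-to-String document, and the choice of MD5 are my assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the update processor from the patch.
final class SignatureSketch {
    // Hashes the values of the configured fields, in order, into a hex string.
    // Fields missing from the document are skipped.
    static String signature(Map<String, String> doc, List<String> fields) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            for (String field : fields) {
                String value = doc.get(field);
                if (value != null) {
                    md.update(value.getBytes(StandardCharsets.UTF_8));
                    md.update((byte) 0); // separator so "ab"+"c" differs from "a"+"bc"
                }
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```

The schema would only need to declare the field that stores the resulting string; which source fields feed the hash is a per-chain configuration choice.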

> Add support for hash based exact/near duplicate document handling
> -
>
> Key: SOLR-799
> URL: https://issues.apache.org/jira/browse/SOLR-799
> Project: Solr
>  Issue Type: New Feature
>  Components: update
>Reporter: Mark Miller
>Assignee: Yonik Seeley
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-799.patch, SOLR-799.patch, SOLR-799.patch, 
> SOLR-799.patch
>
>
> Hash based duplicate document detection is efficient and allows for blocking 
> as well as field collapsing. Let's put it into Solr. 
> http://wiki.apache.org/solr/Deduplication


[jira] Commented: (SOLR-1033) DIH transformers cannot reuse output from previous transformations

2009-02-23 Thread Fergus McMenemie (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675969#action_12675969
 ] 

Fergus McMenemie commented on SOLR-1033:


Your comment about modifying the system "to have the current entities rows be 
available always to all transformers" is good and will produce the fastest most 
efficient code. 

But I need to be sure we are not using the term "template" twice in different 
ways. You say "you are constructing the regex replacements themselves with 
templates" by which you mean using the ${XXX} syntax and not the output from a 
templatetransformer?

Also, is your patch a replacement for mine?

> DIH transformers cannot reuse output from previous transformations
> --
>
> Key: SOLR-1033
> URL: https://issues.apache.org/jira/browse/SOLR-1033
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - DataImportHandler
>Affects Versions: 1.4
> Environment: All operating systems and software platforms
>Reporter: Fergus McMenemie
> Fix For: 1.4
>
> Attachments: SOLR-1033.patch, SOLR-1033.patch, SOLR-1033.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> It can be very useful to reuse the output from a DIH template in other 
> templates and/or regex transformers. Currently this cannot be done. The 
> resolver is initialized at the start of the transformer run with whatever 
> values exist for a column name at that instant. As the transformer executes 
> it may define new values for column names. My change is intended to update 
> the hash used by the resolver after each successful transformation.
> This only applies to the template and regex transformers.


[jira] Commented: (SOLR-1019) ConstantScorePrefixQuery.java - Absolute FP comparison on getBoost()

2009-02-23 Thread Yonik Seeley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675963#action_12675963
 ] 

Yonik Seeley commented on SOLR-1019:


Absolute comparisons aren't good for the result of floating point calculations, 
but things like boosts are pretty much always set from constants.  For the very 
rare case that these constants are calculated, it should be up to the code 
doing the calculation to round or normalize if that is desired.  In any case, 
the failure mode in this extremely unlikely event is still graceful - 
everything still works but the cache hit ratio goes down.

Another point: from the Lucene point of view, a boost of 0.0 and a boost of 
.001 are *very* different - Lucene screens out scores <=0.0f (although Solr 
does not).

> ConstantScorePrefixQuery.java - Absolute FP comparison on getBoost() 
> -
>
> Key: SOLR-1019
> URL: https://issues.apache.org/jira/browse/SOLR-1019
> Project: Solr
>  Issue Type: Bug
>  Components: clients - java
>Affects Versions: 1.3
> Environment: Java 6, Tomcat 6
>Reporter: Kay Kay
> Attachments: SOLR-1019.patch
>
>
> Absolute FP comparisons are inherently unsafe.  FP values should be compared 
> within an epsilon tolerance. 
> Affects ConstantScorePrefixQuery#toString() and #equals() 


[jira] Updated: (SOLR-1033) DIH transformers cannot reuse output from previous transformations

2009-02-23 Thread Noble Paul (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-1033:
-

Attachment: SOLR-1033.patch

This should help all other transformers implicitly support templating

> DIH transformers cannot reuse output from previous transformations
> --
>
> Key: SOLR-1033
> URL: https://issues.apache.org/jira/browse/SOLR-1033
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - DataImportHandler
>Affects Versions: 1.4
> Environment: All operating systems and software platforms
>Reporter: Fergus McMenemie
> Fix For: 1.4
>
> Attachments: SOLR-1033.patch, SOLR-1033.patch, SOLR-1033.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> It can be very useful to reuse the output from a DIH template in other 
> templates and/or regex transformers. Currently this cannot be done. The 
> resolver is initialized at the start of the transformer run with whatever 
> values exist for a column name at that instant. As the transformer executes 
> it may define new values for column names. My change is intended to update 
> the hash used by the resolver after each successful transformation.
> This only applies to the template and regex transformers.


[jira] Issue Comment Edited: (SOLR-1033) DIH transformers cannot reuse output from previous transformations

2009-02-23 Thread Noble Paul (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675953#action_12675953
 ] 

noble.paul edited comment on SOLR-1033 at 2/23/09 7:54 AM:
---

OK, I see your point. You are constructing the regex replacements themselves 
with templates. I missed that.

I am wondering if the system can be modified so that the current entity's rows 
are always available to all transformers. It can be done as a simple change 
in EntityProcessorBase#applyTransformers

  was (Author: noble.paul):
OK, I see your point. You are constructing the regex replacements 
themselves with templates. I missed that.
  
> DIH transformers cannot reuse output from previous transformations
> --
>
> Key: SOLR-1033
> URL: https://issues.apache.org/jira/browse/SOLR-1033
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - DataImportHandler
>Affects Versions: 1.4
> Environment: All operating systems and software platforms
>Reporter: Fergus McMenemie
> Fix For: 1.4
>
> Attachments: SOLR-1033.patch, SOLR-1033.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> It can be very useful to reuse the output from a DIH template in other 
> templates and/or regex transformers. Currently this cannot be done. The 
> resolver is initialized at the start of the transformer run with whatever 
> values exist for a column name at that instant. As the transformer executes 
> it may define new values for column names. My change is intended to update 
> the hash used by the resolver after each successful transformation.
> This only applies to the template and regex transformers.


[jira] Commented: (SOLR-1033) DIH transformers cannot reuse output from previous transformations

2009-02-23 Thread Noble Paul (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675953#action_12675953
 ] 

Noble Paul commented on SOLR-1033:
--

OK, I see your point. You are constructing the regex replacements themselves 
with templates. I missed that.

> DIH transformers cannot reuse output from previous transformations
> --
>
> Key: SOLR-1033
> URL: https://issues.apache.org/jira/browse/SOLR-1033
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - DataImportHandler
>Affects Versions: 1.4
> Environment: All operating systems and software platforms
>Reporter: Fergus McMenemie
> Fix For: 1.4
>
> Attachments: SOLR-1033.patch, SOLR-1033.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> It can be very useful to reuse the output from a DIH template in other 
> templates and/or regex transformers. Currently this cannot be done. The 
> resolver is initialized at the start of the transformer run with whatever 
> values exist for a column name at that instant. As the transformer executes 
> it may define new values for column names. My change is intended to update 
> the hash used by the resolver after each successful transformation.
> This only applies to the template and regex transformers.


[jira] Commented: (SOLR-1019) ConstantScorePrefixQuery.java - Absolute FP comparison on getBoost()

2009-02-23 Thread Kay Kay (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675945#action_12675945
 ] 

Kay Kay commented on SOLR-1019:
---

I believe both scenarios in which the fix might be invalid are related to the 
value of epsilon. If we keep the value of epsilon small enough (for example 
Double.MIN_VALUE) we might get around the issue. The bottom line is that 
absolute FP comparisons can result in unpredictable behavior depending on the 
underlying implementation, which is something we want to avoid. 
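
To make the epsilon idea concrete, here is a minimal sketch of a tolerance-based comparison for float boosts. The class name, method name, and the EPSILON value are assumptions chosen for illustration; none of them are taken from SOLR-1019.patch. Note that a tolerance as small as Double.MIN_VALUE would, as far as I can tell, behave essentially like an exact comparison, since two distinct values always differ by at least that much.

```java
// Illustrative sketch only: BoostCompare and EPSILON are hypothetical names,
// not part of Solr or of the attached patch.
final class BoostCompare {
    // Assumed tolerance; the right value depends on the range of boosts in use.
    static final float EPSILON = 1e-6f;

    private BoostCompare() {}

    // True when a and b differ by no more than EPSILON.
    static boolean nearlyEqual(float a, float b) {
        return Math.abs(a - b) <= EPSILON;
    }
}
```

One design caveat: a tolerance-based check is not transitive, so using it inside equals() can conflict with the equals()/hashCode() contract. That may be one of the scenarios in which the fix is considered invalid.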

> ConstantScorePrefixQuery.java - Absolute FP comparison on getBoost() 
> -
>
> Key: SOLR-1019
> URL: https://issues.apache.org/jira/browse/SOLR-1019
> Project: Solr
>  Issue Type: Bug
>  Components: clients - java
>Affects Versions: 1.3
> Environment: Java 6, Tomcat 6
>Reporter: Kay Kay
> Attachments: SOLR-1019.patch
>
>
> Absolute FP comparisons are inherently unsafe. FP comparisons should be done 
> within an epsilon tolerance. 
> This affects ConstantScorePrefixQuery#toString() and #equals() 


[jira] Commented: (SOLR-1033) DIH transformers cannot reuse output from previous transformations

2009-02-23 Thread Fergus McMenemie (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675902#action_12675902
 ] 

Fergus McMenemie commented on SOLR-1033:


Not sure I am following what you say. If I number the different steps in my 
example entity as follows:-

{code}

1  
2  
3  
4  
5  
{code}

We see that column 5 involves a regex which in turn involves columns 3 and 2. 
Column 3 is itself a regex. We therefore have the output from one regex being 
used within another regex. So as far as I can see we need the fix made to both 
the TemplateTransformer and the RegexTransformer. 

> DIH transformers cannot reuse output from previous transformations
> --
>
> Key: SOLR-1033
> URL: https://issues.apache.org/jira/browse/SOLR-1033
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - DataImportHandler
>Affects Versions: 1.4
> Environment: All operating systems and software platforms
>Reporter: Fergus McMenemie
> Fix For: 1.4
>
> Attachments: SOLR-1033.patch, SOLR-1033.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> It can be very useful to reuse the output from a DIH template in other 
> templates and/or regex transformers. Currently this cannot be done. The 
> resolver is initialized at the start of the transformer run with whatever 
> values exist for a column name at that instant. As the transformer executes 
> it may define new values for column names. My change is intended to update 
> the hash used by the resolver after each successful transformation.
> This only applies to the template and regex transformers.


Re: SpellCheckComponent build from search

2009-02-23 Thread Shalin Shekhar Mangar

On Mon, Feb 23, 2009 at 10:47 AM, Shalin Shekhar Mangar <
shalinman...@gmail.com> wrote:

> You may only need to add a parameter for the query and override the init
> and build method in IndexBasedSpellChecker.


Along with a custom Dictionary implementation.

-- 
Regards,
Shalin Shekhar Mangar.

[jira] Commented: (SOLR-739) Add support for OmitTf

2009-02-23 Thread Mark Miller (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675882#action_12675882
 ] 

Mark Miller commented on SOLR-739:
--

Good point, will do.

- Mark

> Add support for OmitTf
> --
>
> Key: SOLR-739
> URL: https://issues.apache.org/jira/browse/SOLR-739
> Project: Solr
>  Issue Type: New Feature
>Reporter: Mark Miller
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-739.patch, SOLR-739.patch
>
>
> Allow setting omitTf in the field schema. Default to true for all but text 
> fields.


[jira] Commented: (SOLR-844) A SolrServer impl to front-end multiple urls

2009-02-23 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675864#action_12675864
 ] 

Shalin Shekhar Mangar commented on SOLR-844:


So what's the verdict on this one? Seems like most concerns were taken care of. 
Anything else we should do to get this committed?

> A SolrServer impl to front-end multiple urls
> 
>
> Key: SOLR-844
> URL: https://issues.apache.org/jira/browse/SOLR-844
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Affects Versions: 1.3
>Reporter: Noble Paul
>Assignee: Shalin Shekhar Mangar
> Fix For: 1.4
>
> Attachments: SOLR-844.patch, SOLR-844.patch, SOLR-844.patch, 
> SOLR-844.patch, SOLR-844.patch
>
>
> Currently a {{CommonsHttpSolrServer}} can talk to only one server. This 
> demands that the user have a LoadBalancer or do the roundrobin on their own. 
> We must have a {{LBHttpSolrServer}} which must automatically do a 
> Loadbalancing between multiple hosts. This can be backed by the 
> {{CommonsHttpSolrServer}}
> This can have the following other features
> * Automatic failover
> * Optionally take in a file/URL containing the URLs of servers so that 
> the server list can be automatically updated by periodically loading the 
> config
> * Support for adding removing servers during runtime
> * Pluggable Loadbalancing mechanism. (round-robin, weighted round-robin, 
> random etc)
> * Pluggable Failover mechanisms
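
The round-robin and failover behaviour described above could be sketched roughly as follows. This is not the LBHttpSolrServer from the attached patches; the class name, the execute method, and the use of a plain function to stand in for an HTTP request are all assumptions made to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch of round-robin selection with failover.
final class RoundRobinFailover {
    private final List<String> urls;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinFailover(List<String> urls) {
        if (urls.isEmpty()) {
            throw new IllegalArgumentException("at least one server URL is required");
        }
        this.urls = new ArrayList<>(urls); // defensive copy
    }

    // Try each server at most once, starting at the next round-robin position.
    // The request function stands in for a real HTTP call and signals failure
    // by throwing a RuntimeException.
    <T> T execute(Function<String, T> request) {
        int start = Math.floorMod(next.getAndIncrement(), urls.size());
        RuntimeException last = null;
        for (int i = 0; i < urls.size(); i++) {
            String url = urls.get((start + i) % urls.size());
            try {
                return request.apply(url);
            } catch (RuntimeException e) {
                last = e; // failover: fall through to the next server
            }
        }
        throw new IllegalStateException("all servers failed", last);
    }
}
```

A pluggable strategy (weighted round-robin, random, and so on) would replace the computation of the start index; that part is left out here.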


[jira] Commented: (SOLR-1033) DIH transformers cannot reuse output from previous transformations

2009-02-23 Thread Noble Paul (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675860#action_12675860
 ] 

Noble Paul commented on SOLR-1033:
--

If I am not wrong, the output of one transformation in RegexTransformer is 
available in the next transformation, because the value is added to the same 
row object. So it should work once the TemplateTransformer is fixed.
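
A rough sketch of the "same row object" idea, for illustration: each template is resolved against the live row and its result is written back to that row, so a later template can refer to a column produced by an earlier one. This is not the DataImportHandler API; the class and method names are hypothetical, and only the ${name} placeholder syntax is borrowed from the examples in this issue.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; not DataImportHandler code.
final class RowTemplates {
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

    private RowTemplates() {}

    // Replace each ${name} with the value currently stored in the row
    // (an empty string if the column is missing).
    static String resolve(String template, Map<String, Object> row) {
        Matcher m = VAR.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            Object value = row.get(m.group(1));
            String text = value == null ? "" : value.toString();
            m.appendReplacement(out, Matcher.quoteReplacement(text));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Apply the templates in iteration order, writing each result back to the
    // same row so that later templates see the output of earlier ones.
    static Map<String, Object> applyAll(Map<String, Object> row, Map<String, String> templates) {
        for (Map.Entry<String, String> t : templates.entrySet()) {
            row.put(t.getKey(), resolve(t.getValue(), row));
        }
        return row;
    }
}
```

The templates map should have a predictable iteration order (for example a LinkedHashMap), since the result depends on the order in which columns are produced.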

> DIH transformers cannot reuse output from previous transformations
> --
>
> Key: SOLR-1033
> URL: https://issues.apache.org/jira/browse/SOLR-1033
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - DataImportHandler
>Affects Versions: 1.4
> Environment: All operating systems and software platforms
>Reporter: Fergus McMenemie
> Fix For: 1.4
>
> Attachments: SOLR-1033.patch, SOLR-1033.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> It can be very useful to reuse the output from a DIH template in other 
> templates and/or regex transformers. Currently this cannot be done. The 
> resolver is initialized at the start of the transformer run with whatever 
> values exist for a column name at that instant. As the transformer executes 
> it may define new values for column names. My change is intended to update 
> the hash used by the resolver after each successful transformation.
> This only applies to the template and regex transformers.


[jira] Updated: (SOLR-973) CommonsHttpSolrServer must write the xml directly to the server

2009-02-23 Thread Noble Paul (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-973:


Attachment: SOLR-973.patch

The XML is written directly to the server, and the same is done for the binary 
update. 

> CommonsHttpSolrServer must write the xml directly to the server 
> 
>
> Key: SOLR-973
> URL: https://issues.apache.org/jira/browse/SOLR-973
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - java
>Reporter: Noble Paul
>Priority: Minor
> Attachments: SOLR-973.patch
>
>
> CommonsHttpSolrServer creates the whole xml in memory before it writes it to 
> the server.  It effectively consumes 3 times the string size in memory before 
> writing it to the server
> It can be directly written to the server and we can get faster upload
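
As an illustration of the difference, the sketch below writes an update document piece by piece to an output stream instead of assembling the whole string first. It is not the code from SOLR-973.patch: the class and method names are hypothetical, the documents are reduced to a single id field, and any OutputStream stands in for the HTTP request body.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical sketch; not the CommonsHttpSolrServer implementation.
final class StreamingXmlWriter {
    private StreamingXmlWriter() {}

    // Minimal escaping for element text.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Write an <add> message one fragment at a time, so the full XML string
    // never has to exist in memory. The caller owns and closes the stream.
    static void writeAdd(List<String> ids, OutputStream out) {
        try {
            Writer w = new OutputStreamWriter(out, StandardCharsets.UTF_8);
            w.write("<add>");
            for (String id : ids) {
                w.write("<doc><field name=\"id\">");
                w.write(escape(id));
                w.write("</field></doc>");
            }
            w.write("</add>");
            w.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```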


[jira] Issue Comment Edited: (SOLR-1033) DIH transformers cannot reuse output from previous transformations

2009-02-23 Thread Fergus McMenemie (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675856#action_12675856
 ] 

fergus edited comment on SOLR-1033 at 2/23/09 3:33 AM:
-

OK here goes. My document contains references to embedded imagery. For each 
image there is the image itself along with a thumbnail and caption. The source 
document contains:-

  

I have a search application that searches only the captions associated with a 
given image. It would be nice to populate solr fields with the correct relative 
path to each image and thumbnails at index time. Problem arises in that 
although the thumbnail is:

   s${e.vurl}.jpg

The name of the image itself varies depending on the first letter of the image 
type imageType! It could be one of 'picture' 'graphic' 'lineDrawing' or 'map'. 
ie:-

   p${e.vurl}.jpg
   g${e.vurl}.jpg
   l${e.vurl}.jpg
   m${e.vurl}.jpg

My patch would allow the following sort of thing to be added to a data-config. 
I feel this considerably increases its power and usefulness.

{code}

  
  
  
  
  
{code}
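
For readers who find the naming rule easier to follow in code, here is a small sketch of it in plain Java, outside of any data-config. The class and method names are made up for illustration; only the prefixes (s for thumbnails, and the first letter of the image type for the image itself) come from the description above.

```java
// Hypothetical helper; the names are illustrative only.
final class ImagePaths {
    private ImagePaths() {}

    // Thumbnails always use the "s" prefix.
    static String thumbnailName(String vurl) {
        return "s" + vurl + ".jpg";
    }

    // The image itself uses the first letter of its type, lower-cased:
    // picture -> p, graphic -> g, lineDrawing -> l, map -> m.
    static String imageName(String imageType, String vurl) {
        if (imageType == null || imageType.isEmpty()) {
            throw new IllegalArgumentException("imageType must not be empty");
        }
        char prefix = Character.toLowerCase(imageType.charAt(0));
        return prefix + vurl + ".jpg";
    }
}
```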


  was (Author: fergus):
OK here goes. My document contains references to embedded imagery. For each 
image there is the image itself along with a thumbnail and caption. The source 
document contains:-

  

I have a search application that searches only the captions associated with a 
given image. It would be nice to populate solr fields with the correct relative 
path to each image and thumbnails at index time. Problem arises in that 
although the thumbnail is:

   s${e.vurl}.jpg

The name of the image itself varies depending on the first letter of the image 
type imageType! It could be one of 'picture' 'graphic' 'lineDrawing' or 'map'. 
ie:-

   p${e.vurl}.jpg
   g${e.vurl}.jpg
   l${e.vurl}.jpg
   m${e.vurl}.jpg

My patch would allow the following sort of thing to be added to a data-config. 
I feel this considerably increases its power and usefulness.

{{code}}

  
  
  
  
  
{{code}}

  
> DIH transformers cannot reuse output from previous transformations
> --
>
> Key: SOLR-1033
> URL: https://issues.apache.org/jira/browse/SOLR-1033
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - DataImportHandler
>Affects Versions: 1.4
> Environment: All operating systems and software platforms
>Reporter: Fergus McMenemie
> Fix For: 1.4
>
> Attachments: SOLR-1033.patch, SOLR-1033.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> It can be very useful to reuse the output from a DIH template in other 
> templates and/or regex transformers. Currently this cannot be done. The 
> resolver is initialized at the start of the transformer run with whatever 
> values exist for a column name at that instant. As the transformer executes 
> it may define new values for column names. My change is intended to update 
> the hash used by the resolver after each successful transformation.
> This only applies to the template and regex transformers.


[jira] Commented: (SOLR-1033) DIH transformers cannot reuse output from previous transformations

2009-02-23 Thread Fergus McMenemie (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675856#action_12675856
 ] 

Fergus McMenemie commented on SOLR-1033:


OK here goes. My document contains references to embedded imagery. For each 
image there is the image itself along with a thumbnail and caption. The source 
document contains:-

  

I have a search application that searches only the captions associated with a 
given image. It would be nice to populate solr fields with the correct relative 
path to each image and thumbnails at index time. Problem arises in that 
although the thumbnail is:

   s${e.vurl}.jpg

The name of the image itself varies depending on the first letter of the image 
type imageType! It could be one of 'picture' 'graphic' 'lineDrawing' or 'map'. 
ie:-

   p${e.vurl}.jpg
   g${e.vurl}.jpg
   l${e.vurl}.jpg
   m${e.vurl}.jpg

My patch would allow the following sort of thing to be added to a data-config. 
I feel this considerably increases its power and usefulness.

{{code}}

  
  
  
  
  
{{code}}


> DIH transformers cannot reuse output from previous transformations
> --
>
> Key: SOLR-1033
> URL: https://issues.apache.org/jira/browse/SOLR-1033
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - DataImportHandler
>Affects Versions: 1.4
> Environment: All operating systems and software platforms
>Reporter: Fergus McMenemie
> Fix For: 1.4
>
> Attachments: SOLR-1033.patch, SOLR-1033.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> It can be very useful to reuse the output from a DIH template in other 
> templates and/or regex transformers. Currently this cannot be done. The 
> resolver is initialized at the start of the transformer run with whatever 
> values exist for a column name at that instant. As the transformer executes 
> it may define new values for column names. My change is intended to update 
> the hash used by the resolver after each successful transformation.
> This only applies to the template and regex transformers.


[jira] Commented: (SOLR-1033) DIH transformers cannot reuse output from previous transformations

2009-02-23 Thread Noble Paul (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675808#action_12675808
 ] 

Noble Paul commented on SOLR-1033:
--

bq.Sure. However I need a little help. What is it I need to do?
A simple use case with an example which demonstrates the feature.

The TemplateTransformer example you provided was self-explanatory. If you can 
give a similar one, that is more than sufficient.

> DIH transformers cannot reuse output from previous transformations
> --
>
> Key: SOLR-1033
> URL: https://issues.apache.org/jira/browse/SOLR-1033
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - DataImportHandler
>Affects Versions: 1.4
> Environment: All operating systems and software platforms
>Reporter: Fergus McMenemie
> Fix For: 1.4
>
> Attachments: SOLR-1033.patch, SOLR-1033.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> It can be very useful to reuse the output from a DIH template in other 
> templates and/or regex transformers. Currently this cannot be done. The 
> resolver is initialized at the start of the transformer run with whatever 
> values exist for a column name at that instant. As the transformer executes 
> it may define new values for column names. My change is intended to update 
> the hash used by the resolver after each successful transformation.
> This only applies to the template and regex transformers.


[jira] Resolved: (SOLR-1010) Relative instanceDir is evaluated relative to current working directory

[jira] Resolved: (SOLR-1029) Standardize Evaluator parameter handling

[jira] Commented: (SOLR-1029) Standardize Evaluator parameter handling

[jira] Updated: (SOLR-1029) Standardize Evaluator parameter handling

[jira] Resolved: (SOLR-1035) catch RuntimeException instead of NFE in FieldType line 187

[jira] Created: (SOLR-1035) catch RuntimeException instead of NFE in FieldType line 187

Re: Improvement Request: exception setting field exception

Improvement Request: exception setting field exception

[jira] Updated: (SOLR-1033) DIH transformers cannot reuse output from previous transformations

[jira] Commented: (SOLR-1033) DIH transformers cannot reuse output from previous transformations

[jira] Resolved: (SOLR-1034) ClientUtils.escapeQuery should escape ;

[jira] Assigned: (SOLR-1034) ClientUtils.escapeQuery should escape ;

[jira] Updated: (SOLR-1034) ClientUtils.escapeQuery should escape ;

[jira] Created: (SOLR-1034) ClientUtils.escapeQuery should escape ;

[jira] Commented: (SOLR-844) A SolrServer impl to front-end multiple urls

[jira] Commented: (SOLR-1019) ConstantScorePrefixQuery.java - Absolute FP comparison on getBoost()

[jira] Issue Comment Edited: (SOLR-1033) DIH transformers cannot reuse output from previous transformations

[jira] Commented: (SOLR-799) Add support for hash based exact/near duplicate document handling

[jira] Commented: (SOLR-1033) DIH transformers cannot reuse output from previous transformations

[jira] Commented: (SOLR-1019) ConstantScorePrefixQuery.java - Absolute FP comparison on getBoost()

[jira] Updated: (SOLR-1033) DIH transformers cannot reuse output from previous transformations

[jira] Issue Comment Edited: (SOLR-1033) DIH transformers cannot reuse output from previous transformations

[jira] Commented: (SOLR-1033) DIH transformers cannot reuse output from previous transformations

[jira] Commented: (SOLR-1019) ConstantScorePrefixQuery.java - Absolute FP comparison on getBoost()

[jira] Commented: (SOLR-1033) DIH transformers cannot reuse output from previous transformations

Re: SpellCheckComponent build from search

[jira] Commented: (SOLR-739) Add support for OmitTf

[jira] Commented: (SOLR-844) A SolrServer impl to front-end multiple urls

[jira] Commented: (SOLR-1033) DIH transformers cannot reuse output from previous transformations

[jira] Updated: (SOLR-973) CommonsHttpSolrServer must write the xml directly to the server

[jira] Issue Comment Edited: (SOLR-1033) DIH transformers cannot reuse output from previous transformations

[jira] Commented: (SOLR-1033) DIH transformers cannot reuse output from previous transformations

[jira] Commented: (SOLR-1033) DIH transformers cannot reuse output from previous transformations

33 matches
