[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Chris A. Mattmann (JIRA) Fri, 11 Dec 2009 14:04:42 -0800

    [ 
https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789561#action_12789561
 ]


Chris A. Mattmann commented on SOLR-1131:
-----------------------------------------

Hi Grant:

{quote}
My tests show it to be at least 7 times faster. But this should be obvious from 
static analysis, too. First of all, String.split() uses a regex which then 
makes a pass through the underlying character array. Then, trim has to go back 
through and analyze the char array too, not to mention the extra String 
creations. The optimized version here makes one pass and deals solely at the 
char array level and only has to do the substring, which I think can be 
optimized by the JVM to be a copy on write.
{quote}

Got it. A couple of points:

1. 7x faster is great, but could end up being noise if x = 2 ms. It matters if 
x is say 2 minutes, agreed. If it's on the ms end then the expense of more 
lines of (uncommented) code isn't worth it.
2. This code is likely to get called heavily on the indexing side, so 
performance, though still an issue, is not as hugely important as say on the 
searching side.
3. If you feel strongly about an optimized version of this magic splitAndTrim 
function, how about a utility function and refactor then? I would guess this 
code could be used elsewhere, and that would help to satisfy my hunger for 
reusability. I'll even javadoc the function and do the refactor if you'd like.
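
To make the utility-function idea concrete, here is a rough sketch of what a 
reusable one-pass split-and-trim helper could look like. The class name 
SplitUtil, the method signature, and the choice to keep empty tokens are all 
assumptions for illustration; this is not the code from any of the attached 
patches. Whitespace is treated the way String.trim() treats it (any char <= ' ').

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical utility class; name and signature are illustrative assumptions. */
public final class SplitUtil {
  private SplitUtil() {}

  /**
   * Splits s on sep and trims each piece in a single pass over the
   * characters, without a regex and without intermediate untrimmed Strings.
   * Empty tokens (including trailing ones) are kept as "".
   *
   * @param s   the input to split; must not be null
   * @param sep the separator character
   * @return the trimmed tokens, in order
   */
  public static List<String> splitAndTrim(String s, char sep) {
    List<String> out = new ArrayList<String>();
    int len = s.length();
    int start = -1; // first non-whitespace index of the current token, -1 if none yet
    int end = -1;   // one past the last non-whitespace index of the current token
    for (int i = 0; i <= len; i++) {
      if (i == len || s.charAt(i) == sep) {
        // Token boundary: emit only the trimmed region.
        out.add(start < 0 ? "" : s.substring(start, end));
        start = -1;
        end = -1;
      } else if (s.charAt(i) > ' ') {
        // Same whitespace rule as String.trim(): chars <= ' ' are skipped.
        if (start < 0) {
          start = i;
        }
        end = i + 1;
      }
    }
    return out;
  }
}
```

For example, SplitUtil.splitAndTrim(" 1.5 , -2.0 ", ',') returns the two 
tokens "1.5" and "-2.0". Note that unlike String.split(), this sketch does not 
drop trailing empty strings, so callers relying on that behavior would need to 
account for it.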

Cheers,
Chris



> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, 
> SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, 
> SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, 
> SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to 
> be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The 
> concept "point" may be best indexed in a variety of ways:
>  * geohash (single lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within 
> that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
