[
https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789561#action_12789561
]
Chris A. Mattmann commented on SOLR-1131:
-----------------------------------------
Hi Grant:
{quote}
My tests show it to be at least 7 times faster. But this should be obvious from
static analysis, too. First of all, String.split() uses a regex which then
makes a pass through the underlying character array. Then, trim has to go back
through and analyze the char array too, not to mention the extra String
creations. The optimized version here makes one pass and deals solely at the
char array level and only has to do the substring, which I think can be
optimized by the JVM to be a copy on write.
{quote}
Got it. A couple of points:
1. 7x faster is great, but could end up being noise if x = 2 ms. It matters if
x is say 2 minutes, agreed. If it's on the ms end then the expense of more
lines of (uncommented) code isn't worth it.
2. This code is likely to get called heavily on the indexing side, so
performance, though still an issue, is not as hugely important as say on the
searching side.
3. If you feel strongly about an optimized version of this magic splitAndTrim
function, how about making it a utility function and refactoring then? I would
guess this code could be used elsewhere, and that would help to satisfy my hunger
for reusability. I'll even javadoc the function and do the refactor if you'd like.
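To make point 3 concrete, here is a rough sketch of what such a utility could look like. It is not the code from the attached patches: the class name SplitUtil, the single-char delimiter parameter, and the choice to keep empty tokens are all assumptions made for illustration. It walks the string once, tracks the first and last non-whitespace positions of each token, and calls substring only once per token.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical utility class; the name and signature are illustrative only
// and are not taken from the SOLR-1131 patches.
final class SplitUtil {
    private SplitUtil() {}

    /**
     * Splits {@code s} on {@code delimiter} and trims whitespace from each
     * token in a single pass, without a regex and without a separate trim().
     * Unlike String.split(), empty tokens (including trailing ones) are kept.
     *
     * @param s         the string to split; null yields an empty list
     * @param delimiter the character separating tokens
     * @return the trimmed tokens, in order
     */
    static List<String> splitAndTrim(String s, char delimiter) {
        List<String> parts = new ArrayList<String>();
        if (s == null) {
            return parts;
        }
        int len = s.length();
        int start = -1; // first non-whitespace index of the current token, -1 if none seen
        int end = -1;   // one past the last non-whitespace index of the current token
        for (int i = 0; i <= len; i++) {
            if (i == len || s.charAt(i) == delimiter) {
                // Token boundary: emit the trimmed token (or "" if it was all whitespace).
                parts.add(start < 0 ? "" : s.substring(start, end));
                start = -1;
                end = -1;
            } else if (!Character.isWhitespace(s.charAt(i))) {
                if (start < 0) {
                    start = i;
                }
                end = i + 1;
            }
        }
        return parts;
    }
}
```

A caller parsing a point value might then write SplitUtil.splitAndTrim(" 1.5 , -2.0 ", ',') and get back the two tokens "1.5" and "-2.0". Whether this is measurably faster than split() plus trim() for realistic inputs is exactly what points 1 and 2 above question, so it would need to be benchmarked rather than assumed.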
Cheers,
Chris
> Allow a single field type to index multiple fields
> --------------------------------------------------
>
> Key: SOLR-1131
> URL: https://issues.apache.org/jira/browse/SOLR-1131
> Project: Solr
> Issue Type: New Feature
> Components: Schema and Analysis
> Reporter: Ryan McKinley
> Assignee: Grant Ingersoll
> Fix For: 1.5
>
> Attachments: SOLR-1131-IndexMultipleFields.patch,
> SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt,
> SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch,
> SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to
> be indexed as a set of Fields (lucene Field). Consider SOLR-773. The
> concept "point" may be best indexed in a variety of ways:
> * geohash (single lucene field)
> * lat field, lon field (two double fields)
> * cartesian tiers (a series of fields with tokens to say if it exists within
> that region)