[
https://issues.apache.org/jira/browse/LUCENE-2958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13005502#comment-13005502
]
Shai Erera commented on LUCENE-2958:
------------------------------------
If we go with the header idea, then we'll need to move to a more generic
DocData. So instead of doing docData.title = title, you'll need to do
docData.set("title", title), which under the hood will store that pair in a Map
or Properties. The same goes for the getter. That also has some perf
implications, since every access becomes a lookup instead of a direct field
read or write.
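To make the trade-off concrete, here is a rough sketch of the two shapes side by side. The class names TypedDocData and GenericDocData are hypothetical and are not the benchmark's actual DocData; only the set("title", title) style of call comes from the comment above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical field-based variant: a fixed set of fields, accessed directly.
class TypedDocData {
    String title;
    String body;
}

// Hypothetical map-based variant: accepts any field name, but every
// set/get goes through a hash lookup instead of a plain field access.
class GenericDocData {
    private final Map<String, String> fields = new HashMap<>();

    void set(String name, String value) {
        fields.put(name, value);
    }

    // Returns null when the field was never set.
    String get(String name) {
        return fields.get(name);
    }
}
```

With the generic variant, docData.title = title becomes docData.set("title", title), and a user-defined field needs no new class, only a new name.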
What is better: generality, or code optimized for the common Lucene tasks that
users can extend for their own purposes?
If we want the most optimized code, then let's pass the entire line to an
overridable method. Lucene will offer an optimized way of tokenizing the
current fields, while users will have to either provide their own optimized
way (for their fields), or decide that they can spare some cycles in favor of
simpler code (e.g., calling line.split()).
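A minimal sketch of that overridable-method idea, under stated assumptions: LineParser, SplitLineParser, SimpleDocData and parseLine are hypothetical names, and the tab-separated title, date, body layout is assumed here for illustration rather than taken from the patch.

```java
import java.util.Arrays;

// Hypothetical holder for the parsed fields.
class SimpleDocData {
    String title, date, body;
    String[] extra = new String[0]; // user-defined fields, if any
}

// Hypothetical default parser: scans for separators with indexOf, so it
// allocates only the three substrings it needs.
class LineParser {
    static final char SEP = '\t'; // assumed field separator

    protected void parseLine(String line, SimpleDocData doc) {
        int k1 = line.indexOf(SEP);
        int k2 = k1 < 0 ? -1 : line.indexOf(SEP, k1 + 1);
        if (k2 < 0) {
            throw new IllegalArgumentException("expected at least 3 fields: " + line);
        }
        doc.title = line.substring(0, k1);
        doc.date = line.substring(k1 + 1, k2);
        doc.body = line.substring(k2 + 1); // everything after the second separator
    }
}

// Hypothetical user extension: simpler code that pays for a regex split
// and an array allocation per line, but picks up extra fields for free.
class SplitLineParser extends LineParser {
    @Override
    protected void parseLine(String line, SimpleDocData doc) {
        String[] f = line.split("\t", -1); // -1 keeps trailing empty fields
        if (f.length < 3) {
            throw new IllegalArgumentException("expected at least 3 fields: " + line);
        }
        doc.title = f[0];
        doc.date = f[1];
        doc.body = f[2];
        doc.extra = Arrays.copyOfRange(f, 3, f.length);
    }
}
```

The default path stays allocation-light for the common fields, while the subclass shows the "simpler code" option a user might pick for additional fields.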
> WriteLineDocTask improvements
> -----------------------------
>
> Key: LUCENE-2958
> URL: https://issues.apache.org/jira/browse/LUCENE-2958
> Project: Lucene - Java
> Issue Type: Improvement
> Components: contrib/benchmark
> Reporter: Doron Cohen
> Assignee: Doron Cohen
> Priority: Minor
> Fix For: 3.2, 4.0
>
> Attachments: LUCENE-2958.patch, LUCENE-2958.patch
>
>
> Make WriteLineDocTask and LineDocSource more flexible/extendable:
> * allow emitting lines for empty docs as well (keep current behavior as default)
> * allow more/less/other fields