[ https://issues.apache.org/jira/browse/LUCENE-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13196055#comment-13196055 ]
Dawid Weiss commented on LUCENE-3725:
-------------------------------------
bq. If we didn't do this then the packer would have to use even less RAM
efficient data structures (eg Map<Int,X>) I think?
Yes, this is exactly what I used (although I used primitive-backed hash maps
from HPPC), but the overhead will be there, sure.
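
To make the trade-off concrete, here is a minimal sketch of what "primitive-backed" means in practice: keys and values live in plain int[] arrays instead of boxed Integer objects and per-entry nodes. The class name IntIntMap and its methods are made up for illustration; this is not HPPC's API, just an assumption about the general technique (open addressing with linear probing).

```java
/** Hypothetical int -> int map using open addressing; not HPPC's actual API. */
final class IntIntMap {
    private int[] keys = new int[16];          // capacity is always a power of two
    private int[] values = new int[16];
    private boolean[] used = new boolean[16];  // marks occupied slots
    private int size;

    void put(int key, int value) {
        // Keep the load factor at or below 0.5 so probing always finds a free slot.
        if ((size + 1) * 2 > keys.length) grow();
        int slot = find(key);
        if (!used[slot]) {
            used[slot] = true;
            keys[slot] = key;
            size++;
        }
        values[slot] = value;
    }

    int getOrDefault(int key, int fallback) {
        int slot = find(key);
        return used[slot] ? values[slot] : fallback;
    }

    int size() { return size; }

    /** Returns the slot holding key, or the free slot where it would be inserted. */
    private int find(int key) {
        int mask = keys.length - 1;
        int slot = ((key * 0x9E3779B9) >>> 7) & mask;  // cheap multiplicative hash
        while (used[slot] && keys[slot] != key) {
            slot = (slot + 1) & mask;                  // linear probing
        }
        return slot;
    }

    /** Doubles the capacity and reinserts every existing entry. */
    private void grow() {
        int[] oldKeys = keys;
        int[] oldValues = values;
        boolean[] oldUsed = used;
        keys = new int[oldKeys.length * 2];
        values = new int[oldKeys.length * 2];
        used = new boolean[oldKeys.length * 2];
        size = 0;
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldUsed[i]) put(oldKeys[i], oldValues[i]);
        }
    }
}
```

Even in this form the map keeps its arrays at most half full, so it still costs noticeably more than a flat array indexed by node, which is the overhead mentioned above.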
bq. Second, the format written by the packer is tightly coupled with the FST
reading, ie there are sizable differences when reading packed vs unpacked FST.
Right. I have a different design in which the FSA is an abstract superclass and
the implementation provides methods to walk the edges/nodes. The writers
simply walk that structure when serializing. Reading is delegated to a reader
that can understand a particular format (and then provide a traversal
implementation over raw bytes).
I do have major simplifications over Lucene's version so this wouldn't be easy
to do in Lucene's case without sacrificing performance.
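
A rough sketch of the design described above, under several assumptions: all class and method names (FSA, MemoryFSA, ByteFSA, FsaSerializer) are hypothetical, the byte format is a toy one invented here (one count byte per node, three bytes per arc, single-byte offsets), and it only handles small acyclic automata. It is meant to show the shape of the idea, not any real implementation.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical abstract automaton: subclasses only expose node/arc traversal. */
abstract class FSA {
    abstract int root();
    abstract int arcCount(int node);
    abstract byte label(int node, int arc);
    abstract int target(int node, int arc);
    abstract boolean isFinal(int node, int arc);

    /** Shared lookup written purely against the traversal methods. */
    final boolean accepts(byte[] input) {
        int node = root();
        for (int pos = 0; pos < input.length; pos++) {
            int n = arcCount(node);
            int arc = 0;
            while (arc < n && label(node, arc) != input[pos]) arc++;
            if (arc == n) return false;                        // no matching arc
            if (pos == input.length - 1) return isFinal(node, arc);
            node = target(node, arc);
        }
        return false;                                          // empty input
    }
}

/** In-memory form; each arc is {label, target node, final flag}. */
final class MemoryFSA extends FSA {
    private final List<List<int[]>> nodes = new ArrayList<>();

    int addNode() { nodes.add(new ArrayList<>()); return nodes.size() - 1; }
    void addArc(int from, char label, int to, boolean fin) {
        nodes.get(from).add(new int[] {label, to, fin ? 1 : 0});
    }
    @Override int root() { return 0; }
    @Override int arcCount(int node) { return nodes.get(node).size(); }
    @Override byte label(int node, int arc) { return (byte) nodes.get(node).get(arc)[0]; }
    @Override int target(int node, int arc) { return nodes.get(node).get(arc)[1]; }
    @Override boolean isFinal(int node, int arc) { return nodes.get(node).get(arc)[2] == 1; }
}

/** Reader for the toy format: traversal directly over raw bytes. */
final class ByteFSA extends FSA {
    private final byte[] bytes;
    private final int root;
    ByteFSA(byte[] bytes, int root) { this.bytes = bytes; this.root = root; }
    private int at(int node, int arc, int field) {
        return bytes[node + 1 + 3 * arc + field] & 0xFF;
    }
    @Override int root() { return root; }
    @Override int arcCount(int node) { return bytes[node] & 0xFF; }
    @Override byte label(int node, int arc) { return (byte) at(node, arc, 0); }
    @Override int target(int node, int arc) { return at(node, arc, 1); }
    @Override boolean isFinal(int node, int arc) { return at(node, arc, 2) == 1; }
}

/** Writer: walks any FSA through the abstract methods, children first. */
final class FsaSerializer {
    static ByteFSA serialize(FSA fsa) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int root = write(fsa, fsa.root(), out, new HashMap<>());
        return new ByteFSA(out.toByteArray(), root);
    }

    private static int write(FSA fsa, int node, ByteArrayOutputStream out,
                             Map<Integer, Integer> offsets) {
        Integer known = offsets.get(node);
        if (known != null) return known;                       // shared node, already written
        int n = fsa.arcCount(node);
        int[] targets = new int[n];
        for (int i = 0; i < n; i++) {
            targets[i] = write(fsa, fsa.target(node, i), out, offsets);
        }
        int offset = out.size();                               // this node's address
        out.write(n);
        for (int i = 0; i < n; i++) {
            out.write(fsa.label(node, i));
            out.write(targets[i]);                             // toy limit: offsets < 256
            out.write(fsa.isFinal(node, i) ? 1 : 0);
        }
        offsets.put(node, offset);
        return offset;
    }
}
```

Because accepts() is written only against the abstract traversal methods, the same lookup code runs over the in-memory graph and over the serialized bytes; a different on-disk format would only need another reader subclass. The catch is that every arc access goes through a virtual call, which is one place where performance could suffer in a more demanding setting.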
> Add optional packing to FST building
> ------------------------------------
>
> Key: LUCENE-3725
> URL: https://issues.apache.org/jira/browse/LUCENE-3725
> Project: Lucene - Java
> Issue Type: Improvement
> Components: core/FSTs
> Reporter: Michael McCandless
> Assignee: Michael McCandless
> Fix For: 3.6, 4.0
>
> Attachments: LUCENE-3725.patch, LUCENE-3725.patch, LUCENE-3725.patch,
> Perf.java
>
>
> The FSTs produced by Builder can be further shrunk if you are willing
> to spend highish transient RAM to do so... our Builder today tries
> hard not to use much RAM (and has options to tweak down the RAM usage,
> in exchange for somewhat larger FST), even when building immense FSTs.
> But for apps that can afford highish transient RAM to get a smaller
> net FST, I think we should offer packing.