[
https://issues.apache.org/jira/browse/DRILL-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147194#comment-14147194
]
Aditya Kishore commented on DRILL-1414:
---------------------------------------
I have been thinking about a couple of ways to do this.
# Extend the {{org.apache.drill.exec.store.sys.PStore}} interface with two
additional methods:
{code}
public V getBlob(String key);
public void putBlob(String key, V value);
{code}
Consumers can use these two methods to store large amounts of data that may not
require frequent enumeration and are not suitable for storage on systems like
ZooKeeper. A particular PStore implementation could choose to store the blob
data differently from the primary value. For example, the HBase PStore provider
could store blobs in a different column family, while the ZooKeeper PStore
provider could store them on DFS (as this JIRA summary suggests).
The query profile can then be split into two parts: the small meta info about
the query is stored with {{put()}}, while the fragment profiles are stored
using {{putBlob()}}.
# Alternatively, we could handle this narrowly by just modifying
{{org.apache.drill.exec.work.foreman.QueryStatus}} to split the profile and
store the profile metadata separately from the individual query profile.
I am inclined to go with approach #1, as it will allow any future consumer to
reuse it easily. I already have a partial patch, excluding the modifications
to the Web UI, that I am currently testing. If I do not hear any concerns about
approach #1, I'll post the patch for review shortly.
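For illustration only, approach #1 might look roughly like the sketch below. It is not the patch itself: {{BlobPStore}}, {{InMemoryBlobPStore}} and {{BlobPStoreSketch}} are hypothetical names, the interface is a simplified stand-in for the real {{PStore}} (which has more methods and may differ), and the in-memory maps only mimic the idea of keeping blobs in a separate backing store.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified stand-in for org.apache.drill.exec.store.sys.PStore.
// Only the methods needed for this illustration are declared.
interface BlobPStore<V> {
    V get(String key);
    void put(String key, V value);

    // Proposed additions: large values that need no frequent enumeration.
    V getBlob(String key);
    void putBlob(String key, V value);
}

// In-memory illustration. Primary values and blobs are kept in separate maps,
// mirroring the idea that a provider may back blobs with a different store
// (e.g. another column family, or DFS). This is an assumption for the sketch.
class InMemoryBlobPStore<V> implements BlobPStore<V> {
    private final Map<String, V> primary = new ConcurrentHashMap<>();
    private final Map<String, V> blobs = new ConcurrentHashMap<>();

    @Override public V get(String key) { return primary.get(key); }
    @Override public void put(String key, V value) { primary.put(key, value); }
    @Override public V getBlob(String key) { return blobs.get(key); }
    @Override public void putBlob(String key, V value) { blobs.put(key, value); }

    // Number of entries in the primary store only; blobs are not counted.
    int primaryCount() { return primary.size(); }
}

class BlobPStoreSketch {
    public static void main(String[] args) {
        BlobPStore<String> store = new InMemoryBlobPStore<>();
        // Small meta info about the query goes through put() ...
        store.put("query-1", "state=COMPLETED");
        // ... while the bulky fragment profiles go through putBlob().
        store.putBlob("query-1", "<serialized fragment profiles>");
        System.out.println(store.get("query-1"));
        System.out.println(store.getBlob("query-1"));
    }
}
```

With this shape, the same key can carry both a small primary value and a large blob, and enumerating the primary values never touches the blob data.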
> Move profile storage to DFS rather than using PStore
> ----------------------------------------------------
>
> Key: DRILL-1414
> URL: https://issues.apache.org/jira/browse/DRILL-1414
> Project: Apache Drill
> Issue Type: Improvement
> Reporter: Jacques Nadeau
> Assignee: Aditya Kishore
> Fix For: 0.6.0
>
>
> PStores were really built for trivial configuration data, not large query
> profiles. As such, we should move to using the DFS for storage of query
> profiles when in distributed mode.