[
https://issues.apache.org/jira/browse/GORA-411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16801331#comment-16801331
]
John Mora commented on GORA-411:
--------------------------------
Hi.
I have just sent the PR. I also added a small test that compares exists(key)
with get(key) by executing 100 checks and measuring the time. The results
fluctuate considerably, but on average the exists method is 10% faster. However,
the test runs locally, so the results would probably be different with real
workloads and distributed setups.
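
For reference, a comparison of this kind can be sketched roughly as follows.
This is not the test from the PR. CheckTimer, timeChecks and medianNanos are
hypothetical names, and taking the median over several rounds is an assumption
made here to dampen the fluctuation mentioned above.

```java
import java.util.Arrays;
import java.util.function.BooleanSupplier;

// Hypothetical timing helper; not part of Gora or of the PR.
class CheckTimer {

    // Runs the check `checks` times and returns the total elapsed nanoseconds.
    static long timeChecks(int checks, BooleanSupplier check) {
        int hits = 0;
        long start = System.nanoTime();
        for (int i = 0; i < checks; i++) {
            if (check.getAsBoolean()) {
                hits++;
            }
        }
        long elapsed = System.nanoTime() - start;
        // Use `hits` so the loop body cannot be optimized away as dead code.
        if (hits < 0) {
            throw new IllegalStateException("unreachable");
        }
        return elapsed;
    }

    // Repeats the measurement `rounds` times and returns the median sample,
    // which is less sensitive to outliers than a plain average.
    static long medianNanos(int rounds, int checks, BooleanSupplier check) {
        long[] samples = new long[rounds];
        for (int r = 0; r < rounds; r++) {
            samples[r] = timeChecks(checks, check);
        }
        Arrays.sort(samples);
        return samples[rounds / 2];
    }
}
```

With a store and a key at hand, the two calls would be compared as
CheckTimer.medianNanos(15, 100, () -> store.exists(key)) against
CheckTimer.medianNanos(15, 100, () -> store.get(key) != null). The 15 rounds
are an arbitrary choice.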
Please feel free to give your feedback.
> Add exists(key) to DataStore interface
> --------------------------------------
>
> Key: GORA-411
> URL: https://issues.apache.org/jira/browse/GORA-411
> Project: Apache Gora
> Issue Type: Improvement
> Components: gora-core, storage
> Reporter: Alfonso Nishikawa
> Priority: Minor
> Fix For: 0.9
>
>
> NUTCH-1679 needs to check whether some rows exist, and the proposal there is
> to use {{store.get(TableUtil.reverseUrl(url))}}.
> This will have a considerable impact on performance, since every column will
> be fetched.
> Some datastores implement a call that just checks whether a row exists (like
> HBase), so no data is transferred over the network.
> If a datastore can't handle an "exists" call, it can default to a get.
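
The fallback described above can be sketched as a default method on the
interface. This is a minimal illustration only: SimpleStore, FallbackStore and
NativeExistsStore are hypothetical names backed by an in-memory map, not Gora's
actual DataStore API or the code in the PR.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a data store interface.
interface SimpleStore<K, V> {
    V get(K key);

    void put(K key, V value);

    // Fallback for stores without a native existence check:
    // fetch the whole row and test whether anything came back.
    default boolean exists(K key) {
        return get(key) != null;
    }
}

// A store that relies on the default exists(), which goes through get().
class FallbackStore<K, V> implements SimpleStore<K, V> {
    protected final Map<K, V> rows = new HashMap<>();

    @Override
    public V get(K key) {
        return rows.get(key);
    }

    @Override
    public void put(K key, V value) {
        rows.put(key, value);
    }
}

// A store that overrides exists() with a key-only check, standing in for
// backends that can answer without transferring any row data.
class NativeExistsStore<K, V> extends FallbackStore<K, V> {
    @Override
    public boolean exists(K key) {
        return rows.containsKey(key);
    }
}
```

Callers use exists(key) in both cases. Only the cost of the call differs
between the two implementations.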