[ https://issues.apache.org/jira/browse/GORA-411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16801331#comment-16801331 ]
John Mora commented on GORA-411:
--------------------------------

Hi. I have just sent the PR. I also added a small test that compares exists(key) with get(key) by executing 100 checks and measuring the time. The results fluctuate a lot, but on average the exists method is 10% faster. However, the test runs locally, and the results would probably differ with real workloads and distributed setups.
Please feel free to give your feedback.

> Add exists(key) to DataStore interface
> --------------------------------------
>
>                 Key: GORA-411
>                 URL: https://issues.apache.org/jira/browse/GORA-411
>             Project: Apache Gora
>          Issue Type: Improvement
>          Components: gora-core, storage
>            Reporter: Alfonso Nishikawa
>            Priority: Minor
>             Fix For: 0.9
>
>
> NUTCH-1679 needs to check whether some rows exist, and they are proposing to
> use {{store.get(TableUtil.reverseUrl(url))}}.
> This will have a considerable impact on performance, since every column will
> be fetched.
> Some datastores implement a call that just checks whether a row exists (like
> HBase), so no data is transferred over the network.
> If a datastore can't handle an "exists" call, it can default to a get.
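
The fallback described in the issue (default to a get when a datastore has no native "exists" call) could be sketched roughly as below. This is only an illustration under assumptions: SimpleStore, FallbackStore, NativeExistsStore and the getCalls counter are hypothetical names invented for this sketch, not the Gora DataStore API or the code in the PR, and an in-memory map stands in for a real backend.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified store interface; NOT the real Gora DataStore API.
interface SimpleStore<K, V> {
    /** Returns the full row for the key, or null when the row is absent. */
    V get(K key);

    void put(K key, V value);

    /** Fallback: fetch the whole row and test the result for null. */
    default boolean exists(K key) {
        return get(key) != null;
    }
}

// Backend without a native existence check: inherits the get-based default.
class FallbackStore<K, V> implements SimpleStore<K, V> {
    protected final Map<K, V> rows = new HashMap<>();
    int getCalls = 0; // counts full-row fetches, for illustration only

    @Override
    public V get(K key) {
        getCalls++;
        return rows.get(key);
    }

    @Override
    public void put(K key, V value) {
        rows.put(key, value);
    }
}

// Backend with a cheap key lookup: overrides exists() so no row is fetched.
class NativeExistsStore<K, V> extends FallbackStore<K, V> {
    @Override
    public boolean exists(K key) {
        return rows.containsKey(key);
    }
}
```

With this shape, callers always use exists(key); whether a full row is fetched depends only on whether the backend overrides the default.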