[jira] [Commented] (HBASE-13099) Scans as in DynamoDB

2015-02-26 Thread Nicolas Liochon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338467#comment-14338467
 ] 

Nicolas Liochon commented on HBASE-13099:
-

The 1 MB limit could be changed or made configurable.

The scan could finish if we are at the end of a row and one of these conditions 
is met:
 - we already have more than XX MB
 - the scan has been running for more than YY seconds
 - the scan reached the end of a region

This could simplify some code and make the server less sensitive to client 
issues.

This would also allow us to remove the small scan code in the client (and all 
the clients that are doing small scans without setting the small flag would 
get faster).
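
The stop rule listed above can be written down as a small predicate. This is 
only a sketch: ScanLimits, shouldReturn and all parameter names are made up 
for illustration and are not taken from the HBase code base.

```java
// Hypothetical sketch: none of these names come from the HBase code base.
final class ScanLimits {
    final long maxBytes;   // the "XX MB" threshold, expressed in bytes
    final long maxMillis;  // the "YY seconds" threshold, expressed in milliseconds

    ScanLimits(long maxBytes, long maxMillis) {
        this.maxBytes = maxBytes;
        this.maxMillis = maxMillis;
    }

    /**
     * Returns true when the server should stop and send back what it has.
     * A response is only cut at a row boundary, so rows stay whole.
     */
    boolean shouldReturn(boolean atEndOfRow, long bytesSoFar,
                         long elapsedMillis, boolean atEndOfRegion) {
        if (!atEndOfRow) {
            return false;
        }
        return bytesSoFar > maxBytes
            || elapsedMillis > maxMillis
            || atEndOfRegion;
    }
}
```

With limits of 1 MB and 10 seconds, a scanner that holds 2 MB in the middle of 
a row keeps going, and returns as soon as that row is complete.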





 Scans as in DynamoDB
 

 Key: HBASE-13099
 URL: https://issues.apache.org/jira/browse/HBASE-13099
 Project: HBase
  Issue Type: Brainstorming
  Components: Client, regionserver
Reporter: Nicolas Liochon

 cc: [~saint@gmail.com] - as discussed offline.
 DynamoDB has a very simple way to manage scans server side:
 ??citation??
 The data returned from a Query or Scan operation is limited to 1 MB; this 
 means that if you scan a table that has more than 1 MB of data, you'll need 
 to perform another Scan operation to continue to the next 1 MB of data in the 
 table.
 If you query or scan for specific attributes that match values that amount to 
 more than 1 MB of data, you'll need to perform another Query or Scan request 
 for the next 1 MB of data. To do this, take the LastEvaluatedKey value from 
 the previous request, and use that value as the ExclusiveStartKey in the next 
 request. This will let you progressively query or scan for new data in 1 MB 
 increments.
 When the entire result set from a Query or Scan has been processed, the 
 LastEvaluatedKey is null. This indicates that the result set is complete 
 (i.e. the operation processed the “last page” of data).
 ??citation??
 This means that there is no state server side: the work is done client side.
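
 To make the quoted protocol concrete, here is a minimal sketch of the 
 client-side loop. It assumes an in-memory sorted map stands in for the table; 
 PagedTable, Page, scan and scanAll are made-up names, and the page limit 
 counts characters of the values rather than real bytes. It is not the 
 DynamoDB or HBase client API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical sketch: an in-memory stand-in for a table, not a real client API.
final class PagedTable {
    /** One page of results plus the key to resume from (null when done). */
    static final class Page {
        final List<String> values = new ArrayList<>();
        String lastEvaluatedKey;
    }

    private final NavigableMap<String, String> rows = new TreeMap<>();
    private final int maxBytesPerPage;

    PagedTable(int maxBytesPerPage) {
        if (maxBytesPerPage <= 0) {
            throw new IllegalArgumentException("page limit must be positive");
        }
        this.maxBytesPerPage = maxBytesPerPage;
    }

    void put(String key, String value) {
        rows.put(key, value);
    }

    /** Stateless scan: everything needed to resume is in exclusiveStartKey. */
    Page scan(String exclusiveStartKey) {
        NavigableMap<String, String> tail = exclusiveStartKey == null
            ? rows
            : rows.tailMap(exclusiveStartKey, false);
        Page page = new Page();
        int bytes = 0;
        for (Map.Entry<String, String> e : tail.entrySet()) {
            if (bytes >= maxBytesPerPage) {
                return page;  // page is full and more rows remain
            }
            page.values.add(e.getValue());
            bytes += e.getValue().length();
            page.lastEvaluatedKey = e.getKey();
        }
        page.lastEvaluatedKey = null;  // reached the end: result set is complete
        return page;
    }

    /** Client-side loop: keep asking until lastEvaluatedKey comes back null. */
    static List<String> scanAll(PagedTable table) {
        List<String> all = new ArrayList<>();
        String startKey = null;
        do {
            Page page = table.scan(startKey);
            all.addAll(page.values);
            startKey = page.lastEvaluatedKey;
        } while (startKey != null);
        return all;
    }
}
```

 The server object keeps nothing between calls to scan; the only thing carried 
 from one request to the next is the key held by the client.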



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13099) Scans as in DynamoDB

2015-02-25 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336521#comment-14336521
 ] 

stack commented on HBASE-13099:
---

We use the state of the Result (null, empty) to flag the state of the scan on 
the client side. [~jonathan.lawlor] is now adding a 'partial' flag on Result to 
do 'chunking': it indicates that the Result is a partial view of the row, which 
a client probably doesn't care about but the running Scan does (this flag is 
overloaded).

Where would we tag on the LastEvaluatedKey?  Would it just be the last KV in 
the Result?  Could client-side scan read this and use it going back to the 
server?

Disconnecting client and server would be good.

On serverside, when a lease expires, we do this to clean up outstanding region 
scanners:

@Override
public synchronized void close() {
  if (storeHeap != null) {
    storeHeap.close();
    storeHeap = null;
  }
  if (joinedHeap != null) {
    joinedHeap.close();
    joinedHeap = null;
  }
  // no need to synchronize here.
  scannerReadPoints.remove(this);
  this.filterClosed = true;
}

We probably need to keep the above, or at least revisit it too. A timer on the 
server-side scanner that returns after 10 seconds or 1 MB is coming up in 
issues elsewhere. The server-side lease-checking facility might be the place to 
do this; it already tries to clean up expired server-side scanners. It could 
periodically check where outstanding scans are. It is probably better, though, 
to rip out this lease checking and move the checks into the region scanner 
itself: it knows where it is, so rather than have a foreign thread interrupt 
it, it can interrupt itself (this works unless the scanner gets stuck, but I'd 
guess a Lease interrupting a running scanner probably doesn't work either).



[jira] [Commented] (HBASE-13099) Scans as in DynamoDB

2015-02-25 Thread Jonathan Lawlor (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336928#comment-14336928
 ] 

Jonathan Lawlor commented on HBASE-13099:
-

Interesting idea. This seems like it would make the client-server interaction 
during Scans much cleaner. Instead of assuming that the server understands the 
state that the Client thinks it is in, it would be much more explicit, along 
the lines of "I am in this state, give me these Results".

We would probably want the LastEvaluatedKey to be an extra parameter in the RPC 
response, rather than assumed to be the last KV in the Result. I think this 
would be preferable because it is possible that keys further down in the table 
were evaluated but filtered out. If we assume it to be the last KV in the 
Result, we may find that we are constantly rescanning KVs that were previously 
excluded, only to find out that they will still be excluded.

Moving the state from the server to the client would require adding more 
parameters into the RPC response. As mentioned above, LastEvaluatedKey would 
likely be one of the parameters. Another parameter would likely be the MVCC 
read point that is currently maintained within the RegionScanner.

While this would make the interactions cleaner, I wonder how this would affect 
the performance of Scans. As I currently imagine it (correct me if I'm wrong), 
we would incur extra overhead on each scan due to the extra initialization 
required server side. On each scan RPC we would need to create a new 
RegionScanner, set up the key value heaps, seek to the correct row, and then 
potentially filter out the key values that we have already evaluated. This 
overhead is currently avoided by sending along the open scanner id from the 
client to the server so that the already set up scanner just continues where 
it left off.

If the move to client-side-state could be done without incurring any 
performance loss, I think this would be a great improvement that would make 
scans easier to understand.
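
The argument above for returning LastEvaluatedKey separately from the results 
can be illustrated with a small sketch. Everything here is hypothetical: 
FilteredScan, Response and the "budget" of keys examined per request are 
illustrative stand-ins, not the HBase RPC.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical response shape; field names are illustrative, not HBase's RPC.
final class FilteredScan {
    static final class Response {
        final List<String> results = new ArrayList<>();
        String lastEvaluatedKey;  // last key examined, whether returned or not
        int keysExamined;
    }

    /** Examines at most `budget` keys after exclusiveStartKey in a sorted list. */
    static Response scan(List<String> sortedKeys, String exclusiveStartKey,
                         Predicate<String> filter, int budget) {
        Response r = new Response();
        for (String key : sortedKeys) {
            if (exclusiveStartKey != null && key.compareTo(exclusiveStartKey) <= 0) {
                continue;  // at or before the resume point
            }
            if (r.keysExamined == budget) {
                break;
            }
            r.keysExamined++;
            r.lastEvaluatedKey = key;
            if (filter.test(key)) {
                r.results.add(key);
            }
        }
        return r;
    }
}
```

Take the keys a1, b1, b2, b3, c1, a filter that drops every key starting with 
"b", and a budget of four keys. The first request returns only a1 but reports 
b3 as the last evaluated key. Resuming from b3 examines one key (c1); resuming 
from the last returned key, a1, examines four, three of which were already 
known to be excluded.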



[jira] [Commented] (HBASE-13099) Scans as in DynamoDB

2015-02-25 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337017#comment-14337017
 ] 

Enis Soztutar commented on HBASE-13099:
---

I think we may have to keep at least some state in the server, even if we do a 
cell-based scanner. Our contract is per-row atomicity, so we have to keep track 
of: 
1. read point while scanning inside a row. 
2. low watermark for the read points across all open scanners for the region. 

(1) can even be extended to be a region based contract if we consider atomic 
updates cross-row using the MultiRowMutationEndpoint. (2) is needed for 
effectively getting rid of seqId's of cells in hfiles. 

We keep (1) on the server side right now, and we use the row-based scanner 
contract for (1). The client either gets the whole row, or not. The scanner can 
be restarted across rows, which changes the scanner read point, but that is 
fine since there are no visibility guarantees across rows (excluding single 
region multi-row transactions). 

From a semantics point of view, (1) can be achieved by sending the read 
point to the client every time a scan is started within a region. The client 
will keep track of 1 read point per region. Any subsequent scans performed 
from the client in the region will also send this read point to the server so 
that the scan does not see partial data. (2) can be solved by either not 
deleting seqId's of cells in hfiles (which we do to optimize disk usage), or 
keeping track of all open scanners' read points, which still requires some 
state (even though very small) in the server. 
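
The client-side half of (1) could look roughly like the sketch below. It is 
only an illustration of "1 read point per region": RegionReadPoints, its 
method names and the use of a region name string as the key are all 
assumptions, not existing HBase client code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalLong;

// Hypothetical client-side bookkeeping; all names here are illustrative.
final class RegionReadPoints {
    private final Map<String, Long> readPointByRegion = new HashMap<>();

    /** Read point to send with the next scan request; empty for the first request in a region. */
    OptionalLong readPointFor(String regionName) {
        Long readPoint = readPointByRegion.get(regionName);
        return readPoint == null ? OptionalLong.empty() : OptionalLong.of(readPoint);
    }

    /** Remember the read point the server returned when the scan started in this region. */
    void onScanStarted(String regionName, long serverReadPoint) {
        // Keep the first value so that later requests in the region reuse it.
        readPointByRegion.putIfAbsent(regionName, serverReadPoint);
    }

    /** Forget the region once the scan has moved past it. */
    void onRegionFinished(String regionName) {
        readPointByRegion.remove(regionName);
    }
}
```

Every follow-up request in the same region would carry the value from 
readPointFor, so the server can serve it at the same read point as the first 
request. The low watermark in (2) is not covered by this sketch.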



[jira] [Commented] (HBASE-13099) Scans as in DynamoDB

2015-02-25 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338032#comment-14338032
 ] 

Lars Hofhansl commented on HBASE-13099:
---

That's what small scans do (in a nutshell), when they are not small :)

That does mean that at every 1 MB chunk we need to reseek our 
{region|store|storeFile}Scanners, i.e. the server state allows us to avoid the 
expensive seeking on each RPC. Maybe with 1 MB chunks it does not matter (but 
you can pull 1 MB over 1GbE in about 10ms, which is less than the seek time of 
an HDD).

Some of the chunking logic we get with HBASE-12976.
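
The transfer-time figure above is easy to check by hand. The helper below is a 
throwaway sketch (TransferTime is a made-up name) that ignores latency and 
protocol overhead and assumes the link runs at its full nominal rate.

```java
// Hypothetical helper for a back-of-the-envelope estimate.
final class TransferTime {
    /** Milliseconds needed to move `bytes` over a link of `bitsPerSecond`. */
    static double millis(long bytes, double bitsPerSecond) {
        return bytes * 8.0 / bitsPerSecond * 1000.0;
    }
}
```

TransferTime.millis(1 << 20, 1e9) gives roughly 8.4 ms for 1 MB over a 1 Gbit/s 
link, so the cost of a reseek per chunk is of the same order as shipping the 
chunk itself whenever the reseek has to touch a spinning disk.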

