[hypertable-dev] Re: [hypertable-user] Pseudo-table proposal

Christoph Rupp Wed, 06 Mar 2013 08:54:42 -0800

It reminds me of the /proc filesystem, but also of SQL Views. Basically we
provide additional metadata, and there's a View on that data which looks
and feels like a regular HQL table. I always found Views very useful to
give applications a consistent view of a table even if the underlying table
structure changes between different versions.


In Hypertable a View would just be a dispatcher to either the regular
column families or to the pseudo-tables. And later we could maybe implement
"real" Views if the need comes up.
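
To make the dispatcher idea concrete, here is a minimal sketch in Python. It
assumes the pseudo-table part of a name is everything after the "^" (as in
foo^.cellstore.index from the proposal). The function names and the handler
registry are hypothetical and only for illustration; they are not Hypertable
APIs.

```python
# Hypothetical sketch: none of these names exist in Hypertable.
PSEUDO_SEPARATOR = "^"  # taken from the proposed name foo^.cellstore.index


def split_table_name(name):
    """Split 'foo^.cellstore.index' into ('foo', '.cellstore.index').

    A name without '^' refers to a regular table and yields (name, None).
    """
    base, sep, pseudo = name.partition(PSEUDO_SEPARATOR)
    return (base, pseudo) if sep else (name, None)


def dispatch(name, regular_scan, pseudo_scans):
    """Route a scan to the regular table or to a pseudo-table handler.

    regular_scan: callable taking the base table name.
    pseudo_scans: dict mapping a pseudo-table suffix to a callable
                  taking the base table name.
    """
    base, pseudo = split_table_name(name)
    if pseudo is None:
        return regular_scan(base)
    try:
        handler = pseudo_scans[pseudo]
    except KeyError:
        raise LookupError(
            f"unknown pseudo-table {pseudo!r} for table {base!r}"
        ) from None
    return handler(base)
```

A caller would register one handler per pseudo-table suffix, so adding a new
pseudo-table later would not change the dispatch path.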

bye
Christoph

2013/3/5 Doug Judd <[email protected]>

> This is a proposal for the introduction of *pseudo-tables* into
> Hypertable.  This idea came about when trying to come up with an
> inexpensive way to discover large rows in a table.  We zeroed in on the
> CellStore indexes because they contain information that can be used to
> estimate large rows cheaply.  However, the next question was how do we
> provide access to the CellStore indexes through the API?  Instead of
> adding some special-purpose *ReadCellStoreIndexes* API, I propose that we
> use the existing API as-is and surface the CellStore index information via
> a *pseudo-table*.  A pseudo-table is a virtual table with no real table
> behind it.  When a query comes in for the CellStore index pseudo-table,
> the CellStore indexes will get read directly to satisfy the query.  This
> approach is exactly analogous to the /proc filesystem in Linux
> <http://www.ibm.com/developerworks/library/l-proc/index.html>.
>
> The pseudo-table that represents the CellStore indexes for a given table,
> *foo*, would have the name *foo*^.cellstore.index and the following
> schema:
>
> create table foo^.cellstore.index (
>   Size,
>   CompressedSize,
>   KeyCount
> );
>
> For each column family, there would be one qualified column for each block
> in the CellStore indexes.  The column qualifier would have the format:
> <filename>:<hex-offset>.  Also, the row key would be the same as the row
> key in the CellStore index entries (we assume that's what most people will
> want to aggregate this info on).  So for example, the CellStore index block
> entry for file 2/2/default/ZwmE_ShYJKgim-IL/cs103 at offset 0x28A61 might
> generate the following keys:
>
> [email protected]
>  Size:2/2/default/ZwmE_ShYJKgim-IL/cs103:0000000000028A61    171728
> [email protected]
>  CompressedSize:2/2/default/ZwmE_ShYJKgim-IL/cs103:0000000000028A61  65231
> [email protected]
>  KeyCount:2/2/default/ZwmE_ShYJKgim-IL/cs103:0000000000028A61        281
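
For anyone consuming these cells on the client side, here is a small Python
sketch of how a column such as Size:<filename>:<hex-offset> could be taken
apart. It assumes the column family never contains a colon and that the hex
offset is the last colon-separated field. The 16-digit zero padding is
inferred from the example above, not from a spec. The helper names are
hypothetical.

```python
def parse_column(column):
    """Split '<family>:<filename>:<hex-offset>' into its three parts.

    Returns (family, filename, offset) with the offset as an int.
    The family is taken up to the first colon and the offset after the
    last colon, so a filename containing colons would still parse.
    """
    family, sep, qualifier = column.partition(":")
    if not sep:
        raise ValueError(f"no qualifier in column {column!r}")
    filename, sep, hex_offset = qualifier.rpartition(":")
    if not sep:
        raise ValueError(f"no offset in qualifier {qualifier!r}")
    return family, filename, int(hex_offset, 16)


def format_qualifier(filename, offset):
    """Build '<filename>:<hex-offset>' with a 16-digit uppercase offset.

    The width of 16 is an assumption based on the example keys.
    """
    return f"{filename}:{offset:016X}"
```

With the example entry, parse_column would return the family "Size", the file
2/2/default/ZwmE_ShYJKgim-IL/cs103 and the offset 0x28A61.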
>
> To query the cellstore.index pseudo-table for table *foo* to find an
> estimate of large rows, you would issue a query along the lines of the
> following:
>
> SELECT sum(Size) FROM foo^.cellstore.index WHERE sum(Size) > 100000000;
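
To illustrate what the query above is meant to compute, here is a
client-side equivalent as a Python sketch. It assumes the cells come back as
(row, column, value) tuples; that shape, the function name and the sample
data in the usage note are made up for illustration. Since each cell
describes one index block, the per-row sum is only an estimate of the row
size, as in the proposal.

```python
from collections import defaultdict


def large_rows(cells, threshold):
    """Sum the Size cells per row and keep rows whose total exceeds threshold.

    cells: iterable of (row, column, value) tuples, where column looks
           like '<family>:<filename>:<hex-offset>'.
    Returns a dict mapping row key to the summed Size.
    """
    totals = defaultdict(int)
    for row, column, value in cells:
        family = column.partition(":")[0]
        if family == "Size":
            totals[row] += int(value)
    return {row: total for row, total in totals.items() if total > threshold}
```

For example, two Size cells of 60000000 and 50000000 on the same row would
put that row over the 100000000 threshold used in the query, while
CompressedSize and KeyCount cells are ignored.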
>
> Please respond with feedback or if you have any questions.  Thanks!
>
> - Doug
>
>  --
> You received this message because you are subscribed to the Google Groups
> "Hypertable User" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/hypertable-user?hl=en.
> For more options, visit https://groups.google.com/groups/opt_out.

-- 
You received this message because you are subscribed to the Google Groups 
"Hypertable Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/hypertable-dev?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.
