On Wed, Jul 27, 2016 at 10:56 PM, Jacob Willoughby
<jake.willoug...@vivint.com> wrote:
> I have been investigating cassandra to store small objects as a trivial
> replacement for s3.  GET/PUT/DELETE are all easy, but LIST is what is
> tripping me up.
>
>
> S3 does a hierarchical list that kinda simulates traversing folders.
>
> http://docs.aws.amazon.com/AmazonS3/latest/dev/ListingKeysHierarchy.html
>
>
> So say my schema is this:
>
> CREATE TABLE "stuff" (key BLOB PRIMARY KEY, value BLOB)
>
>
> I know that the prefix part is easy with a ByteOrderedPartitioner (and
> possibly with a secondary index in Cassandra 3.x? ).  What trips me up is
> the delimiter part.

I don't think either of those options is a good one.

> I have looked at a handful of open source projects that are s3 clones and
> use cassandra, and they seem to do the prefix match then manually search for
> the delimiter.  I have looked at doing a UDA, but they also seem to send all
> of the data to a single node to do the aggregation.
>
>
> What I am hoping to do is achieve what S3 does: "List performance is not
> substantially affected by the total number of keys in your bucket, nor by
> the presence or absence of the prefix, marker, maxkeys, or delimiter
> arguments." (
>
> http://docs.aws.amazon.com/AmazonS3/latest/dev/ListingKeysUsingAPIs.html)
>
>
> Is there some sort of denormalization, indexing, querying that I am missing
> that might help solve this?  I think if UDA's could do some summary
> operation on each node before returning it then aggregating the results it
> would work, but as far as I know that isn't possible.  It seems like a
> binary search of each partition involved in the list prefix would be a
> really quick and easy way to return the first 1000 results.
>
>
> Is this even possible using cassandra?

Perhaps you could just store the objects as a simple key-value
representation (like your "stuff" table above), and then separately
index the components of the key (presumably in another table), using
adjacency lists (https://en.wikipedia.org/wiki/Adjacency_list).
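
As a rough illustration of the adjacency-list idea, here is a minimal in-memory sketch in Python. The two dicts are hypothetical stand-ins for two Cassandra tables; the names (objects, children, put, list_level), the "/" delimiter, and the use of strings rather than blobs for keys are all assumptions made for the example, not anything from your schema.

```python
from collections import defaultdict

DELIM = "/"  # assumed delimiter; S3 lets the caller choose one

# Hypothetical stand-ins for two Cassandra tables:
#   objects:  key -> value                         (the "stuff" table)
#   children: parent prefix -> set of child names  (the adjacency list)
objects = {}
children = defaultdict(set)


def put(key, value):
    """Store the object and record one parent -> child edge per key level."""
    objects[key] = value
    parts = key.split(DELIM)
    parent = ""
    for i, part in enumerate(parts):
        is_leaf = i == len(parts) - 1
        # Inner components keep a trailing delimiter, like S3 common prefixes.
        name = part if is_leaf else part + DELIM
        children[parent].add(name)
        parent += name


def list_level(prefix="", max_keys=1000):
    """Return (keys, common_prefixes) found directly under prefix."""
    names = sorted(children.get(prefix, ()))[:max_keys]
    keys = [prefix + n for n in names if not n.endswith(DELIM)]
    common = [prefix + n for n in names if n.endswith(DELIM)]
    return keys, common


put("photos/2016/a.jpg", b"...")
put("photos/2016/b.jpg", b"...")
put("photos/2015/c.jpg", b"...")
put("readme.txt", b"...")
print(list_level(""))              # top level of the "bucket"
print(list_level("photos/2016/"))  # one "folder" down
```

In Cassandra the parent would presumably be the partition key and the child name a clustering column, so that a listing reads a single partition in sorted order and its cost depends on the number of children at that level rather than on the total number of keys. The sketch only handles prefixes that end on a delimiter boundary, and it leaves out deletes, which would also need to remove edges once a level becomes empty.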


-- 
Eric Evans
john.eric.ev...@gmail.com
