Re: GeodeRedisAdapter improvments/feedback

Dan Smith Wed, 15 Feb 2017 13:30:25 -0800

The spill/unspill option could be pretty tricky to implement, since you'd
have to do a lot of fancy logic in the transition period. I think Jason's
suggestion of making it configurable might make more sense.
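
To make the "fancy logic" concrete, here is a rough Python sketch. It is
illustrative only: SpillPolicy, its fields, and the sizes are made up for
this example and are not part of the adapter. It counts how many conversions
a collection triggers when its size hovers around a threshold, first with a
single threshold and then with separate spill and unspill thresholds (a
hysteresis band, which nobody in this thread has proposed).

```python
from dataclasses import dataclass


@dataclass
class SpillPolicy:
    """Hypothetical policy deciding whether a collection gets its own region."""
    spill_at: int    # convert entry -> region when size rises above this
    unspill_at: int  # convert region -> entry when size falls below this
    spilled: bool = False
    conversions: int = 0

    def observe(self, size: int) -> bool:
        """Record a new collection size and convert if a threshold is crossed."""
        if not self.spilled and size > self.spill_at:
            self.spilled = True
            self.conversions += 1
        elif self.spilled and size < self.unspill_at:
            self.spilled = False
            self.conversions += 1
        return self.spilled


def count_conversions(policy: SpillPolicy, sizes) -> int:
    """Feed a sequence of sizes to the policy and return the conversion count."""
    for size in sizes:
        policy.observe(size)
    return policy.conversions


# A collection teetering around a threshold of 100 elements.
sizes = [90, 101, 99, 102, 98, 103, 97]

single = SpillPolicy(spill_at=100, unspill_at=100)  # one shared threshold
banded = SpillPolicy(spill_at=100, unspill_at=50)   # hysteresis band

print(count_conversions(single, sizes))  # converts on every crossing
print(count_conversions(banded, sizes))  # converts once, then stays spilled
```

Even with the band, each conversion still has to move the data and deal with
operations that arrive mid-move, which a configured "large"/"small" choice
avoids entirely.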


-Dan

On Wed, Feb 15, 2017 at 1:12 PM, Jason Huynh <jhu...@pivotal.io> wrote:

> With the suggestion from Wes, the constraint on the names would have to
> apply for both small and large.  We wouldn't want the thing to explode when
> it gets converted...
>
> Is there a way to just make it configurable?  If they know they want a
> "large" set, somehow let them specify it.  Otherwise go with the "small"
> set?
>
> On Wed, Feb 15, 2017 at 1:01 PM Real Wes <thereal...@outlook.com> wrote:
>
> > Thinking about this, I think that the “spill”/ “unspill” option may
> > actually be the best solution.  If the criteria waffles back and forth
> > along the threshold, well, that’s the acceptable worst case.
> >
> > How’s this?:
> >
> > 1) Create a separate region for the collection key
> >      - for fat collections that are updated frequently
> > ADVANTAGE: speed of replication
> > DISADVANTAGE: constraint on key name
> >
> > 2) Put the collection as an entry value:
> >    - for small collections and read-only fat collections
> > ADVANTAGE: no need to create a separate region
> >
> > We would track the metrics and automatically convert based on a
> > combination of frequency of updates and size.
> >
> > We next define what a fat collection is, such as over nnMB.
> >
> >
> > On Feb 14, 2017, at 8:12 PM, Jason Huynh <jhu...@pivotal.io> wrote:
> >
> > My concern about a spill-over threshold is: do you also "unspill"?  What
> > if the collection contracts under the threshold and teeters around it?
> > If the user can configure this size, then wouldn't they just know whether
> > they want a "large" vs. a "small" collection?
> >
> > I think Swapnil makes a good point that our value add would be that we can
> > scale those structures, whereas redis can already do what the "new"
> > implementation is doing.
> >
> >
> >
> > On Tue, Feb 14, 2017 at 4:59 PM Galen M O'Sullivan <gosulli...@pivotal.io> wrote:
> > If we put them in separate regions, we'll have the overhead of looking up
> > in two regions added to each and every operation, and the overhead of
> > creating all these regions.
> >
> > If we really wanted to we could have some threshold at which we spill
> > collections over into their own regions, and have something like the best
> > of both worlds. It's more complex, though, and I don't know how many people
> > actually use truly huge collections.
> >
> > On Tue, Feb 14, 2017 at 4:21 PM, Hitesh Khamesra <hitesh...@yahoo.com.invalid> wrote:
> >
> > > Jason/Dan: Sorry to hear about that. But both of you have asked the right
> > > question.
> > > It depends on your use case (items 2, 3, 4, 5). For example, "hashes" can
> > > be used to define a key-value pair or a java bean. In this case it is
> > > probably better to keep that hash at the region-entry level.  But if you
> > > want to know the top 10 tweets which are trending, then you probably want
> > > to use a partition-region for the "sorted-set".
> > >
> > >
> > >       From: Jason Huynh <jhu...@pivotal.io>
> > >  To: dev@geode.apache.org; "u...@geode.apache.org" <u...@geode.apache.org>;
> > > Hitesh Khamesra <hitesh...@yahoo.com>
> > >  Sent: Tuesday, February 14, 2017 3:15 PM
> > >  Subject: Re: GeodeRedisAdapter improvments/feedback
> > >
> > > Hi Hitesh,
> > >
> > > Not sure about everyone else, but I had a hard time reading this; however,
> > > I think I figured out what you were describing... the only part I am still
> > > unsure about is "Feedback/vote: both behaviour is desirable".  Do you mean
> > > you want feedback and voting on whether both behaviors are desired?  As in
> > > the old implementation and the new implementation?
> > >
> > > 2,3,4)  The new implementation would mean all the data for a specific data
> > > structure is contained in a single bucket.  So the individual data
> > > structures are not quite scalable.  How would you allow scaling of a single
> > > data structure?
> > >
> > > On Tue, Feb 14, 2017 at 3:05 PM Real Wes <thereal...@outlook.com> wrote:
> > >
> > > > In what format do you want the feedback, Hitesh?  For now I’ll just comment:
> > > >
> > > > 1. Redis Type String
> > > > No comments except that a future Geode value-add would be to extend the
> > > > Jedis client so that the K/Vs are not compressed. In this way OQL and CQ
> > > > will work.  The tradeoff is that the data cannot be read by a native
> > > > redis client, but for Geode users it’s great. Call the new client
> > > > Geodis.
> > > >
> > > > 2. List/ Hash/ Set/ SortedSet
> > > > Creating a separate region for each creates a constraint that the keys
> > > > are limited to the characters allowed in region names, which are
> > > > A-z/0-9/ - and _. Everything else is out. Redis users might start asking
> > > > why their list named ++^^/## throws an error. Your suggestion to make it
> > > > a key rather than a region solves this. Furthermore, creating a new
> > > > region every time a new Redis collection is created is going to be slow.
> > > > I’m not sure why a region was created but I’m sure it made sense to the
> > > > developer at the time.
> > > >
> > > > 7. Default Config
> > > > Can’t we configure a gfsh option to default to the region types we want?
> > > > Customer A will want PARTITION but Customer B will want
> > > > PARTITION_REDUNDANT_EXPIRATION_PERSISTENT.  I wonder if we can consider a
> > > > geode> create region --redisType=PARTITION_REDUNDANT_EXPIRATION_PERSISTENT
> > > > that makes _all_ Redis regions of that type?
> > > >
> > > >
> > > >
> > > > On Feb 14, 2017, at 5:36 PM, Hitesh Khamesra <hitesh...@yahoo.com> wrote:
> > > >
> > > > Current GeodeRedisAdapter implementation is based on
> > > > https://cwiki.apache.org/confluence/display/GEODE/Geode+Redis+Adapter+Proposal
> > > > .
> > > > We are looking for some feedback on Redis commands and their mapping to
> > > > geode region.
> > > >
> > > > 1. Redis Type String
> > > >  a. Usage Set k1 v1
> > > >  b. Current implementation creates "STRING_REGION" geode-partition-region
> > > > upfront
> > > >  c. This k1/v1 are geode-region key/value
> > > >  d. Any feedback?
> > > >
> > > > 2. List Type
> > > >  a. usage "rpush mylist A"
> > > >  b. Current implementation maps each list to geode-partition-region (i.e.
> > > > mylist is geode-partition-region), with the ability to get item from
> > > > head/tail
> > > >  c. Feedback/vote
> > > >      -- List type operation at region-entry level;
> > > >      -- region-key = "mylist"
> > > >      -- region-value = Arraylist (will support all redis list ops)
> > > >  d. Feedback/vote: both behavior is desirable
> > > >
> > > >
> > > > 3. Hashes
> > > >  a. this represents field-value or java bean object
> > > >  b. usage "hmset user1000 username antirez birthyear 1977 verified 1"
> > > >  c. Current implementation maps each hashes to
> > > > geode-partition-region(i.e. user1000 is geode-partition-region)
> > > >  d. Feedback/vote
> > > >    -- Should we map hashes to region-entry
> > > >    -- region-key = user1000
> > > >    -- region-value = map
> > > >    -- This will provide java bean sort of behaviour with 10s of
> > > > field-values
> > > >    -- Personally I would prefer this..
> > > >  e. Feedback/vote: both behaviour is desirable
> > > >
> > > > 4. Sets
> > > >  a. This represents unique keys in set
> > > >  b. usage "sadd myset 1 2 3"
> > > >  c. Current implementation maps each sadd to geode-partition-region (i.e.
> > > > myset is geode-partition-region)
> > > >  d. Feedback/vote
> > > >    -- Should we map set to region-entry
> > > >    -- region-key = myset
> > > >    -- region-value = Hashset
> > > >  e. Feedback/vote: both behaviour is desirable
> > > >
> > > > 5. SortedSets
> > > >  a. This represents unique keys in a set with scores (use case: query
> > > > top-10)
> > > >  b. usage "zadd hackers 1940 "Alan Kay""
> > > >  c. Current implementation maps each zadd to geode-partition-region (i.e.
> > > > hackers is geode-partition-region)
> > > >  d. Feedback/vote
> > > >    -- Should we map set to region-entry
> > > >    -- region-key = hackers
> > > >    -- region-value = Sorted Hashset
> > > >  e. Feedback/vote: both behaviour is desirable
> > > >
> > > > 6. HyperLogLogs
> > > >  a. A HyperLogLog is a probabilistic data structure used in order to
> > > > count unique things (technically this is referred to as estimating the
> > > > cardinality of a set).
> > > >  b. usage "pfadd hll a b c d"
> > > >  c. Current implementation creates "HLL_REGION" geode-partition-region
> > > > upfront
> > > >  d. hll becomes region-key and value is HLL object
> > > >  e. any feedback?
> > > >
> > > > 7. Default config for geode-region (vote)
> > > >    a. partition region
> > > >    b. 1 redundant copy
> > > >    c. Persistence
> > > >    d. Eviction
> > > >    e. Expiration
> > > >    f. ?
> > > >
> > > > 8. It seems redis knows the type (list, hashes, string, set, ...) of each
> > > > key. Thus for each operation we need to check the type of the key. In the
> > > > current implementation we have a different region for each redis type, so
> > > > we have another region (metaTypeRegion) which keeps the type for each key.
> > > > This makes any operation in geode slow, as it needs to verify that type.
> > > > For instance, creating a new key needs to check whether it is already
> > > > there or not. Should we allow type change or not?
> > > >  a. Feedback/vote
> > > >      -- type change of key
> > > >      -- Can we allow two keys with the same name but two different types
> > > > (as they will end up in two different geode-regions)
> > > >        String type "key1" in string region
> > > >        HLL type "key1" in HLL region
> > > >  b. any other feedback
> > > >
> > > > 9. Transactions:
> > > >  a. we will not support transactions in redisAdapter, as geode
> > > > transactions are limited to a single node.
> > > >  b. feedback?
> > > >
> > > > 10. Redis COMMAND (https://redis.io/commands/command)
> > > >  a. should we implement this "COMMAND" ?
> > > >
> > > > 11. Any other redis command we should consider?
> > > >
> > > >
> > > > Thanks.
> > > > Hitesh
> > > >
> > > >
> > > >
> > >
> > >
> > >
> >
> >
>
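
For reference, the key-name constraint and the configurable "large"/"small"
choice discussed in the quoted thread could be combined roughly as follows.
This is a hypothetical Python sketch: the allowed character set is taken from
Wes's description above and is not checked against Geode's actual region-name
rules, and choose_storage and its wants_large flag are invented for
illustration.

```python
import re

# Assumption: the allowed characters are exactly those named in the thread
# (letters, digits, '-' and '_'); Geode's real rules may differ.
_REGION_NAME = re.compile(r"[A-Za-z0-9_-]+")


def can_be_region_name(redis_key: str) -> bool:
    """Return True if the whole key uses only the assumed allowed characters."""
    return _REGION_NAME.fullmatch(redis_key) is not None


def choose_storage(redis_key: str, wants_large: bool) -> str:
    """Pick where a collection lives, based on an explicit user setting.

    "region": the collection gets its own region (name constraint applies).
    "entry":  the collection is stored as a single region-entry value.
    """
    if not wants_large:
        return "entry"
    if not can_be_region_name(redis_key):
        # Fail at creation time instead of during a later conversion.
        raise ValueError(f"key {redis_key!r} cannot be used as a region name")
    return "region"


print(choose_storage("mylist", wants_large=True))
print(choose_storage("++^^/##", wants_large=False))
```

With an explicit setting, only collections declared "large" are subject to the
name constraint, and an invalid name is rejected up front.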
