Re: Custom data-types

2014-09-06 Thread Alex De la rosa
Hi there,

Can somebody explain the use of custom search schemas? I still don't get
why I would want a custom schema if the default schema seems to be
able to give me the info for all the fields I have in my object.

Thanks!
Alex


On Fri, Aug 29, 2014 at 4:48 PM, Alex De la rosa alex.rosa@gmail.com
wrote:

 Hi Sean,

 Seems I was wrong. That makes total sense now that you have explained it. It
 looked like too good a feature to me, and it seems it is not that easy.

 By the way, how do schemas really work for Riak Search? I went back
 and read the documentation but didn't see a real difference from using the
 default schema.

 Thanks!
 Alex


 On Fri, Aug 29, 2014 at 3:36 PM, Sean Cribbs s...@basho.com wrote:

 Alex,

 In short, no, you can't create custom types through schemas. Schemas
 currently only refer to Riak Search 2.

 We would love that too, but it hasn't happened yet. The problem is not
 conceiving of a data type but making its behavior both sensible and
 convergent in the face of concurrent activity or network partitions.
 For instance, say that two tweets come in around the same time. Who
 goes first in the stack you described? How can multiple independent
 copies reason about which ones to drop from the bottom of the stack to
 keep it bounded to 100? What happens if a replica is separated from
 the others for a while and has really stale entries? Is it valid to
 serve those to a user? What happens when one replica pushes an element
 and another one pops it at the same time?

 These sound like they might be trivial problems, but they are
 incredibly hard to reason about in the general case. You have to
 reason about the ordering of events, the scope of their effects, and
 decide on a least-surprising behavior to expose to the user. Although
 we have given a pretty familiar/friendly interface to the data types
 shipping in 2.0, their behavior is strictly different from the types
 you would use in a single-threaded program in local memory.
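
To make the questions above concrete, here is a minimal Python sketch of one way a bounded "last N" collection could be made to converge. It is not a Riak data type and not something the thread proposes: the `(timestamp, tweet_id)` entry shape, the `merge`, `add` and `newest_first` helpers, and the small `LIMIT` are all assumptions made for illustration. The idea is to treat each copy as a set, merge by union, and always keep only the newest entries, so the result does not depend on the order in which replicas exchange state.

```python
from typing import FrozenSet, List, Tuple

# Hypothetical entry shape: (client timestamp, tweet id). Tuples compare by
# timestamp first and id second, so ties resolve the same way on every replica.
Entry = Tuple[int, str]


def merge(a: FrozenSet[Entry], b: FrozenSet[Entry], limit: int) -> FrozenSet[Entry]:
    """Union both copies, then keep only the `limit` newest entries."""
    return frozenset(sorted(a | b, reverse=True)[:limit])


def add(state: FrozenSet[Entry], entry: Entry, limit: int) -> FrozenSet[Entry]:
    """A local write is modelled as a merge with a one-element state."""
    return merge(state, frozenset([entry]), limit)


def newest_first(state: FrozenSet[Entry]) -> List[str]:
    """Read the ids back in newest-first order."""
    return [tweet_id for _, tweet_id in sorted(state, reverse=True)]


LIMIT = 3  # small bound for illustration; the thread asks for 100
base = frozenset([(1, "t1"), (2, "t2")])

# Two replicas accept different tweets carrying the same timestamp.
replica_a = add(base, (3, "t3a"), LIMIT)
replica_b = add(base, (3, "t3b"), LIMIT)

# A stale replica that missed everything after t1 rejoins later.
stale = frozenset([(1, "t1")])

merged_ab = merge(replica_a, replica_b, LIMIT)
merged_ba = merge(replica_b, replica_a, LIMIT)
merged_all = merge(merge(stale, replica_b, LIMIT), replica_a, LIMIT)

print(newest_first(merged_ab))  # ['t3b', 't3a', 't2']
```

All three merge orders give the same state, but only because the sketch sidesteps the hard parts: it trusts client-supplied timestamps, breaks ties arbitrarily by id, and has no pop or delete operation at all. A replica that has not merged yet will still serve stale entries.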

 On Thu, Aug 28, 2014 at 4:47 PM, Alex De la rosa
 alex.rosa@gmail.com wrote:
  Hi there,
 
  Correct me if I'm wrong, but I think I read somewhere that custom data-types
  can be created through schemas or something like that. So, apart from
  COUNTERS, SETS and MAPS we could have some custom defined ones.
 
  I would love to have a STACKS data-type that would work like a FIFO stack,
  so I could save the last 100 objects for some action. Imagine we are
  building Twitter where millions of tweets are sent all the time, but we want
  to quickly know the last 100 tweets for a user. Imagine something like:
 
  obj.stacks['last_tweets'].add(id_of_last_tweet)
 
  IN: last_tweet --- STACK_OF_100_TWEETS --- OUT: older than the 100th goes out
 
  Is this possible? If so, how to do it?
 
  Thanks and Best Regards,
  Alex
 
  ___
  riak-users mailing list
  riak-users@lists.basho.com
  http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
 



 --
 Sean Cribbs s...@basho.com
 Software Engineer
 Basho Technologies, Inc.
 http://basho.com/





Re: Custom data-types

2014-09-06 Thread Luke Bakken
Alex,

Custom schemas allow you to only index a subset of your object's data
(saving disk space). They also allow data type specification, field
copying (to have full-text search across your object easily), and
several other features.

The Solr documentation has more information here:

https://cwiki.apache.org/confluence/display/solr/Documents%2C+Fields%2C+and+Schema+Design
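
As a rough mental model of those three features, the toy Python sketch below mimics what a schema decides at indexing time. It does not use the Solr or Riak APIs (real Riak Search schemas are Solr XML files); the `SCHEMA_FIELDS` mapping, the `COPY_TO_TEXT` list, the `index_document` helper and the sample object are all hypothetical.

```python
from typing import Any, Callable, Dict, List

# Hypothetical toy schema: which fields get indexed and how each is coerced.
# This only mimics the ideas; it is not how a Solr schema is written.
SCHEMA_FIELDS: Dict[str, Callable[[Any], Any]] = {
    "name": str,
    "age": int,
    "bio": str,
}

# Source fields copied into one catch-all field for full-text queries.
COPY_TO_TEXT: List[str] = ["name", "bio"]


def index_document(obj: Dict[str, Any]) -> Dict[str, Any]:
    """Return only what the toy schema says should be indexed."""
    indexed = {
        field: coerce(obj[field])
        for field, coerce in SCHEMA_FIELDS.items()
        if field in obj
    }
    # Field copying: one extra field holding the text of several others.
    indexed["text"] = " ".join(
        str(obj[field]) for field in COPY_TO_TEXT if field in obj
    )
    return indexed


doc = {"name": "Alex", "age": "31", "bio": "Builds things", "avatar": "<large blob>"}
indexed = index_document(doc)
print(indexed)
```

Here the `avatar` field is never indexed (the disk-space saving), `age` is stored as an integer rather than a string (type specification), and `text` lets a single query match across `name` and `bio` (field copying).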

--
Luke Bakken
Engineer / CSE
lbak...@basho.com



Re: Custom data-types

2014-09-06 Thread Alex De la rosa
Hi Luke,

That seems useful :) will check the Solr documentation!

Thanks!
Alex

