Re: Custom data-types
Hi there,

Can somebody explain the use of custom search schemas? I still don't get why I would want a custom schema if the default schema seems able to get me the info of all the fields I have in my object.

Thanks!
Alex

On Fri, Aug 29, 2014 at 4:48 PM, Alex De la rosa alex.rosa@gmail.com wrote:

Hi Sean,

It seems I was wrong. That makes total sense now that you have explained it; it looked like too good a feature to me, but it seems it is not that easy.

By the way, how do schemas really work for Riak Search? I went back and read the documentation but didn't see a real difference from using the default schema.

Thanks!
Alex

On Fri, Aug 29, 2014 at 3:36 PM, Sean Cribbs s...@basho.com wrote:

Alex,

In short, no, you can't create custom types through schemas. Schemas currently only refer to Riak Search 2. We would love that too, but it hasn't happened yet.

The problem is not conceiving of a data type but making its behavior both sensible and convergent in the face of concurrent activity or network partitions. For instance, say that two tweets come in around the same time. Who goes first in the stack you described? How can multiple independent copies reason about which ones to drop from the bottom of the stack to keep it bounded to 100? What happens if a replica is separated from the others for a while and has really stale entries? Is it valid to serve those to a user? What happens when one replica pushes an element and another one pops it at the same time?

These sound like they might be trivial problems, but they are incredibly hard to reason about in the general case. You have to reason about the ordering of events, the scope of their effects, and decide on a least-surprising behavior to expose to the user. Although we have given a pretty familiar/friendly interface to the data types shipping in 2.0, their behavior is strictly different from the types you would use in a single-threaded program in local memory.

On Thu, Aug 28, 2014 at 4:47 PM, Alex De la rosa alex.rosa@gmail.com wrote:

Hi there,

Correct me if I'm wrong, but I think I read somewhere that custom data-types can be created through schemas or something like that. So, apart from COUNTERS, SETS and MAPS, we could have some custom-defined ones.

I would love to have a STACKS data-type that would work like a FIFO stack, so I could save the last 100 objects for some action. Imagine we are building Twitter, where millions of tweets are sent all the time, but we want to quickly know the last 100 tweets for a user.

Imagine something like:

    obj.stacks['last_tweets'].add(id_of_last_tweet)

    IN: last_tweet --- STACK_OF_100_TWEETS --- OUT: older than the 100th goes out

Is this possible? If so, how to do it?

Thanks and Best Regards,
Alex

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

--
Sean Cribbs s...@basho.com
Software Engineer
Basho Technologies, Inc.
http://basho.com/
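To make Sean's questions about ordering and bounding a bit more concrete, here is a minimal sketch of one way a "last N" container could be made convergent. It is not a Riak data type and not anything Basho ships; the class `BoundedRecent`, its methods, and the `(timestamp, replica_id, tweet_id)` entry layout are all hypothetical names chosen for illustration. It assumes each entry carries a timestamp and that ties are broken by replica and tweet id, so every replica agrees on the order. It also sidesteps the push/pop question entirely by not supporting removal.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple

# Hypothetical entry layout: (timestamp, replica_id, tweet_id).
# Tuples compare by timestamp first; replica_id and tweet_id act as
# deterministic tie-breakers for tweets that arrive "at the same time".
Entry = Tuple[int, str, str]


@dataclass(frozen=True)
class BoundedRecent:
    """Illustrative state-based 'last N' container (not a Riak type)."""

    limit: int = 100
    entries: FrozenSet[Entry] = frozenset()

    def add(self, timestamp: int, replica_id: str, tweet_id: str) -> "BoundedRecent":
        # Adding is a union with a single new entry, followed by trimming.
        return self._trim(self.entries | {(timestamp, replica_id, tweet_id)})

    def merge(self, other: "BoundedRecent") -> "BoundedRecent":
        # Merging two replicas is a set union followed by the same trim,
        # so the result does not depend on which replica merges into which.
        return self._trim(self.entries | other.entries)

    def _trim(self, entries) -> "BoundedRecent":
        # Keep only the `limit` newest entries under the total order above.
        newest = sorted(entries, reverse=True)[: self.limit]
        return BoundedRecent(self.limit, frozenset(newest))

    def latest(self) -> List[str]:
        # Newest tweet id first.
        return [tweet_id for _, _, tweet_id in sorted(self.entries, reverse=True)]


# Two replicas accept writes independently, then exchange state.
a = BoundedRecent(limit=3).add(1, "a", "t1").add(2, "a", "t2")
b = BoundedRecent(limit=3).add(2, "b", "t3").add(5, "b", "t4")
merged = a.merge(b)
print(merged.latest())  # ['t4', 't3', 't2']
```

Because the merge is "union, then keep the newest N", it is commutative, associative and idempotent, which is what lets replicas converge. The cost is that the order depends on timestamps the replicas must supply, and a replica that was partitioned for a while will simply see its stale entries fall off after merging. Whether that is the least-surprising behavior is exactly the kind of decision Sean describes.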
Re: Custom data-types
Alex,

Custom schemas allow you to only index a subset of your object's data (saving disk space). They also allow data type specification, field copying (to have full-text search across your object easily), and several other features.

The Solr documentation has more information here:
https://cwiki.apache.org/confluence/display/solr/Documents%2C+Fields%2C+and+Schema+Design

--
Luke Bakken
Engineer / CSE
lbak...@basho.com
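As a rough illustration of the three features Luke lists (indexing a subset, typing fields, and copying fields), here is a small sketch. The schema fragment is abbreviated and hypothetical: the field names (`user_s`, `created_dt`, `body_t`, `all_text`) are made up, the `<types>` section is omitted, and a real Riak Search schema would also need the Riak-specific `_yz_*` fields described in the Riak documentation. The Python around it only parses the fragment with the standard library to show what it declares; it does not talk to Riak or Solr.

```python
import xml.etree.ElementTree as ET

# Abbreviated, hypothetical Solr-style schema fragment (not a complete
# Riak Search schema: <types> and the required _yz_* fields are omitted).
SCHEMA_FRAGMENT = """
<schema name="tweets" version="1.5">
  <fields>
    <!-- Only these fields are indexed, each with an explicit type. -->
    <field name="user_s" type="string" indexed="true" stored="true"/>
    <field name="created_dt" type="date" indexed="true" stored="true"/>
    <field name="body_t" type="text_general" indexed="true" stored="false"/>
    <!-- Destination for copied text: one field to search across. -->
    <field name="all_text" type="text_general" indexed="true"
           stored="false" multiValued="true"/>
    <!-- Any other field in the object is ignored instead of indexed. -->
    <dynamicField name="*" type="ignored"/>
  </fields>
  <copyField source="user_s" dest="all_text"/>
  <copyField source="body_t" dest="all_text"/>
</schema>
"""


def summarize(schema_xml):
    """Return (indexed field names, copyField pairs, ignored patterns)."""
    root = ET.fromstring(schema_xml.strip())
    indexed = [f.get("name") for f in root.iter("field")
               if f.get("indexed") == "true"]
    copies = [(c.get("source"), c.get("dest"))
              for c in root.iter("copyField")]
    ignored = [d.get("name") for d in root.iter("dynamicField")
               if d.get("type") == "ignored"]
    return indexed, copies, ignored


indexed, copies, ignored = summarize(SCHEMA_FRAGMENT)
print("indexed:", indexed)
print("copied:", copies)
print("ignored patterns:", ignored)
```

The contrast with the default schema is the catch-all at the end: fields that are not listed explicitly are dropped rather than indexed, which is where the disk-space saving would come from.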
Re: Custom data-types
Hi Luke,

That seems useful :) I'll check the Solr documentation!

Thanks!
Alex