date:20140906

Re: Custom data-types

2014-09-06 Thread Alex De la rosa

Hi there,

Can somebody explain what custom search schemas are for? I still don't see
why I would want a custom schema when the default schema seems able to
return all the fields I have in my object.

Thanks!
Alex


On Fri, Aug 29, 2014 at 4:48 PM, Alex De la rosa alex.rosa@gmail.com
wrote:

 Hi Sean,

 It seems I was wrong; that makes total sense now that you have explained
 it. It looked like too good a feature to me, and it seems it is not that easy.

 By the way, how do schemas really work for Riak Search? I went back
 and read the documentation but didn't see a real difference from using the
 default schema.

 Thanks!
 Alex


 On Fri, Aug 29, 2014 at 3:36 PM, Sean Cribbs s...@basho.com wrote:

 Alex,

 In short, no, you can't create custom types through schemas. Schemas
 currently only refer to Riak Search 2.

 We would love that too, but it hasn't happened yet. The problem is not
 conceiving of a data type but making its behavior both sensible and
 convergent in the face of concurrent activity or network partitions.
 For instance, say that two tweets come in around the same time. Who
 goes first in the stack you described? How can multiple independent
 copies reason about which ones to drop from the bottom of the stack to
 keep it bounded to 100? What happens if a replica is separated from
 the others for a while and has really stale entries? Is it valid to
 serve those to a user? What happens when one replica pushes an element
 and another one pops it at the same time?

 These sound like they might be trivial problems, but they are
 incredibly hard to reason about in the general case. You have to
 reason about the ordering of events, the scope of their effects, and
 decide on a least-surprising behavior to expose to the user. Although
 we have given a pretty familiar/friendly interface to the data types
 shipping in 2.0, their behavior is strictly different from the types
 you would use in a single-threaded program in local memory.
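
The questions above can be made concrete with a small sketch. This is not a Riak data type and not something the thread describes as implemented; it is a hypothetical `BoundedRecentSet` in Python, assuming every tweet carries a timestamp and that ties are broken by comparing tweet ids. Merging takes the union of two replicas and keeps the newest entries, which makes the merge order-independent, but only by picking arbitrary answers to "who goes first" and "what gets dropped".

```python
from dataclasses import dataclass, field

LIMIT = 100  # bound from the thread: keep only the last 100 tweets


@dataclass(frozen=True)
class BoundedRecentSet:
    """Hypothetical state-based 'last N' container; NOT a Riak data type."""

    entries: frozenset = field(default_factory=frozenset)  # {(timestamp, tweet_id)}
    limit: int = LIMIT

    def add(self, timestamp, tweet_id):
        # Adding is a union followed by the same truncation used in merge.
        return self._truncate(self.entries | {(timestamp, tweet_id)})

    def merge(self, other):
        # Union of both replicas, then keep only the newest `limit` entries.
        return self._truncate(self.entries | other.entries)

    def _truncate(self, entries):
        # Newest first; equal timestamps are ordered by tweet_id (an assumption).
        newest = sorted(entries, reverse=True)[: self.limit]
        return BoundedRecentSet(frozenset(newest), self.limit)

    def value(self):
        return [tweet_id for _, tweet_id in sorted(self.entries, reverse=True)]


# Two replicas accept writes independently, then exchange state.
a = BoundedRecentSet(limit=2).add(10, "t1")
b = BoundedRecentSet(limit=2).add(10, "t2").add(11, "t3")
assert a.merge(b) == b.merge(a)  # same result regardless of merge order
print(a.merge(b).value())  # ['t3', 't2']
```

Note what the sketch leaves open: "t1" disappears without any replica having chosen to drop it, the ordering depends entirely on clocks that may disagree between nodes, and there is no pop or remove operation at all. Those gaps are the hard part the message above is pointing at.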

 On Thu, Aug 28, 2014 at 4:47 PM, Alex De la rosa
 alex.rosa@gmail.com wrote:
  Hi there,
 
  Correct me if I'm wrong, but I think I read somewhere that custom
  data types can be created through schemas or something like that. So,
  apart from COUNTERS, SETS and MAPS, we could have some custom-defined ones.
 
  I would love to have a STACKS data type that would work like a FIFO
  stack, so I could save the last 100 objects for some action. Imagine we
  are building Twitter, where millions of tweets are sent all the time, but
  we want to quickly know the last 100 tweets for a user. Imagine something
  like:
 
  obj.stacks['last_tweets'].add(id_of_last_tweet)
 
  IN: last_tweet --- STACK_OF_100_TWEETS --- OUT: older than the 100th goes out
 
  Is this possible? If so, how would I do it?
 
  Thanks and Best Regards,
  Alex
 
  ___
  riak-users mailing list
  riak-users@lists.basho.com
  http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
 



 --
 Sean Cribbs s...@basho.com
 Software Engineer
 Basho Technologies, Inc.
 http://basho.com/




Re: Custom data-types

2014-09-06 Thread Luke Bakken

Alex,

Custom schemas allow you to only index a subset of your object's data
(saving disk space). They also allow data type specification, field
copying (to have full-text search across your object easily), and
several other features.

The Solr documentation has more information here:

https://cwiki.apache.org/confluence/display/solr/Documents%2C+Fields%2C+and+Schema+Design
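
As a rough illustration of the features listed above (indexing only a subset of fields, declaring field types, and copying fields into one full-text field), the Python sketch below builds a Solr-style schema fragment. The schema name `tweets`, the field names (`author`, `body`, `text`) and the type names are made up for the example and assume matching `fieldType` definitions exist elsewhere. A schema actually uploaded to Riak Search also needs Riak-specific fields that are omitted here, so treat this as a sketch and check the Riak and Solr documentation before using anything like it.

```python
import xml.etree.ElementTree as ET


def build_schema_fragment():
    """Build an illustrative Solr-style schema; all names are hypothetical."""
    schema = ET.Element("schema", name="tweets", version="1.5")
    fields = ET.SubElement(schema, "fields")

    # Data type specification: each indexed field declares its type.
    ET.SubElement(fields, "field", name="author", type="string",
                  indexed="true", stored="true")
    ET.SubElement(fields, "field", name="body", type="text_general",
                  indexed="true", stored="false")

    # Catch-all field that receives copies for full-text search.
    ET.SubElement(fields, "field", name="text", type="text_general",
                  indexed="true", stored="false", multiValued="true")

    # Subset indexing: any field not named above matches this rule and is
    # routed to a type assumed to be defined as neither indexed nor stored.
    ET.SubElement(fields, "dynamicField", name="*", type="ignored")

    # Field copying: searching "text" covers both author and body.
    for source in ("author", "body"):
        ET.SubElement(schema, "copyField", source=source, dest="text")
    return schema


print(ET.tostring(build_schema_fragment(), encoding="unicode"))
```

With the default schema every field in the object would be indexed; in this sketch only `author` and `body` are, which is where the disk space saving mentioned above would come from.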

--
Luke Bakken
Engineer / CSE
lbak...@basho.com



Re: Custom data-types

2014-09-06 Thread Alex De la rosa

Hi Luke,

That seems useful :) I will check the Solr documentation!

Thanks!
Alex

