On Sun, Feb 20, 2011 at 08:37:04PM -0800, Roger Binns scratched on the wall:
> On 02/20/2011 06:23 PM, Jay A. Kreibich wrote:
> > On Sun, Feb 20, 2011 at 05:23:09PM -0800, Roger Binns scratched on the wall:
> >> If you want to use SQL then use Postfix.
> > 
> >   I might suggest PostgreSQL instead.
> >   (Sorry, Roger, I couldn't resist.)
> 
> Yeah, long night :-)  However, technically SQL over SMTP is possible and
> would actually work.  And if anyone is insane enough to try that then using
> Postfix and Postgres are a good combination.

  I was once forced to look at SOAP over SMTP.  Thankfully we were able
  to talk them out of it.  As if SOAP wasn't bad enough, I sure didn't
  want to mix it with SMTP.
  
  "Any networking protocol with 'Simple' in the name isn't."

> >> If you need lots of processes on the network to access data quickly then
> >> consider memcached.
> > 
> >   More seriously, in this category you might also consider Redis.
> >   Redis allows your data to have some structure, 
> 
> The Python binding pylibmc does structure the data for you automagically.

  Yes, but in something like memcached, the database is not aware of
  that structure, and can't take advantage of it.
  
  When storing serialized objects, it is all too common to see code that
  fetches an object, un-marshals it, alters some simple value, re-marshals
  the whole object, and then writes the whole thing back.
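
  As a rough sketch of what I mean (a Python/pylibmc example; the key
  names are just made up for illustration):

      import pylibmc

      mc = pylibmc.Client(["127.0.0.1"])   # pylibmc pickles objects automatically

      # Appending one element to a stored list ships the whole list across
      # the network twice, and nothing about it is atomic:
      friends = mc.get("user:42:friends") or []   # fetch + un-marshal everything
      friends.append("new_friend")                # change one small thing
      mc.set("user:42:friends", friends)          # re-marshal + write it all back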

  While this is fine for complex or less frequently accessed objects,
  where the ease of use outweighs the overhead, it seems like a
  lot of work for simpler items, such as lists and basic dictionaries
  or hashes.  Of course, these types of objects are also the most
  frequently modified data-types in many applications.  The best way
  to do key lists in memcached is practically a religious topic.

  Having the database be aware of a few very simple and primitive structures
  can provide significant improvements for these basic, most common
  operations.  For example, Redis can append an element to a list with
  a single O(1) database operation that is independent of list size,
  and the only payload data that crosses the network is the list key
  and the value to insert.  It has a few other tricks, like atomically
  incrementing counter values.
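
  In redis-py terms (key names again made up), the difference looks
  something like this:

      import redis

      r = redis.Redis()                  # assumes a Redis server on localhost

      # O(1) list append: only the key and the new value cross the wire
      r.rpush("user:42:friends", "new_friend")

      # atomic counter increment, with no fetch/modify/store round trip
      r.incr("page:home:hits")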

> >   plus it has the ability to persist the data to disk. 
> 
> The moment you talk about persistence you then have significant overlap with
> databases.  My personal favourite is MongoDB but there are loads of others
> such as Cassandra, HBase, Tokyo Cabinet etc.

  Yes and no.  Redis, like memcached, is essentially an always in-memory
  key/value store.  Functionally, it is much more like memcached than
  most of these other examples.  Its main selling point is memcached-like
  speed, without the "cache" aspects of memcached-- your data stays
  there until you get rid of it, even across restarts.  In many cases,
  that also allows you to get rid of your backing database.  Redis also
  provides just enough internal structure to be useful, without
  really getting in the way, in much the same way that a very basic
  container library provides building-block tools without defining a
  whole class tree.

  Those features definitely bring it closer to the domain of NoSQL
  databases, but I'd argue that's only because memcached is so far
  removed from the rest of these.  Thanks to its caching nature, it
  can't really be considered a "database" at all (useful tool, yes;
  database, not really).


  Each of these products has its place.  I don't mean to sound like
  such a Redis fan-boy, but I've been messing around with it a lot
  lately, and found it to be both extremely simple to set up and
  configure (something most of these other products cannot claim)
  and extremely useful.  You're not going to use Redis to replace
  something like Cassandra or HBase, but it is a good fit for
  situations where memcached would be a good fit, except that you want
  more structure and a known life-cycle for your data.

> What programming language are you using to implement the virtual tables?

  Like SQLite itself, I tend to do all my virtual table modules in
  extremely vanilla C.
  
  I happen to think virtual tables are one of the more powerful features
  of SQLite, but also one of the most under-utilized features.  As a
  way to relax and explore, I sometimes write virtual table modules to
  bolt together odd data stores or storage formats.  Partly this is
  just for fun, but the process also helps develop a deep understanding
  of the native data model used by these different products, and that
  knowledge is useful for other work I do.  If it opens some eyes and
  helps promote virtual tables, and SQLite in general, that's also a
  great bonus.
 
  As such, much of the virtual table development work I do is somewhat
  isolated, outside of any environment or problem context.  My approach
  tends to be extremely general-purpose, since I'm often coming at
  the problem from a very high level.

  The big factor, for me, is that working in C means the virtual table
  can be deployed in almost any environment.  It can be packaged as a
  completely self-contained module, with almost no runtime dependencies.
  C modules are also easy to compile directly into the SQLite library,
  making them usable even in environments where dynamic linking isn't
  supported.

  Since a big part of writing these is to get them out for other people
  to use, having them be easy to deploy and work almost anywhere is
  nearly as important as having them function correctly.
  So I want things to work equally well if the application language is
  Python, Java, PHP, or even something a bit further out there, like
  Erlang.  Working in C avoids adding complexity, such as making
  someone working in Java jump through hoops just to use your MongoDB
  module.  I suppose it could be done, but I wouldn't want to be the
  one trying to make it all work.
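
  For example, with a C module built as a loadable extension, using it
  from Python's standard sqlite3 wrapper is just a couple of calls.  (The
  module and table names below are made up, and some Python builds ship
  with extension loading disabled, so consider this a sketch.)

      import sqlite3

      con = sqlite3.connect("test.db")
      con.enable_load_extension(True)       # not available in every build
      con.load_extension("./vtab_module")   # the compiled C module (.so/.dll)
      con.enable_load_extension(False)

      # once the module is registered, it's all just SQL from here
      con.execute("CREATE VIRTUAL TABLE vt USING my_module('some', 'args')")
      for row in con.execute("SELECT * FROM vt LIMIT 5"):
          print(row)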

  The cost, of course, is that I tend to write a bit more code, but that's
  usually not a big deal.  My modules tend to be adapters, with little
  internal logic.  They're just not doing that much high-level data
  manipulation or coordination, so the savings that might be found in
  using a higher level language are not as significant as they might be
  if I were implementing complex algorithms or data manipulations.
  
  If I were working in a production environment, where someone was
  actually paying me to solve a specific problem in a specific
  environment, I wouldn't hesitate to choose a different language.
  Overall, I really like the idea of being able to do virtual tables in
  a scripting language, with higher-level manipulations and massive
  library support.  I just have a different set of requirements and
  motivations for most of my work.

    -j

-- 
Jay A. Kreibich < J A Y  @  K R E I B I.C H >

"Intelligence is like underwear: it is important that you have it,
 but showing it to the wrong people has the tendency to make them
 feel uncomfortable." -- Angela Johnson