Re: Local database recommendation?

gaz jones Mon, 27 May 2013 08:23:34 -0700

SQLite is worth a look. I have never used it with the JVM, but I assume there
is a JDBC driver for it.



On Mon, May 27, 2013 at 1:01 AM, Zack Maril <[email protected]> wrote:

> Use Postgres. If it makes sense later on, then try a NoSQL solution. Until
> then, Postgres will probably do 95% of what you want out of the box.
> -Zack
>
>
> On Sunday, May 26, 2013 6:20:02 PM UTC-4, Amirouche Boubekki wrote:
>>
>>
>>>> 1) Is it structured, i.e. can an object have several fields, possibly
>>>> complex fields like lists or hashmaps, but also integers? Dates and UUIDs
>>>> can be emulated with strings and integers.
>>>> 2) Do objects have relations? A lot of relations?
>>>> 3) Is the data schema fixed at compilation, or do you need the schema to
>>>> be dynamic?
>>>>
>>>
>>> Much of the data is conditional in a certain sense -- if it's an X, it's
>>> also a Y and it may be a W or a Z as well, but if it's a G it's certainly
>>> not a W, etc.; though simply storing a large number of boolean columns that
>>> may be unused by many of the table rows would be acceptable.
>>>
>>> The thing that makes me slightly dubious about relational here is that
>>> there will necessarily either be many columns unused by many rows, as
>>> there's a lot of data that's N/A unless certain other conditions are met;
>>> or else there will be many whole tables and a lot of expensive joins, as we
>>> have a table of Foos, with an isBar? column with either a BarID or a null,
>>> and a table of Bars with an isBaz? column, and a table of Bazzes with an
>>> isQuux? column, and then a need to do joins on *all* of those tables to run
>>> a query over a subset of Quuxes and have access to some Foo table columns
>>> in the results.
>>>
>>> This sort of thing points towards an object database more than any other
>>> sort, with inherited fields from superclasses, or a map database that
>>> performs well with lots of null/missing keys in most of the maps. But maybe
>>> a relational DB table with very many columns but relatively few used by any
>>> given row would perform OK.
>>>
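
A minimal sketch of the two layouts described above, assuming Python's
built-in sqlite3 module and an in-memory database rather than a JVM driver.
All table and column names (foo_wide, foo, bar, baz, quux, label, weight) are
hypothetical. One difference from the description: the child rows here point
at their parent instead of the Foo row holding a BarID, which needs the same
number of joins. It shows only the shape of the two queries and says nothing
about their relative speed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only

# Layout 1: one wide table; columns that do not apply to a row stay NULL.
conn.execute("""CREATE TABLE foo_wide (
    id INTEGER PRIMARY KEY, label TEXT,
    is_bar INTEGER, is_baz INTEGER, is_quux INTEGER, weight REAL)""")
conn.executemany(
    "INSERT INTO foo_wide (id, label, is_bar, is_baz, is_quux, weight) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    [(1, "a", 1, 1, 1, 2.5),
     (2, "b", 1, None, None, None),
     (3, "c", None, None, None, None)])

# Layout 2: one table per subtype, each row pointing at its parent row.
conn.executescript("""
CREATE TABLE foo  (id INTEGER PRIMARY KEY, label TEXT);
CREATE TABLE bar  (id INTEGER PRIMARY KEY, foo_id INTEGER REFERENCES foo(id));
CREATE TABLE baz  (id INTEGER PRIMARY KEY, bar_id INTEGER REFERENCES bar(id));
CREATE TABLE quux (id INTEGER PRIMARY KEY, baz_id INTEGER REFERENCES baz(id),
                   weight REAL);
INSERT INTO foo  VALUES (1, 'a'), (2, 'b'), (3, 'c');
INSERT INTO bar  VALUES (10, 1), (11, 2);
INSERT INTO baz  VALUES (20, 10);
INSERT INTO quux VALUES (30, 20, 2.5);
""")

# Wide table: a single-table filter, no joins.
wide_rows = conn.execute(
    "SELECT label, weight FROM foo_wide "
    "WHERE is_quux = 1 AND weight > 1.0").fetchall()

# Subtype tables: three joins to get a Foo column next to a Quux column.
joined_rows = conn.execute("""
    SELECT foo.label, quux.weight
    FROM quux
    JOIN baz ON quux.baz_id = baz.id
    JOIN bar ON baz.bar_id = bar.id
    JOIN foo ON bar.foo_id = foo.id
    WHERE quux.weight > 1.0""").fetchall()
```

Both queries return the same rows here. Whether the sparse table or the join
chain performs better at scale would have to be measured on real data.
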
>>
>> The only kind of object database I know of that does ACID across documents
>> on the JVM is TinkerPop's Blueprints. Blueprints is an abstraction layer
>> on top of many graph databases, among them Neo4j and OrientDB. The
>> difference between a graph database and an object database is that
>> «pointers» in a graph database are known at both ends. If you don't know
>> graphs you will need to learn a bit about them. Basically, if A is
>> connected to B, B also knows about A being connected to it, which is not
>> the case with a pointer. In other words, as in a relational database, you
>> can ask for «all things connected to B» or «all things B connects to». The
>> same query in an object database will cost more. On top of that it's
>> schemaless, like an object database, but there is no notion of class
>> similar to what is found in OO programming (even if you can model the graph
>> to have the concept of classes).
>>
>>
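
To illustrate the "known at both ends" point, here is a toy sketch in plain
Python. It is not the Blueprints API; TinyGraph, connect, connects_to,
connected_to and referrers_by_scan are all hypothetical names. The graph
indexes each edge under both endpoints, while the plain-pointer version has
to scan every object to find out who points at B.

```python
from collections import defaultdict


class TinyGraph:
    """Toy directed graph that records every edge at both of its ends."""

    def __init__(self):
        self.outgoing = defaultdict(set)  # node -> nodes it points to
        self.incoming = defaultdict(set)  # node -> nodes pointing at it

    def connect(self, src, dst):
        # Store the edge under both endpoints so either side can find it.
        self.outgoing[src].add(dst)
        self.incoming[dst].add(src)

    def connects_to(self, node):
        """All things `node` connects to."""
        return set(self.outgoing.get(node, ()))

    def connected_to(self, node):
        """All things connected to `node`: a direct lookup, no scan."""
        return set(self.incoming.get(node, ()))


def referrers_by_scan(pointers, target):
    """With one-way pointers only, every object must be inspected."""
    return {src for src, dsts in pointers.items() if target in dsts}


g = TinyGraph()
g.connect("A", "B")
g.connect("C", "B")
g.connect("B", "D")
```

With this data, g.connected_to("B") is a single dictionary lookup, while
referrers_by_scan(dict(g.outgoing), "B") finds the same nodes by visiting
every entry.
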
>>>
>>>>> The DB must be able to grow larger than available RAM without crashing
>>>>> the JVM, and the seqs resulting from queries like the above will also need
>>>>> to be able to get bigger than RAM.
>>>>>
>>>>
>>>>
>>>>> My own research suggests that H2 may be a good choice, but it's a
>>>>> standard SQL/relational DB and I'm not 100% sure that fits well with the
>>>>> type of data and querying noted above. Note though that not all querying
>>>>> will take that form; there'll also be strings, uuids, dates, and other
>>>>> such field types and the need to query on these and to join on some of
>>>>> them; also, to do less-than comparisons on dates.
>>>>>
>>>>
>>>> Depending on your speed needs and the speed of the database, a kv store
>>>> can be enough: you serialize the data as strings and deserialize it when
>>>> you need to do computation. However, kv stores are not easy to deal with
>>>> when you have complex queries, but again it depends on the query.
>>>>
>>>
>>> I expect they'd also have problems with transactional integrity if, say,
>>> there was a power cut during an update. Anything involving "serialize the
>>> data as strings" sounds unsuited to either the volume I'm envisioning or
>>> the need for consistency. It certainly wouldn't do to overwrite the file
>>> with half of an updated version of itself and then lose power! Keeping the
>>> previous version around as a .bak file is scarcely much better. It pretty
>>> much needs to be ACID since there will need to be coordinated changes to
>>> more than one bit of the data sometimes and having an update interrupted
>>> with only half the changes done, and having it stay in that half-done
>>> state, would potentially be disastrous.
>>>
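
A minimal sketch of the all-or-nothing behaviour asked for above, assuming
Python's built-in sqlite3 module. The account table and the transfer function
are hypothetical, and the raised exception only simulates an interruption
between two coordinated updates. An in-memory database cannot demonstrate
durability across a real power cut; that depends on the on-disk engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("a", 100), ("b", 0)])
conn.commit()


def transfer(conn, src, dst, amount, fail_midway=False):
    """Move `amount` from src to dst as a single transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src))
            if fail_midway:
                raise RuntimeError("simulated failure between the two updates")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst))
    except RuntimeError:
        pass  # the first UPDATE was rolled back with the rest of the transaction


transfer(conn, "a", "b", 30, fail_midway=True)
balances_after_failure = dict(conn.execute("SELECT name, balance FROM account"))

transfer(conn, "a", "b", 30)
balances_after_success = dict(conn.execute("SELECT name, balance FROM account"))
```

After the interrupted transfer the balances are unchanged; only the completed
transfer moves both rows together.
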
>>
>> At least UnQLite is an embeddable kv store that is ACID across several
>> keys, so you won't have data cut in half (based on what is advertised). I
>> think Berkeley DB is also transactional.
>>
>> Also, I'm only interested in open-source software, so there might be
>> proprietary software that solves your problem better, but I doubt that ;)
>>
>  --
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to [email protected]
> Note that posts from new members are moderated - please be patient with
> your first post.
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
> ---
> You received this message because you are subscribed to the Google Groups
> "Clojure" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/groups/opt_out.
>
>
>
