bottom post
On 19/04/2014 12:46, R. Pasch wrote:
On 19-4-2014 9:38, Robin wrote:
Well, given that there are known limited attributes, this is the type
of application that really really suits a column oriented database,
such as Sybase IQ (now sold by SAP). Its a neat product that scales.
Great performance with drag'n'drop analytics.
Unless you can charm IQ out of SAP (it has been known to happen), you
might have to look at some other techniques
So consider some binary data representation
Red - 1 (0000 0001)
Orange - 2 (0000 0010)
Yellow - 4 (0000 0100)
Green - 8 (0000 1000)
Blue - 16 (0001 0000)
Indigo - 32 (0010 0000)
Violet - 64 (0100 0000)
This way, you can encode several colours in 1 value
Red or Green or Indigo = 1 + 8 + 32 = 41 = 0010 1001
Robin
I stopped reading when I heard the word "sold by SAP" ;-) This project
is solely build with open-source and freely available software.
I've been thinking about using a binary data representation but didn't
come to a solution to this specific problem quite yet. Per property of
a product, only one bit would be 1 and the rest would be 0. What would
a query look like to match all products that have a bit in the correct
position?
Say for instance these are a couple records (and yes, property values
can be null as well)
title, property1, property2, property3
================================
product1, 0000 0001, 0000 0010, NULL
product2, 0000 0100, 0100 0000, 0010 0000
product3, 0010 0000, 0010 0000, 0100 0000
Say that I would like to retrieve the products that either have
property1 as 0010 0000, 1000 000 or 0000 0001. Combined that would be
0010 1001 and would have to match product1 and product3 as they both
have their individual bit matching one of the bits being asked for.
What would a where statement look like using this type of binary
representation?
If that would be fairly simple to do and fast (most important factor)
then I could do an OR construction on all property columns and have
something count the amount of properties that actually matched. Is
that something you can do with a binary operator of some sort as well?
Count the amount of overlapping bits?
Say for instance I have a binary value of 0110 0101 and another binary
value of 1100 0100, how could I found out how many bits matched? (in
this case the number of matching bits would be 2)
I understand the reluctance to pay SAP-style rates, as a longtime DB
user, I have learned some 'charm' techniques.
However, I poked around a bit for alternatives, as I do like the
column-oriented approach, and found something called - /MonetDB /
<http://www.monetdb.com/Home/Features>- it apparently has a column-store
db kernel, and is open source - I suggest you have a look, if it does
what it says on the label, then it looks like a find.
There is a discussion of bitmask-trickiness here also dealing with
colours
<http://postgresql.1045698.n5.nabble.com/Bitmask-trickiness-td1865258.html>
Robin
<http://postgresql.1045698.n5.nabble.com/Bitmask-trickiness-td1865258.html>