Hi, Davey,

The "IN" functionality is already there.  It is for a field
represented by a (exclusive) radiobox.  The values could be either
numerical or text.  You can give it a try now.

The operations on bit field through bitwise logical operators '&' and
'|' is currently implemented.  For integer values such as IP
addresses, it is possible to say something like the following in the
arithmetic expressions

"ip & 0x1001 == 0x1001"

Please let us know if this is not working as it is expected.

As for the set valued attributes, since a number of different users
have mentioned it, we should be developing one some time soon.
However, this might be something in 6 months or a year..

John


On 7/9/12 12:52 AM, Lidawei (Davey) wrote:
> Hi John,
> 
> I want to separate your suggestion in two.
> 1. Realize Set-Value data type.
> 2. Realize HAS function.
> 
> Unfortunately, I have no idea about the research papers about searching in 
> Set-Valued attribute. It's too difficult for me ;-)
> 
> But you really give me a big hint. I think we should realize "IN" function 
> firstly rather than "HAS".
> 
> IN function is useful for some multi-value searching scenario, which can 
> significant reduce the recursion.
> For example, Imaging a database which is an automobile db. (This is a 
> different example, not user's preference about automobile, since user can 
> love both red and green. While a automobile model can only in one certain 
> color.)
> Then if I want to search my favorite automobile, at least I can replace 
> "(color=='red' OR color=='green' OR color=='blue') AND ..." 
> With
> "color IN('red', 'green', 'blue'......) AND ......"
> 
> This approach is helpful for make the condition clause support more 
> conditions.
> 
> But to achieve this goal, I still have some more questions:
> 1. IN function has variable amount of terms. I have study the code of 
> ibis::qExpr. I think it's easy to add a two terms function or three terms 
> function. But how to add a variable terms function?
> 2. All operators are hard code in qExpr. Even to add a two or three terms 
> function, the change will be everywhere of qExpr class. Is there any plug-in 
> framework to simplify the job? I think you should provide some plug-in API 
> for this.
> 3.IN function can only use for the attributes that not support composite 
> values. For example "red" and "green" are mutually exclusive. While "Sport" 
> and "Sunroof" is combinable. So your original suggestion still need to 
> considered.But to solve the problem, I think Fastbit need not realize a new 
> set-value searching arithmetic, since bitmap is already a flag of grouping. 
> Explain in the following.
> 
> In order to store a group of checkbox value in arbitrary combine, each 
> checkbox need a bit to store it's state. That means a set-value attribute for 
> all these checkbox is equivalent to a set of one bit value attributes.
> My idea is either realize a "Bit Stream" data type, which we can store 
> arbitrary bits, each stands for a checkbox, or realize a "Single 
> bit"(Boolean) data type, which stands for only one checkbox.
> 
> For "Bit stream" approach we also need provide operator to manipulate each 
> bit like "& | !" in C express.
> Then If I want to search find out my favorite automobile in fastbit database, 
> the condition clause can be written like this: "ap&5>0".
> Before generate this condition clause, Some preparation need to be make:
> Define binary value "0100" stands for "sports". Define binary value "0001" 
> stands for "sunroof".
> While Binary value "0101" can be conclude to stands for "sports and sunroof", 
> which is 5 in decimal.
> So "ap&5>0"means I care about Sports and sunroof, but no matter in other 
> characters.
> 
> "Bit stream" solution can reduce the usage of logic connector in express 
> since bit flag composite has already been calculated before searching. So it 
> can reduce recursion compare with "Single bit" solution.
> But the shortcoming of "Bit stream" solution is not intuitionist.
> 
> In my opinion:
> * Support "Single bit" data value and "eliminate recursion" is the best 
> choice.
> * Support "Bit Stream" data value and "bit operator & | !" is good.
> * Support "Bit operator" only but not support "Bit Stream" data value is also 
> acceptable. That means we can only use LONG as a 64 bit stream at most. but 
> much easy to realize.
> 
> Cheers,
> Davey
> 
> -----Original Message-----
> From: K. John Wu [mailto:[email protected]] 
> Sent: Friday, July 06, 2012 11:52 PM
> To: FastBit Users
> Cc: Lidawei (Davey)
> Subject: Re: [FastBit-users] Is it possible to eliminate the rescursion of 
> ibis::qExpr::simplify
> 
> Hi, Davey,
> 
> Thanks for the suggestions.  It might be hard to change the structure
> of ibis::qExpr::simply.  A systematic way of addressing the problem
> might be to develop a new expression to capture the some list of
> preferences.
> What you called booleans might actually be a representation of a set
> of check boxes, right?  If this is the case, then there is another
> term to describe the data, set-valued attribute.  A set-value
> attribute can be recorded as a single column in a data table, where
> the value at each row is a set taken from a list of predefined
> choices.  I have found a number of different research papers on
> indexing and querying set-valued attributes.  You probably can build
> an extension to FastBit to deal with set-valued attributes without too
> much trouble.
> 
> After implement an index for set-valued attributes, you can extend the
> query syntax to support querying over a long list of choices.  For
> example, if you have a group of checkboxes named "automobile
> preference" (ap for short), where the choices might including
> something like "sports", "race", "sedan", "wagon", "convertible",
> "sunroof", "green", "red", "white", "4-door", "2-door"..  You might
> invent a new keyword to express "finding all person who prefer red
> sports convertible that is not white" as follows
> 
> ap has ("sports", "red", "convertible") and NOT ap has ("white")
> 
> In between the parentheses, you can put in a long list, however,
> ibis::qExpre::simply will only see on single expression.  The keyword
> "HAS" functions somewhat like the keyword "IN" except that the
> properties given in the list must all be true (or present in the set
> of values).
> 
> Hope this helps.
> 
> John
> 
> 
> 
> 
> On 7/6/12 2:45 AM, Lidawei (Davey) wrote:
>> Hi authors of fastbit,
>>
>>  
>>
>> This is my first post. Thanks for your wonderful job!
>>
>>  
>>
>> Fastbit do help me solved a performance problem which need select
>> users from a huge user database with so many characters quickly.
>>
>> After that, I still have some idea to make Fastbit better. My topic is
>> about the complex condition support.
>>
>>  
>>
>> In our use case, we need select from billions of users which have
>> thousands of characters (chocolate lover for example), and we use a
>> very long condition clause (more than 1000 and/or items). All
>> character is a boolean value with yes/no
>>
>>  
>>
>> In my first attempt of fastbit, I found that unfortunately the
>> condition can't support more than 200 and/or items.
>>
>> After further study, I found that the reason is Stack overflow at
>> ibis::qExpr::simplify.
>>
>> This function is a recursion function. A very long condition make too
>> deep recursion which cause the stack exhaust.
>>
>>  
>>
>> Then I expand the stack size to 20M bytes. That make my program
>> support 10 thousands or/and items.
>>
>> The approach solved my problem but not elegant. The best solution is
>> eliminate the recursion of ibis::qExpr::simplify.
>>
>> I think It's a realizable. The refactoring can significant expand the
>> limit of Fastbit which make Fastbit more useful.
>>
>>  
>>
>> Can you consider this idea?
>>
>>  
>>
>> Another pity is no BOOLEAN data type support. Would you consider to
>> support it in future version?
>>
>>  
>>
>> cheers,
>>
>> Davey
>>
>>  
>>
>>
>>
>> _______________________________________________
>> FastBit-users mailing list
>> [email protected]
>> https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
>>
> 

_______________________________________________
FastBit-users mailing list
[email protected]
https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users

Reply via email to