Yes, I was also considering this as my second option.
Before this I was thinking of denormalizing the table based on the search columns,
but because of partial search that would not be very effective.

Now suppose we go with this single videos table and implement search with
Solr/Lucene: do we then also need to care about num_tokens?


On Mon, Jun 12, 2017 at 6:27 PM, Eduardo Alonso <eduardoalo...@stratio.com>
wrote:

> Using Cassandra collections:
>
> CREATE TABLE videos (
> videoid uuid primary key,
> title text,
> actor list<text>,
> producer list<text>,
> release_date timestamp,
> description text,
> music text,
> etc...
> );
>
> When using collections you need to take care of their length. Collections
> are designed to store only a small amount of data
> <http://docs.datastax.com/en/cql/3.1/cql/cql_using/use_collections_c.html>.
> 5 to 10 actors per movie is OK.
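>
> As a minimal sketch of how the list columns above could be written and
> queried (the uuid, the names, and the index name are made-up illustrations,
> and the CONTAINS query assumes a secondary index on the collection):
>
> ```cql
> -- Hypothetical row; the uuid and names are illustrative only.
> INSERT INTO videos (videoid, title, actor, producer)
> VALUES (123e4567-e89b-12d3-a456-426614174000, 'Example title',
>         ['Actor One', 'Actor Two'], ['Producer One']);
>
> -- Append one more actor to the existing list.
> UPDATE videos SET actor = actor + ['Actor Three']
> WHERE videoid = 123e4567-e89b-12d3-a456-426614174000;
>
> -- Matching a list element needs an index on the collection column.
> CREATE INDEX videos_actor_idx ON videos (actor);
> SELECT videoid, title FROM videos WHERE actor CONTAINS 'Actor One';
> ```
>
> Note that CONTAINS on a collection matches a whole element, so it does not
> by itself give partial (substring) search.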
>
>
> Eduardo Alonso
> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
> 28224 Pozuelo de Alarcón, Madrid
> Tel: +34 91 828 6473 // www.stratio.com // @stratiobd
> <https://twitter.com/StratioBD>
>
> 2017-06-12 11:54 GMT+02:00 @Nandan@ <nandanpriyadarshi...@gmail.com>:
>
>> So, in short, we have to go with a single videos table with videoid uuid
>> as the primary key.
>> But then how can we handle multiple actor names and producer names?
>>
>> On Mon, Jun 12, 2017 at 5:51 PM, Eduardo Alonso <
>> eduardoalo...@stratio.com> wrote:
>>
>>> Yes, you are right.
>>>
>>> Table denormalization is useful only when you have unique primary keys,
>>> which is not your case.
>>> Denormalized tables differ only in their primary key; every
>>> denormalized table contains all the data (it just changes how the data is
>>> structured). So, if you need to index it, do it with just one table (the
>>> one you showed us, with videoid as the primary key, is OK).
>>>
>>> Solr, Elasticsearch and cassandra-lucene-index are all based on Lucene, and
>>> all of them fulfill your needs.
>>>
>>> Solr (in DSE) and cassandra-lucene-index
>>> <https://github.com/stratio/cassandra-lucene-index> are very well
>>> integrated with Cassandra through its secondary index interface. If you
>>> choose Elasticsearch you will need to code the integration yourself (write
>>> mutex, synchronization of both clusters; imagine something that was written
>>> to Cassandra but failed to write to Elasticsearch).
>>>
>>> I know I am not the most suitable person to recommend our own product,
>>> cassandra-lucene-index
>>> <https://github.com/stratio/cassandra-lucene-index>, but it is open
>>> source, so just take a look.
>>>
>>> Eduardo Alonso
>>>
>>> 2017-06-12 11:18 GMT+02:00 @Nandan@ <nandanpriyadarshi...@gmail.com>:
>>>
>>>> Hi Eduardo,
>>>>
>>>> We are trying to build advanced search functionality in which
>>>> we can do partial search on the actor, producer, director, etc.
>>>> columns.
>>>> If we denormalize the tables, then we have to create tables
>>>> such as the following:
>>>> video_by_actor
>>>> video_by_producer
>>>> video_by_director
>>>> video_by_date
>>>> etc.
>>>> With denormalized tables, Cassandra only allows equality search,
>>>> so to implement partial search we would need to set up Solr on all of the
>>>> above tables.
>>>>
>>>> This is my thinking, but I do not think implementing Apache Solr on
>>>> all tables is the correct way.
>>>>
>>>> On Mon, Jun 12, 2017 at 5:11 PM, @Nandan@ <
>>>> nandanpriyadarshi...@gmail.com> wrote:
>>>>
>>>>> Hi Eduardo,
>>>>>
>>>>> As you mentioned queries 1-6,
>>>>> in this case we have to proceed with a table like the one below:
>>>>> create table videos (
>>>>> videoid uuid primary key,
>>>>> title text,
>>>>> actor text,
>>>>> producer text,
>>>>> release_date timestamp,
>>>>> description text,
>>>>> music text,
>>>>> etc...
>>>>> );
>>>>> This table will store video data keyed by the PK videoid, and the uuid
>>>>> guarantees uniqueness.
>>>>> But as we know, one movie has multiple actors, multiple producers, and
>>>>> multiple music contributors, so how can we store all of these? The only
>>>>> option left seems to be collection-type columns.
>>>>>
>>>>>
>>>>> On Mon, Jun 12, 2017 at 4:59 PM, Eduardo Alonso <
>>>>> eduardoalo...@stratio.com> wrote:
>>>>>
>>>>>> Correction: where I wrote TLDR I meant PS.
>>>>>>
>>>>>> Eduardo Alonso
>>>>>>
>>>>>> 2017-06-12 10:58 GMT+02:00 Eduardo Alonso <eduardoalo...@stratio.com>
>>>>>> :
>>>>>>
>>>>>>> Hi Nandan:
>>>>>>>
>>>>>>> So, your system must provide these queries:
>>>>>>>
>>>>>>> 1 - SELECT video FROM ... WHERE actor = '...';
>>>>>>> 2 - SELECT video FROM ... WHERE producer = '...';
>>>>>>> 3 - SELECT video FROM ... WHERE music = '...';
>>>>>>> 4 - SELECT video FROM ... WHERE actor = '...' AND producer ='...';
>>>>>>> 5 - SELECT video FROM ... WHERE actor = '...' AND music = '...';
>>>>>>> 6 - SELECT video FROM ... WHERE title CONTAINS 'Harry';
>>>>>>>
>>>>>>>
>>>>>>> Queries 1-5 can be served with just Cassandra, by denormalizing
>>>>>>> tables the way you mentioned but without Solr (only for
>>>>>>> equality clauses, though):
>>>>>>>
>>>>>>> video_by_actor;
>>>>>>> video_by_producer;
>>>>>>> video_by_music;
>>>>>>> video_by_actor_and_producer;
>>>>>>> video_by_actor_and_music;
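>>>>>>>
>>>>>>> As a rough sketch, one of these denormalized tables could look like
>>>>>>> the following; the columns and the clustering key are assumptions
>>>>>>> chosen for illustration:
>>>>>>>
>>>>>>> ```cql
>>>>>>> -- Hypothetical layout: one partition per actor, one row per video.
>>>>>>> CREATE TABLE video_by_actor (
>>>>>>>     actor text,
>>>>>>>     videoid uuid,
>>>>>>>     title text,
>>>>>>>     release_date timestamp,
>>>>>>>     PRIMARY KEY ((actor), videoid)
>>>>>>> );
>>>>>>>
>>>>>>> -- Equality lookup on the partition key (query 1).
>>>>>>> SELECT videoid, title FROM video_by_actor WHERE actor = 'Some Actor';
>>>>>>> ```
>>>>>>>
>>>>>>> Each video is written once per actor, so the application has to keep
>>>>>>> all of these tables in step on every insert or update.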
>>>>>>>
>>>>>>> For query number 6 you need a search engine:
>>>>>>>
>>>>>>> Solr
>>>>>>> ElasticSearch
>>>>>>> cassandra-lucene-index
>>>>>>> <https://github.com/stratio/cassandra-lucene-index>
>>>>>>> SASI
>>>>>>> <http://docs.datastax.com/en/dse/5.1/cql/cql/cql_reference/cql_commands/cqlCreateCustomIndex.html>
>>>>>>>
>>>>>>> I think that, just for your query, the easiest way is to build
>>>>>>> a SASI index.
>>>>>>> TLDR: I work for Stratio on cassandra-lucene-index, but for your
>>>>>>> basic query (only one dimension) SASI indexes will work for you.
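>>>>>>>
>>>>>>> A minimal sketch of such a SASI index, assuming Cassandra 3.4 or
>>>>>>> later and a videos table with a text column named title (the table
>>>>>>> and index names are illustrative):
>>>>>>>
>>>>>>> ```cql
>>>>>>> -- CONTAINS mode lets LIKE match a substring anywhere in the value.
>>>>>>> CREATE CUSTOM INDEX videos_title_sasi ON videos (title)
>>>>>>> USING 'org.apache.cassandra.index.sasi.SASIIndex'
>>>>>>> WITH OPTIONS = {
>>>>>>>     'mode': 'CONTAINS',
>>>>>>>     'analyzer_class': 'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer',
>>>>>>>     'case_sensitive': 'false'
>>>>>>> };
>>>>>>>
>>>>>>> -- Query 6: titles containing "Harry" at any position.
>>>>>>> SELECT videoid, title FROM videos WHERE title LIKE '%Harry%';
>>>>>>> ```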
>>>>>>>
>>>>>>> Eduardo Alonso
>>>>>>>
>>>>>>> 2017-06-12 9:50 GMT+02:00 @Nandan@ <nandanpriyadarshi...@gmail.com>:
>>>>>>>
>>>>>>>> But the condition is that I am working with the Apache Cassandra
>>>>>>>> database: I have to store my data in Cassandra and then implement
>>>>>>>> partial search capability.
>>>>>>>> If we only need to search on the full primary key, then it is
>>>>>>>> really easy to work with Cassandra, but in the case of flexible
>>>>>>>> search I am getting confused.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Jun 12, 2017 at 3:47 PM, Oskar Kjellin <
>>>>>>>> oskar.kjel...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> I haven't run Solr with Cassandra myself. I just meant running
>>>>>>>>> Elasticsearch as a completely separate service and writing there as
>>>>>>>>> well.
>>>>>>>>>
>>>>>>>>> On 12 Jun 2017, at 09:45, @Nandan@ <nandanpriyadarshi...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Do you mean using Elasticsearch with Cassandra?
>>>>>>>>> I am also thinking of using Apache Solr with Cassandra.
>>>>>>>>> In that case I have to:
>>>>>>>>> 1) create separate tables such as video_by_title, video_by_actor,
>>>>>>>>> video_by_year, etc.;
>>>>>>>>> 2) after creating the tables, configure a Solr core on all of the
>>>>>>>>> tables.
>>>>>>>>>
>>>>>>>>> Is it like this?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Jun 12, 2017 at 3:19 PM, Oskar Kjellin <
>>>>>>>>> oskar.kjel...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Why not elasticsearch for this use case? It will make your life
>>>>>>>>>> much simpler
>>>>>>>>>>
>>>>>>>>>> > On 12 Jun 2017, at 04:40, @Nandan@ <nandanpriyadarshi...@gmail.com> wrote:
>>>>>>>>>> >
>>>>>>>>>> > Hi,
>>>>>>>>>> >
>>>>>>>>>> > Currently, I am working on data modeling for a video company in
>>>>>>>>>> > which we have different types of users as well as different user
>>>>>>>>>> > functionality.
>>>>>>>>>> > Right now my concern is the video search module, which searches on
>>>>>>>>>> > different fields.
>>>>>>>>>> >
>>>>>>>>>> > The query patterns are as follows:
>>>>>>>>>> > 1) Select video by actor.
>>>>>>>>>> > 2) Select video by producer.
>>>>>>>>>> > 3) Select video by music.
>>>>>>>>>> > 4) Select video by actor and producer.
>>>>>>>>>> > 5) Select video by actor and music.
>>>>>>>>>> >
>>>>>>>>>> > Note: In short, we want to build an advanced search module
>>>>>>>>>> > with which we can search on any field and get the desired results.
>>>>>>>>>> >
>>>>>>>>>> > During a search we also need partial matching, so that if a
>>>>>>>>>> > user searches for the title "Harry", we can return all
>>>>>>>>>> > videos whose title contains "Harry" at any position.
>>>>>>>>>> >
>>>>>>>>>> > My idea is to create separate tables such as
>>>>>>>>>> > video_by_actor, video_by_producer, etc. and implement Solr queries on
>>>>>>>>>> > all of them. Otherwise, is there any other way to implement this
>>>>>>>>>> > search module effectively?
>>>>>>>>>> >
>>>>>>>>>> > Please suggest.
>>>>>>>>>> >
>>>>>>>>>> > Best regards,
>>>>>>>>>>
