Yes this is only Option I am also thinking like this as my second options. Before this I was thinking to do denormalize table based on search columns, but due to partial search this will be not that effective.
Now suppose , if we are going with this single table as videos. and implemented with Solr/Lucene, then need to also care about num_tokens ? On Mon, Jun 12, 2017 at 6:27 PM, Eduardo Alonso <eduardoalo...@stratio.com> wrote: > Using cassandra collections > > CREATE TABLE videos ( > videoid uuid primary key, > title text, > actor list<text>, > producer list<text>, > release_date timestamp, > description text, > music text, > etc... > ); > > When using collection you need to take care of its length. Collections > are designed to store > <http://docs.datastax.com/en/cql/3.1/cql/cql_using/use_collections_c.html>only > a small amount of data > <http://docs.datastax.com/en/cql/3.1/cql/cql_using/use_collections_c.html> > . > 5/10 actors per movie is ok. > > > Eduardo Alonso > Vía de las dos Castillas, 33, Ática 4, 3ª Planta > 28224 Pozuelo de Alarcón, Madrid > Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com // > *@stratiobd > <https://twitter.com/StratioBD>* > > 2017-06-12 11:54 GMT+02:00 @Nandan@ <nandanpriyadarshi...@gmail.com>: > >> So In short we have to go with one single table as videos and put primary >> key as videoid uuid. >> But then how can we able to handle multiple actor name and producer name. >> ? >> >> On Mon, Jun 12, 2017 at 5:51 PM, Eduardo Alonso < >> eduardoalo...@stratio.com> wrote: >> >>> Yes, you are right. >>> >>> Table denormalization is useful just when you have unique primary keys, >>> not your case. >>> Denormalized tables are only different in its primary key, every >>> denormalized table contains all the data (it just change how it is >>> structured). So, if you need to index it, do it with just one table (the >>> one you showed us with videoid as the primary key is ok). >>> >>> Solr, Elastic and cassandra-lucene-index are both based on Lucene and >>> all of them fulfill all your needs. >>> >>> Solr (in DSE) and cassandra-lucene-index >>> <https://github.com/stratio/cassandra-lucene-index> are very well >>> integrated with cassandra using its secondary index interface. If you >>> choose elastic search you will need to code the integration (write mutex, >>> both cluster synchronization (imagine something written in cassandra but >>> failed to write in elastic)) >>> >>> I know i am not the most suitable to recommend you to use our product >>> cassandra-lucene-index >>> <https://github.com/stratio/cassandra-lucene-index> but it is open >>> source, just take a look. >>> >>> Eduardo Alonso >>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta >>> 28224 Pozuelo de Alarcón, Madrid >>> Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com // >>> *@stratiobd >>> <https://twitter.com/StratioBD>* >>> >>> 2017-06-12 11:18 GMT+02:00 @Nandan@ <nandanpriyadarshi...@gmail.com>: >>> >>>> Hi Eduardo, >>>> >>>> And As we are trying to build an advanced search functionality in which >>>> we can able to do partial search based on actor, producer, director, etc. >>>> columns. >>>> So if we do denormalization of tables then we have to create tables >>>> such as below :- >>>> video_by_actor >>>> video_by_producer >>>> video_by_director >>>> video_by_date >>>> etc.. >>>> By using denormalized, Cassandra only allows us to do equality search, >>>> but for implementing Partial search we need to implement solr on all above >>>> tables. >>>> >>>> This is my thinking, but I think this will be not correct way to >>>> implement Apache Solr on all tables. >>>> >>>> On Mon, Jun 12, 2017 at 5:11 PM, @Nandan@ < >>>> nandanpriyadarshi...@gmail.com> wrote: >>>> >>>>> Hi Edurado, >>>>> >>>>> As you mentioned queries 1-6 , >>>>> In this condition, we have to proceed with a table like as below :- >>>>> create table videos ( >>>>> videoid uuid primary key, >>>>> title text, >>>>> actor text, >>>>> producer text, >>>>> release_date timestamp, >>>>> description text, >>>>> music text, >>>>> etc... >>>>> ); >>>>> This table will help to store video datas based on PK videoid and will >>>>> give uniqeness due to uuid. >>>>> But as we know , in one movie there are multiple actor, multiple >>>>> producer, multiple music worked, So how can we store all these.. Only one >>>>> option will left as to use collection type columns. >>>>> >>>>> >>>>> On Mon, Jun 12, 2017 at 4:59 PM, Eduardo Alonso < >>>>> eduardoalo...@stratio.com> wrote: >>>>> >>>>>> TLDR shouldBe *PD >>>>>> >>>>>> Eduardo Alonso >>>>>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta >>>>>> 28224 Pozuelo de Alarcón, Madrid >>>>>> Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com >>>>>> // *@stratiobd <https://twitter.com/StratioBD>* >>>>>> >>>>>> 2017-06-12 10:58 GMT+02:00 Eduardo Alonso <eduardoalo...@stratio.com> >>>>>> : >>>>>> >>>>>>> Hi Nandan: >>>>>>> >>>>>>> So, your system must provide these queries: >>>>>>> >>>>>>> 1 - SELECT video FROM ... WHERE actor = '...'; >>>>>>> 2 - SELECT video FROM ... WHERE producer = '...'; >>>>>>> 3 - SELECT video FROM ... WHERE music = '...'; >>>>>>> 4 - SELECT video FROM ... WHERE actor = '...' AND producer ='...'; >>>>>>> 5 - SELECT video FROM ... WHERE actor = '...' AND music = '...'; >>>>>>> 6 - SELECT video WHERE title CONTAINS 'Harry'; >>>>>>> >>>>>>> >>>>>>> For queries 1-5 you can get them with just cassandra, denormalizing >>>>>>> tables just the way your mentioned but without solr, just cassandra >>>>>>> (Indeed, just for equality clauses) >>>>>>> >>>>>>> video_by_actor; >>>>>>> video_by_producer; >>>>>>> video_by_music; >>>>>>> video_by_actor_and_producer; >>>>>>> video_by_actor_and_music; >>>>>>> >>>>>>> For queries number 6 you need a search engine. >>>>>>> >>>>>>> SOL >>>>>>> ElasticSearch >>>>>>> cassandra-lucene-index >>>>>>> <https://github.com/stratio/cassandra-lucene-index> >>>>>>> SASI >>>>>>> <http://docs.datastax.com/en/dse/5.1/cql/cql/cql_reference/cql_commands/cqlCreateCustomIndex.html> >>>>>>> >>>>>>> I think, just for your query, the easiest way to get it is to build >>>>>>> a SASI index. >>>>>>> TLDR: I work for stratio in cassandra-lucene-index but for your >>>>>>> basic query (only one dimension), SASI indexes will work for you. >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> Eduardo Alonso >>>>>>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta >>>>>>> 28224 Pozuelo de Alarcón, Madrid >>>>>>> Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com >>>>>>> // *@stratiobd <https://twitter.com/StratioBD>* >>>>>>> >>>>>>> 2017-06-12 9:50 GMT+02:00 @Nandan@ <nandanpriyadarshi...@gmail.com>: >>>>>>> >>>>>>>> But Condition is , I am working with Apache Cassandra Database in >>>>>>>> which I have to store my data into Cassandra and then have to implement >>>>>>>> partial search capability. >>>>>>>> If we need to search based on full search primary key, then it >>>>>>>> really best and easy to work with Cassandra , but in case of flexible >>>>>>>> search , I am getting confused. >>>>>>>> >>>>>>>> >>>>>>>> On Mon, Jun 12, 2017 at 3:47 PM, Oskar Kjellin < >>>>>>>> oskar.kjel...@gmail.com> wrote: >>>>>>>> >>>>>>>>> I haven't run solr with Cassandra myself. I just meant to run >>>>>>>>> elasticsearch as a completely separate service and write there as >>>>>>>>> well. >>>>>>>>> >>>>>>>>> On 12 Jun 2017, at 09:45, @Nandan@ <nandanpriyadarshi...@gmail.com> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>> Do you mean to use Elastic Search with Cassandra? >>>>>>>>> Even I am thinking to use Apache Solr With Cassandra. >>>>>>>>> In that case I have to create distributed tables such as:- >>>>>>>>> 1) video_by_title, video_by_actor, video_by_year etc.. >>>>>>>>> 2) After creating Tables , will have to configure solr core on all >>>>>>>>> tables. >>>>>>>>> >>>>>>>>> Is it like this ? >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Mon, Jun 12, 2017 at 3:19 PM, Oskar Kjellin < >>>>>>>>> oskar.kjel...@gmail.com> wrote: >>>>>>>>> >>>>>>>>>> Why not elasticsearch for this use case? It will make your life >>>>>>>>>> much simpler >>>>>>>>>> >>>>>>>>>> > On 12 Jun 2017, at 04:40, @Nandan@ < >>>>>>>>>> nandanpriyadarshi...@gmail.com> wrote: >>>>>>>>>> > >>>>>>>>>> > Hi, >>>>>>>>>> > >>>>>>>>>> > Currently, I am working on data modeling for Video Company in >>>>>>>>>> which we have different types of users as well as different user >>>>>>>>>> functionality. >>>>>>>>>> > But currently, my concern is about Search video module based on >>>>>>>>>> different fields. >>>>>>>>>> > >>>>>>>>>> > Query patterns are as below:- >>>>>>>>>> > 1) Select video by actor. >>>>>>>>>> > 2) select video by producer. >>>>>>>>>> > 3) select video by music. >>>>>>>>>> > 4) select video by actor and producer. >>>>>>>>>> > 5) select video by actor and music. >>>>>>>>>> > >>>>>>>>>> > Note: - In short, We want to establish an advanced search >>>>>>>>>> module by which we can search by anyway and get the desired results. >>>>>>>>>> > >>>>>>>>>> > During a search , we need partial search also such that if any >>>>>>>>>> user can search "Harry" title, then we are able to give them result >>>>>>>>>> as all >>>>>>>>>> videos whose >>>>>>>>>> > title contains "Harry" at any location. >>>>>>>>>> > >>>>>>>>>> > As per my ideas, I have to create separate tables such as >>>>>>>>>> video_by_actor, video_by_producer etc.. and implement solr query on >>>>>>>>>> all >>>>>>>>>> tables. Otherwise, >>>>>>>>>> > is there any others way by which we can implement this search >>>>>>>>>> module effectively. >>>>>>>>>> > >>>>>>>>>> > Please suggest. >>>>>>>>>> > >>>>>>>>>> > Best regards, >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> >