Solr for real time analytics system

2016-02-03 Thread Rohit Kumar
Hi

I am quite new to Solr. I have to build a real-time analytics system which
displays metrics based on multiple filters over a huge data set (~50 million
documents with ~100 fields). I would mostly need aggregation queries like
sum/average/groupby etc., and they need to be very fast despite the size of
the data set.

Is Solr suitable for such use cases?

Thanks
Rohit


Re: Solr for real time analytics system

2016-02-03 Thread CKReddy Bhimavarapu
Hello Rohit,

You can use the Banana project, which was forked from Kibana and works with
all kinds of time series (and non-time series) data stored in Apache Solr.
It uses Kibana's powerful dashboard configuration capabilities, ports key
panels to work with Solr, and provides significant additional capabilities,
including new panels that leverage D3.js.

> I would need mostly aggregation queries like sum/average/groupby etc, but
> data set is quite huge. The aggregation queries should be very fast.


All your requirements can be served by Banana, but I'm not sure how fast
Solr is compared to ELK.

-- 
ckreddybh. 


Re: Solr for real time analytics system

2016-02-04 Thread Arkadiusz Robiński
A few people have built real-time analytics systems with Solr and talked
about them at conferences. Maybe you'll find their presentations useful:
https://www.youtube.com/results?search_query=solr%20real%20time%20analytics&oq=&gs_l=
(esp. the first one: https://www.youtube.com/watch?v=PkoyCxBXAiA )

-- 
Arkadiusz Robiński
Software Developer
Otodom.pl


Re: Solr for real time analytics system

2016-02-04 Thread Rohit Kumar
Thanks Bhimavarapu for the information.

We are creating our own dashboard, so we probably won't need Kibana/Banana. I
was more curious about Solr's support for fast aggregation queries over very
large data sets. As suggested, I guess Elasticsearch has this capability.
Are there any published metrics or data on Elasticsearch/Solr
performance in this area that I can refer to?

Thanks
Rohit





Re: Solr for real time analytics system

2016-02-04 Thread Susheel Kumar
Hi Rohit,

Please take a look at Streaming Expressions and the Parallel SQL Interface.
They should meet many of your analytics requirements (aggregation queries
like sum/average/groupby etc.).
https://cwiki.apache.org/confluence/display/solr/Streaming+Expressions
https://cwiki.apache.org/confluence/display/solr/Parallel+SQL+Interface
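
To give a rough idea of what a request to the Parallel SQL Interface might
look like, here is a minimal sketch that only builds the request URL (it does
not contact a server). The host, the collection name "events" and the fields
"country", "revenue" and "device" are hypothetical and not from this thread;
the /sql handler with its "stmt" and "aggregationMode" parameters is as I
understand it from the reference guide linked above, so please check it
against your Solr version.

```python
from urllib.parse import urlencode

# Hypothetical values for illustration only; adjust to your own setup.
SOLR_BASE = "http://localhost:8983/solr"
COLLECTION = "events"


def sql_request_url(stmt, aggregation_mode="facet"):
    """Build a URL for the collection's /sql handler.

    'stmt' carries the SQL statement; 'aggregationMode' selects how the
    GROUP BY is executed (assumed values: 'facet' or 'map_reduce').
    """
    params = urlencode({"stmt": stmt, "aggregationMode": aggregation_mode})
    return f"{SOLR_BASE}/{COLLECTION}/sql?{params}"


# A filtered group-by with sum and average, the kind of query asked about.
stmt = (
    "SELECT country, sum(revenue), avg(revenue) "
    f"FROM {COLLECTION} "
    "WHERE device = 'mobile' "
    "GROUP BY country"
)

url = sql_request_url(stmt)
print(url)
```

The resulting URL can be fetched with any HTTP client. Whether this is fast
enough for ~50 million documents is something you would have to measure on
your own data.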

Thanks,
Susheel
