Fwd: Indexing Big Data With or Without Solr
Your previous mail was not sent to the mailing list, so I am forwarding it.

-- Forwarded message --
From: Vineet Mishra
Date: 2014-05-06 14:33 GMT+03:00
Subject: Re: Indexing Big Data With or Without Solr
To: Furkan KAMACI

Hi Furkan,

No, not the metadata; I am planning to store the sensor data itself. FYI, this is what the sensor data will look like:
http://www.freescale.com/webapp/sps/site/overview.jsp?code=SD_DATAFILEFORMAT

Moreover, you can assume the number of columns extends to 200 and the number of rows is around 1 lakh (100,000) per file, and I will have around 1 lakh such files.

Hardware spec: 6 Xeon Processor Machine, 2.13 GHz, 16GB, HDD - can extend, no issue.

Thanks!

On Tue, May 6, 2014 at 4:31 PM, Furkan KAMACI wrote:

> Hi Vineet;
>
> I remove such HTML tags and stop words (high-frequency terms are
> removed). However, sensor data and web data have different
> characteristics, you are right. Could you tell me what kind of
> information you store for each record (geolocation, name, description,
> etc.)? Also, could you tell me more about your hardware infrastructure?
>
> Thanks;
> Furkan KAMACI
>
> 2014-05-06 13:40 GMT+03:00 Vineet Mishra :
>
>> Hi Furkan,
>>
>> Indexing documents and indexing raw digital sensor data are completely
>> different. In your case, web documents will have many repeated tokens:
>> with web pages it is quite natural to have repeating words like *div,
>> span, title, style*, etc. That is an ideal case for Solr, since it gives
>> you the benefit of the inverted index. But if you look closely at my
>> requirement, I have sensor data; moreover, the data will be huge and
>> there is hardly any chance of repetition.
>>
>> What do you say, how suitable will it be?
>>
>> [Open Question - Expert advice needed]
>>
>> Thanks!
>>
>> On Tue, Apr 29, 2014 at 1:41 PM, Furkan KAMACI wrote:
>>
>>> Hi Vineet;
>>>
>>> Many millions of documents (web pages), with an average response time
>>> of less than 10 ms.
>>>
>>> Thanks;
>>> Furkan KAMACI
>>>
>>> 2014-04-29 10:55 GMT+03:00 Vineet Mishra :
>>>
>>>> Hi Furkan,
>>>>
>>>> Can you specify what type and size of data you have?
>>>> Moreover, what are your index size and query response time?
>>>>
>>>> Thanks
>>>> Vineet
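For a rough sense of scale, the figures quoted in this thread (about 1 lakh files, about 1 lakh rows per file, up to 200 columns, and the roughly 4 TB mentioned in the original post) can be combined in a back-of-envelope calculation. This is only a sketch: the function name `estimate_scale` is made up for illustration, one row is assumed to map to one indexed document, and 4 TB is taken as decimal terabytes of raw data, none of which the thread confirms.

```python
LAKH = 100_000  # 1 lakh = 100,000


def estimate_scale(files, rows_per_file, columns, total_bytes):
    """Return rough row/value counts and average sizes for the data set.

    Assumption: every file has the same number of rows and columns.
    """
    total_rows = files * rows_per_file
    total_values = total_rows * columns
    return {
        "total_rows": total_rows,
        "total_values": total_values,
        "bytes_per_row": total_bytes / total_rows,
        "bytes_per_value": total_bytes / total_values,
    }


# Figures taken from the thread; 4 TB interpreted as 4 * 10**12 bytes.
scale = estimate_scale(
    files=1 * LAKH,
    rows_per_file=1 * LAKH,
    columns=200,
    total_bytes=4 * 10**12,
)

for name, value in scale.items():
    print(f"{name}: {value:,}")
```

Under these assumptions the data set comes to about 10 billion rows, so the number of documents per shard is likely to matter at least as much as the raw size in terabytes.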
Re: Indexing Big Data With or Without Solr
mark.
Re: Indexing Big Data With or Without Solr
Thanks, Vineet.

With Regards
Aman Tandon
Re: Indexing Big Data With or Without Solr
I did it with Tomcat and a ZooKeeper ensemble; I will mail you the steps shortly.

Cheers
Re: Indexing Big Data With or Without Solr
Vineet, please share the steps after you set up SolrCloud.
Are you using Jetty or Tomcat?
Re: Indexing Big Data With or Without Solr
Thanks Furkan, I will definitely give it a try then.

Thanks again!
Re: Indexing Big Data With or Without Solr
Hi Vineet;

I've been using SolrCloud for this kind of Big Data and I think you should consider using it. If you have any problems, you can ask here.

Thanks;
Furkan KAMACI
Indexing Big Data With or Without Solr
Hi All,

I have worked with Solr 3.5 to implement real-time search on some 100 GB of data. That worked fine but was a little slow on complex queries (multiple group/join queries).
Now I want to index some real Big Data (around 4 TB or even more). Can SolrCloud be a solution for it? If not, what would be the best possible solution in this case?

*Stats for the previous implementation:*
It was a master-slave architecture with multiple standalone instances of Solr 3.5. There were around 12 Solr instances running on different machines.

*Things to consider for the next implementation:*
Since all the data is sensor data, duplication and uniqueness of the values are factors to consider.

*Really urgent, please take this up on priority with a set of feasible solutions.*

Regards
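The concern about repetition raised in this thread can be illustrated with a toy inverted index. The sketch below is not how Solr or Lucene is implemented; it simply maps each token to the list of documents containing it, and the sample documents and helper names are invented. It also treats each sensor reading as a plain string token, which is a simplification.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each distinct token to the sorted list of doc ids containing it."""
    index = defaultdict(list)
    for doc_id, tokens in enumerate(docs):
        for token in set(tokens):  # count each token once per document
            index[token].append(doc_id)
    return index


def average_postings_length(index):
    """Average number of documents per distinct term."""
    return sum(len(postings) for postings in index.values()) / len(index)


# Invented sample data: web-like documents share tokens, sensor-like ones do not.
web_docs = [
    ["div", "span", "title"],
    ["div", "span", "style"],
    ["div", "title", "style"],
]
sensor_docs = [
    ["0.1137", "0.2291"],
    ["0.4410", "0.9823"],
    ["0.5562", "0.7308"],
]

for name, docs in (("web", web_docs), ("sensor", sensor_docs)):
    index = build_inverted_index(docs)
    print(name, "distinct terms:", len(index),
          "avg postings per term:", average_postings_length(index))
```

With shared tokens, a few terms point at many documents; with unique readings, the term dictionary grows roughly with the data itself and each term points at a single document. That is the trade-off the question about sensor data is getting at.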