Re: is commit a sequential process in solr indexing

2012-05-22 Thread findbestopensource
Yes. Lucene / Solr supports a multi-threaded environment. You could commit from two different threads to the same core or to different cores. Regards Aditya www.findbestopensource.com On Tue, May 22, 2012 at 12:35 AM, jame vaalet wrote: > hi, > my use case here is to search all the incoming documents
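A minimal SolrJ sketch of the point above, assuming a hypothetical core at http://localhost:8983/solr/mycore and a schema with an id field: two threads add documents and commit to the same core concurrently.

```java
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class ConcurrentCommit {
    public static void main(String[] args) throws Exception {
        // Hypothetical core URL; adjust to your deployment.
        SolrClient solr = new HttpSolrClient.Builder("http://localhost:8983/solr/mycore").build();

        Runnable indexer = () -> {
            try {
                SolrInputDocument doc = new SolrInputDocument();
                doc.addField("id", Thread.currentThread().getName() + "-" + System.nanoTime());
                solr.add(doc);
                solr.commit();   // commits issued from different threads are handled by Solr
            } catch (Exception e) {
                e.printStackTrace();
            }
        };

        Thread t1 = new Thread(indexer, "indexer-1");
        Thread t2 = new Thread(indexer, "indexer-2");
        t1.start(); t2.start();
        t1.join(); t2.join();
        solr.close();
    }
}
```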

Re: System requirements in my case?

2012-05-22 Thread findbestopensource
www.findbestopensource.com On Tue, May 22, 2012 at 2:36 PM, Bruno Mannina wrote: > My choice: > http://www.ovh.com/fr/serveurs_dedies/eg_best_of.xml > > 24 GB DDR3 > > On 22/05/2012 10:26, findbestopensource wrote: > > Dedica

Re: Multicore Solr

2012-05-22 Thread findbestopensource
Having a core per user is not a good idea. The count is too high. Keep everything in a single core. You could filter the data based on user name or user id. Regards Aditya www.findbestopensource.com On Tue, May 22, 2012 at 2:29 PM, Shanu Jha wrote: > Hi all, > > greetings from my end. This is my f
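A sketch of the single-core approach, with hypothetical core, field, and user-id values: every document carries a user_id field, and each search is restricted to the current user with a filter query.

```java
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class PerUserSearch {
    public static void main(String[] args) throws Exception {
        SolrClient solr = new HttpSolrClient.Builder("http://localhost:8983/solr/documents").build();

        SolrQuery q = new SolrQuery("laptop");        // the user's search terms
        q.addFilterQuery("user_id:12345");            // restrict results to one user's documents
        QueryResponse rsp = solr.query(q);
        rsp.getResults().forEach(doc -> System.out.println(doc.getFieldValue("id")));
        solr.close();
    }
}
```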

Re: Strategy for maintaining De-normalized indexes

2012-05-22 Thread findbestopensource
That's how de-normalization works. You need to update all child products. If you just need the count and you are using facets, then maintain a map between category and main product, and between main product and child product. A Lucene index has no schema. You could retrieve the data based on its type. Category reco

Re: System requirements in my case?

2012-05-22 Thread findbestopensource
A dedicated server may not be required. If you want to cut down cost, then prefer a shared server. How much RAM? Regards Aditya www.findbestopensource.com On Tue, May 22, 2012 at 12:36 PM, Bruno Mannina wrote: > Dear Solr users, > > My company would like to use solr to index around 80 000 000

Re: Fault tolerant Solr replication architecture

2012-05-21 Thread findbestopensource
Hi Parvin, A fault-tolerant architecture is something you need to decide on based on your requirements. At some point, manual intervention may be required to recover from a crash. You need to decide what degree of fault tolerance you can support; it certainly may not be 100%. We could handle sit

Re: curl or nutch

2012-05-16 Thread findbestopensource
You could very well use Solr. It has support for indexing PDF and XML files. If you want to index websites and search using PageRank, then choose Nutch. Regards Aditya www.findbestopensource.com On Wed, May 16, 2012 at 1:13 PM, Tolga wrote: > Hi, > > I have been trying for a week. I really wan
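A hedged SolrJ sketch of indexing a PDF through Solr's extracting request handler (Solr Cell); the core URL, file name, and literal.id value are invented for illustration, and older SolrJ releases expose a slightly different addFile signature.

```java
import java.io.File;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;

public class IndexPdf {
    public static void main(String[] args) throws Exception {
        SolrClient solr = new HttpSolrClient.Builder("http://localhost:8983/solr/docs").build();

        // /update/extract is Solr's extracting request handler (Solr Cell / Tika).
        ContentStreamUpdateRequest req = new ContentStreamUpdateRequest("/update/extract");
        req.addFile(new File("manual.pdf"), "application/pdf");
        req.setParam("literal.id", "manual-1");      // assign our own unique id
        req.setParam("commit", "true");
        solr.request(req);
        solr.close();
    }
}
```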

Re: authentication for solr admin page?

2012-05-15 Thread findbestopensource
I have written an article on this, covering the various steps to restrict / authenticate access to the Solr admin interface: http://www.findbestopensource.com/article-detail/restrict-solr-admin-access Regards Aditya www.findbestopensource.com On Thu, Mar 29, 2012 at 1:06 AM, geeky2 wrote: > update - > > ok - i was

Re: Search Issue

2012-01-11 Thread findbestopensource
While indexing, @ is removed. You need to use your own Tokenizer which will treat "@rohit" as one word. Another option is to break the tweet into two fields, @ and the tweet. Index both fields, but don't use any tokenizer for the "@" field; just index it as it is. While querying you need to se
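One way to realize the "own Tokenizer" suggestion is to subclass Lucene's CharTokenizer and treat '@' as a token character, so "@rohit" survives as a single term. A minimal sketch only: the CharTokenizer package and constructor requirements vary across Lucene versions (this form assumes Lucene 5+, where the no-arg constructor exists).

```java
import org.apache.lucene.analysis.util.CharTokenizer; // package differs in some Lucene versions

/**
 * Keeps "@rohit" together as a single token by treating '@' (and '_') as token characters.
 * Everything else that is not a letter or digit still acts as a token separator.
 */
public class MentionTokenizer extends CharTokenizer {
    @Override
    protected boolean isTokenChar(int c) {
        return Character.isLetterOrDigit(c) || c == '@' || c == '_';
    }
}
```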

Large data set or data corpus

2012-01-11 Thread findbestopensource
Hello all, Recently I saw a couple of discussions in a LinkedIn group about generating a large data set or data corpus. I have compiled the same into an article. Hope it would be helpful. If you have any other links where we could get large data sets for free, please reply to this mail thread; I will up

Re: How can i use Solr based Search Engine for My University?

2011-05-06 Thread findbestopensource
Hello Anurag, Google is always there for internet search; you need to support search for your university. My opinion would be: don't crawl the sites. You require only Solr and not Nutch. 1. Provide an interface for university students to upload documents. The documents could be previous ye

Re: Thoughts on Search Analytics?

2011-05-06 Thread findbestopensource
1. Reports based on location, grouped by city / country 2. Total searches performed per hour / week / month 3. Frequently used search keywords 4. Analytics based on search keywords. Regards Aditya www.findbestopensource.com On Fri, May 6, 2011 at 3:55 AM, Otis Gospodnetic wrote: > Hi, > > I'd like

Re: Is it possible to use sub-fields or multivalued fields for boosting?

2011-05-04 Thread findbestopensource
Hello deniz, You could create a new field, say FullName, which is a combination of firstname and surname (e.g. populated via copyField). Search on both the new field and location, but boost the new-field query. Regards Aditya www.findbestopensource.com On Thu, May 5, 2011 at 9:21 AM, deniz wrote: > okay... let me make the situation
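A hedged SolrJ sketch of the boosting part, assuming hypothetical core and field names and that FullName is copied from firstname and surname: the edismax qf parameter searches both fields and boosts matches on FullName.

```java
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;

public class BoostedNameSearch {
    public static void main(String[] args) throws Exception {
        SolrClient solr = new HttpSolrClient.Builder("http://localhost:8983/solr/people").build();

        SolrQuery q = new SolrQuery("john smith");
        q.set("defType", "edismax");              // use the extended dismax query parser
        q.set("qf", "FullName^3 location");       // search both fields, boost FullName matches
        System.out.println(solr.query(q).getResults().getNumFound());
        solr.close();
    }
}
```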

Re: [ANNOUNCE] Web Crawler

2011-03-02 Thread findbestopensource
Hello Dominique Bejean, Good job. We identified almost 8 open source web crawlers: http://www.findbestopensource.com/tagged/webcrawler I don't know how yours differs from the rest. Your license states that it is not open source but is free for personal use. Regards Aditya ww

Re: Does Solr supports indexing & search for Hebrew.

2011-01-18 Thread findbestopensource
You may need to use a Hebrew analyzer. http://www.findbestopensource.com/search/?query=hebrew Regards Aditya www.findbestopensource.com On Tue, Jan 18, 2011 at 2:34 PM, prasad deshpande < prasad.deshpand...@gmail.com> wrote: > Hello, > > With reference to below links I haven't found Hebrew suppo

Re: Spatial Search - Best choice ?

2010-07-15 Thread findbestopensource
Some more pointers to spatial search, http://www.jteam.nl/products/spatialsolrplugin.html http://code.google.com/p/spatial-search-lucene/ http://sujitpal.blogspot.com/2008/02/spatial-search-with-lucene.html Regards Aditya www.findbestopensource.com On Thu, Jul 15, 2010 at 3:54 PM, Saïd Radhoua

Re: Cache full text into memory

2010-07-14 Thread findbestopensource
s). > > 2010/7/14 findbestopensource : > > I have just provided you two options. Since you already store as part of the index, you could try external caching. Try using ehcache / Membase http://www.findbestopensource.com/tagged/distributed-caching . The

Re: Cache full text into memory

2010-07-14 Thread findbestopensource
ad more into memory, I want to compress it "in memory". I don't care much about disk space so whether or not it's compressed in lucene . > > 2010/7/14 findbestopensource : > > You have two options > > 1. Store the compressed text as part of store

Re: Cache full text into memory

2010-07-14 Thread findbestopensource
You have two options: 1. Store the compressed text as part of a stored field in Solr. 2. Use external caching: http://www.findbestopensource.com/tagged/distributed-caching You could use ehcache / Memcached / Membase. The problem with external caching is that you need to synchronize the deletions and
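A rough sketch of the external-caching option and its synchronization caveat, using a plain in-process LRU map as a stand-in for ehcache / Memcached / Membase; the class and method names are illustrative only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Stand-in for an external cache holding the full text of documents keyed by Solr
 * unique id. The important part is the delete hook: whenever a document is removed
 * from the index, evict it here too, otherwise the cache and the index drift apart.
 */
public class FullTextCache {
    private static final int MAX_ENTRIES = 10_000;

    private final Map<String, String> cache =
            new LinkedHashMap<String, String>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                    return size() > MAX_ENTRIES;   // simple LRU eviction
                }
            };

    public synchronized void put(String docId, String fullText) { cache.put(docId, fullText); }
    public synchronized String get(String docId)                { return cache.get(docId); }

    /** Call this from the same code path that deletes the document from Solr. */
    public synchronized void onDocumentDeleted(String docId)    { cache.remove(docId); }
}
```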

Re: Use of EmbeddedSolrServer

2010-06-11 Thread findbestopensource
Refer to http://wiki.apache.org/solr/Solrj#EmbeddedSolrServer Regards Aditya www.findbestopensource.com On Fri, Jun 11, 2010 at 2:25 PM, Robert Naczinski < robert.naczin...@googlemail.com> wrote: > Hello experts, > > we would like to use Solr in our search application. We want to index > a large i
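A small sketch in the spirit of that wiki page, using a newer CoreContainer/EmbeddedSolrServer API (the Solr 1.4-era wiki example used CoreContainer.Initializer instead); the solr home path and core name are placeholders.

```java
import java.nio.file.Paths;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.core.CoreContainer;

public class EmbeddedExample {
    public static void main(String[] args) throws Exception {
        // Placeholder solr home and core name; the CoreContainer API has changed
        // across Solr releases, so check the version you are running against.
        CoreContainer container = CoreContainer.createAndLoad(Paths.get("/path/to/solr/home"));
        SolrClient solr = new EmbeddedSolrServer(container, "core1");

        System.out.println(solr.query(new SolrQuery("*:*")).getResults().getNumFound());
        solr.close();
    }
}
```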

Re: Indexing link targets in HTML fragments

2010-06-07 Thread findbestopensource
Could you tell us the schema you use for indexing? In my opinion, using StandardAnalyzer / SnowballAnalyzer will do the best; they will not break the URLs. Add href and other related HTML tags as stop words and they will be removed while indexing. Regards Aditya www.findbestopensource.com On

Re: logic for auto-index

2010-06-02 Thread findbestopensource
You need to schedule your task. Check out the schedulers available in all programming languages: http://www.findbestopensource.com/tagged/job-scheduler Regards Aditya www.findbestopensource.com On Wed, Jun 2, 2010 at 2:39 PM, Jonty Rhods wrote: > Hi Peter, > > actually I want the index process

Re: Query Question

2010-06-02 Thread findbestopensource
What analyzer are you using to index and search? Check out schema.xml. You are currently using an analyzer which breaks up the words. If you don't want them broken up, then you need to use a field type that does not tokenize (for example, one based on KeywordTokenizer). Regards Aditya www.findbestopensource.com On Wed, Jun 2, 2010 at 2:41 PM, M.Rizwan wrote: > Hi, > > I have solr 1.4.

Re: newbie question on how to batch commit documents

2010-05-31 Thread findbestopensource
Add the commit after the loop. I would advise doing the commit in a separate thread. I keep a separate timer thread, where every minute I do a commit, and at the end of every day I optimize the index. Regards Aditya www.findbestopensource.com On Tue, Jun 1, 2010 at 2:57 AM, Steve Kuo wrote: >
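A sketch of this pattern with SolrJ and a ScheduledExecutorService, assuming a hypothetical core URL: index the batch, commit once after the loop, then let a timer thread commit every minute and optimize daily.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class BatchIndexer {
    public static void main(String[] args) throws Exception {
        SolrClient solr = new HttpSolrClient.Builder("http://localhost:8983/solr/mycore").build();

        // Index the whole batch first, commit once at the end instead of per document.
        for (int i = 0; i < 10_000; i++) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "doc-" + i);
            solr.add(doc);
        }
        solr.commit();

        // Separate timer thread: commit every minute, optimize once a day.
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleAtFixedRate(() -> {
            try { solr.commit(); } catch (Exception e) { e.printStackTrace(); }
        }, 1, 1, TimeUnit.MINUTES);
        timer.scheduleAtFixedRate(() -> {
            try { solr.optimize(); } catch (Exception e) { e.printStackTrace(); }
        }, 1, 1, TimeUnit.DAYS);
    }
}
```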

Re: Using solrJ to get all fields in a particular schema/index

2010-05-25 Thread findbestopensource
go about that? > > Regards, > Raakhi > > > On Tue, May 25, 2010 at 5:07 PM, findbestopensource < > findbestopensou...@gmail.com> wrote: > > > Resending it as there is a typo error. > > > > To retrieve all documents, you need to use the query/filter

Re: Using solrJ to get all fields in a particular schema/index

2010-05-25 Thread findbestopensource
Resending it as there was a typo error. To retrieve all documents, you need to use the query/filter FieldName:*:* . Regards Aditya www.findbestopensource.com On Tue, May 25, 2010 at 4:29 PM, findbestopensource < findbestopensou...@gmail.com> wrote: > To retrieve all documents, You ne

Re: Using solrJ to get all fields in a particular schema/index

2010-05-25 Thread findbestopensource
To retrieve all documents, you need to use the query/filter FieldName:*:* Regards Aditya www.findbestopensource.com On Tue, May 25, 2010 at 4:14 PM, Rakhi Khatwani wrote: > Hi, > Is there any way to get all the fields (irrespective of whether > it contains a value or null) in solrDocu
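For reference, the standard match-all query in Solr is *:*. A minimal SolrJ sketch (core name hypothetical) that pages through all documents; note that only stored fields come back, and fields with no value are simply absent from each returned SolrDocument.

```java
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrDocument;

public class FetchAll {
    public static void main(String[] args) throws Exception {
        SolrClient solr = new HttpSolrClient.Builder("http://localhost:8983/solr/mycore").build();

        SolrQuery q = new SolrQuery("*:*");   // match-all query
        q.setRows(100);                       // page size; loop with setStart() to walk the whole index
        for (SolrDocument doc : solr.query(q).getResults()) {
            System.out.println(doc.getFieldNames() + " -> " + doc);
        }
        solr.close();
    }
}
```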

Re: Personalized Search

2010-05-20 Thread findbestopensource
Hi Rih, Are you going to include either of the two fields, "bought" or "like", per member/visitor, OR a unique field per member / visitor? If one or two common fields are included, then there will not be any impact on performance. If you want to include a unique field, then you need to consider multi
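A sketch of the shared-field approach, with hypothetical core and field names: a multi-valued liked_by field stores the ids of members who liked a product, and a filter query (or a boost query) personalizes results for the current member.

```java
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class PersonalizedSearch {
    public static void main(String[] args) throws Exception {
        SolrClient solr = new HttpSolrClient.Builder("http://localhost:8983/solr/products").build();

        // One shared multi-valued field holds the ids of members who "liked" the product.
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "product-42");
        doc.addField("liked_by", "member-7");
        doc.addField("liked_by", "member-19");
        solr.add(doc);
        solr.commit();

        // Filter (or boost) products the current member already liked.
        SolrQuery q = new SolrQuery("camera");
        q.addFilterQuery("liked_by:member-7");
        System.out.println(solr.query(q).getResults().getNumFound());
        solr.close();
    }
}
```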

Re: Moving from Lucene to Solr?

2010-05-19 Thread findbestopensource
Hi Peter, You need to use Lucene: - To have more control - When you cannot depend on any web server - To use termvector, termdocs etc. - To easily extend it with your own Analyzer You need to use Solr: - To index and search docs easily by writing little code - Solr is a standal

Re: Solr Deployment Question

2010-05-13 Thread findbestopensource
ot;? No queries on the masters. > Only one index is being processed/optimized. > > Also, if I may add to my same question, how can I find the > amount of memory that an index would use, theoretically? > i.e.: Is there a formulae etc? > > Thanks > Madu > > >

Re: Solr Deployment Question

2010-05-13 Thread findbestopensource
You may use one index at a time, but both indexes are active and have loaded all their terms into memory. Memory consumption will certainly be higher. Regards Aditya http://www.findbestopensource.com On Fri, May 14, 2010 at 10:28 AM, Maduranga Kannangara < mkannang...@infomedia.com.au> wrote: > Hi > > We u

Re: multi-valued associated fields

2010-05-12 Thread findbestopensource
Hello Eric, Certainly it is possible. I would strongly advise having a field which differentiates the record type (RECORD_TYPE:"CAR" / "PROPERTY"). >>In general I was also wondering how Solr developers implement websites that use tag filters. For example, a user clicks on "Hard drives" then get ta
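A hedged SolrJ sketch of the record-type discriminator plus the tag-filter pattern; the core name, RECORD_TYPE value, and tag field are invented for illustration. A click on a tag such as "Hard drives" simply adds another filter query, and faceting on the tag field supplies the remaining drill-down counts.

```java
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.FacetField;
import org.apache.solr.client.solrj.response.QueryResponse;

public class TagFilterSearch {
    public static void main(String[] args) throws Exception {
        SolrClient solr = new HttpSolrClient.Builder("http://localhost:8983/solr/listings").build();

        SolrQuery q = new SolrQuery("*:*");
        q.addFilterQuery("RECORD_TYPE:PRODUCT");  // discriminate record types within one index
        q.addFilterQuery("tag:\"Hard drives\"");  // what a click on a tag filter translates to
        q.setFacet(true);
        q.addFacetField("tag");                   // remaining tag counts for further drill-down

        QueryResponse rsp = solr.query(q);
        for (FacetField.Count c : rsp.getFacetField("tag").getValues()) {
            System.out.println(c.getName() + " (" + c.getCount() + ")");
        }
        solr.close();
    }
}
```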

Re: "Solr 1.4 Enterprise Search Server" book examples

2010-04-27 Thread findbestopensource
I downloaded the 5883_Code.zip file but was not able to extract the complete contents. Regards Aditya www.findbestopensource.com On Tue, Apr 27, 2010 at 12:45 AM, Johan Cwiklinski < johan.cwiklin...@ajlsm.com> wrote: > Hello, > > On 26/04/2010 at 20:53, findbestopensource wrote:

Re: "Solr 1.4 Enterprise Search Server" book examples

2010-04-26 Thread findbestopensource
I was able to successfully download the code. It is 360 MB and took a long time to download. https://www.packtpub.com/solr-1-4-enterprise-search-server/book Select the "download the code" link and provide your email id; the download link will be sent via email. Regards Aditya www.findbestopensource.

Re: hybrid approach to using cloud servers for Solr/Lucene

2010-04-25 Thread findbestopensource
Hello Dennis >>If the load goes up, then queries are sent to the cloud at a certain point. My advice is to do load balancing between local and cloud. Your local system seems to be capable, as it is a dedicated host. Another option is to do the indexing locally and sync it with the cloud. The cloud will be on

Re: Best Open Source

2010-04-22 Thread findbestopensource
Thank you Dave and Michael for your feedback. We are currently in beta and will fix these issues soon. Regards Aditya www.findbestopensource.com On Tue, Apr 20, 2010 at 3:01 PM, Michael Kuhlmann < michael.kuhlm...@zalando.de> wrote: > Nice site. Really! > > In addition to Dave: > How do