Re: Search in Solr Index

2015-04-20 Thread Yavar Husain
There might be an issue with your default search field. If you are searching a field named MyTestField, give your query as MyTestField:Birmingham and see if you get any results. As Matt suggested, there might also be issues with the way you have done tokenization/analysis.
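
A minimal sketch of the fielded query suggested above, written in Python with only the standard library. The field name MyTestField and the term Birmingham come from the message; the `wt=json` parameter and the helper name `fielded_query` are assumptions added for illustration.

```python
from urllib.parse import urlencode


def fielded_query(field, term):
    """Build a 'field:term' query so the default search field is bypassed."""
    return f"{field}:{term}"


# wt=json is an assumption; it only selects the response format.
params = {"q": fielded_query("MyTestField", "Birmingham"), "wt": "json"}

# urlencode percent-encodes the colon, which is safe to send to Solr.
query_string = urlencode(params)
print(query_string)
```

If this query returns hits while a bare `q=Birmingham` does not, the default search field is the likely culprit rather than the index contents.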

Re: search by person name

2015-04-20 Thread Yavar Husain
In this case q=name:(ana jose) will work, but if the search runs against a full-text field it might have poor recall. It will also match a document like "San Jose is better than Santa Ana", which was not the user's intent. Erick's solution, "ana jose"~2, captures the intent too.
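
The difference between the two queries can be illustrated with a toy model in Python. This is a simplified sketch, not Solr's implementation: it assumes whitespace tokenization and lowercasing, treats name:(ana jose) as requiring both terms, and approximates a two-term sloppy phrase by the number of position moves needed to line the terms up. All function names are hypothetical.

```python
def tokens(text):
    """Assumed analysis chain: lowercase, split on whitespace."""
    return text.lower().split()


def matches_all_terms(doc, terms):
    """Bag-of-words match: every term occurs somewhere in the document."""
    words = set(tokens(doc))
    return all(term in words for term in terms)


def matches_within_slop(doc, first, second, slop):
    """Simplified two-term proximity match.

    An exact phrase has `second` one position after `first` (0 moves);
    the reversed adjacent order costs 2 moves.
    """
    words = tokens(doc)
    firsts = [i for i, w in enumerate(words) if w == first]
    seconds = [i for i, w in enumerate(words) if w == second]
    return any(abs(p1 - (p2 - 1)) <= slop for p1 in firsts for p2 in seconds)


docs = ["Ana Jose", "Jose Ana", "San Jose is better than Santa Ana"]
bag = [matches_all_terms(d, ["ana", "jose"]) for d in docs]
near = [matches_within_slop(d, "ana", "jose", 2) for d in docs]
print(bag)
print(near)
```

Under these assumptions the term match accepts all three documents, while the slop-2 match accepts both name orders and rejects the San Jose sentence.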

Re: What is the best way of Indexing different formats of documents?

2015-04-07 Thread Yavar Husain
Well, I have indexed heterogeneous sources, including a variety of NoSQL stores, RDBMSs and rich documents (PDF, Word, etc.), using SolrJ. The only prerequisite for using SolrJ is that you have an API to fetch data from your data source (say, JDBC for an RDBMS, Tika for extracting text content from rich
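
The fetch-then-index pattern described above can be sketched as follows. This is an illustration in Python rather than SolrJ, with an in-memory SQLite database standing in for the RDBMS; the table, column names and sample rows are all hypothetical. It only builds the JSON array of documents that a client would then send to Solr and does not contact a server.

```python
import json
import sqlite3

# Stand-in data source (assumption): an in-memory table with two rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (id TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO books VALUES (?, ?)",
    [("1", "Harry Potter"), ("2", "Avengers")],
)


def rows_to_docs(connection):
    """Fetch rows through the source's own API and map each to a document."""
    cursor = connection.execute("SELECT id, title FROM books ORDER BY id")
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor]


docs = rows_to_docs(conn)
payload = json.dumps(docs)  # body a client could post to an update handler
print(payload)
```

The same shape applies to other sources: only `rows_to_docs` changes, while the step that ships documents to the index stays the same.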

Information Retrieval/Text Mining opportunity @ GE Research Data Mining Labs, Bangalore

2015-03-25 Thread Yavar Husain
limited or no experience with the areas mentioned above but is passionate about Information Retrieval/Text Mining and has a rock-solid background in algorithms is encouraged to apply/connect. Check out more on GE Research: http://www.geglobalresearch.com/ Cheers, Yavar Husain Lead Data Scientist

Pattern for extracting text from a rich document and an associated metadata file

2015-03-04 Thread Yavar Husain
What is the best pattern to index the following kind of data: HarryPotter.PDF, HarryPotter.txt, Avengers.Docx, Avengers.txt? For each of the above files, the metadata lies in the text file that has the same name as the rich document (as can be seen above). (1) Now the brute force method that I can think
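
One way to express the naming convention described above is to pair files by base name before extracting anything. The Python sketch below is an assumption about how such pairing could look, not the approach proposed in the thread; `pair_by_stem` is a hypothetical helper, and it treats every `.txt` file as metadata.

```python
from pathlib import PurePath


def pair_by_stem(filenames, metadata_suffix=".txt"):
    """Map each base name to (rich document, metadata file or None)."""
    rich, meta = {}, {}
    for name in filenames:
        path = PurePath(name)
        if path.suffix.lower() == metadata_suffix:
            meta[path.stem] = name
        else:
            rich[path.stem] = name
    return {stem: (rich[stem], meta.get(stem)) for stem in rich}


pairs = pair_by_stem(
    ["HarryPotter.PDF", "HarryPotter.txt", "Avengers.Docx", "Avengers.txt"]
)
print(pairs)
```

Each pair can then become a single index document: text extracted from the rich file plus fields parsed from its metadata file.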

Re: Is Solr best for did you mean functionality just like Google?

2015-02-24 Thread Yavar Husain
Solr is an IR system in which spell correction is a topping, whereas Google has a team dedicated just to spell correction. "Did you mean" (a more general term, and much broader than basic spell correctors) and spell correctors require a plethora of skills. I will just discuss spell correctors here and not

Solr Date Range not returning results for last 1 month

2014-12-23 Thread Yavar Husain
So my Solr date range query is as follows: facet.range=date&facet.range.start=NOW/DAY-36MONTH&facet.range.end=NOW/DAY&facet.range.gap=%2B1MONTH I need facets for the past 36 months (3 years), and everything is fine except that data is not being returned for the last month. However, the facets I am getting for
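
The range-facet parameters above can be assembled programmatically so that the plus sign in the gap is always percent-encoded. This is a minimal Python sketch using the standard library; the parameter values are taken from the message, and nothing here contacts a Solr server.

```python
from urllib.parse import parse_qs, urlencode

facet_params = {
    "facet.range": "date",
    "facet.range.start": "NOW/DAY-36MONTH",
    "facet.range.end": "NOW/DAY",
    "facet.range.gap": "+1MONTH",  # raw value; urlencode turns + into %2B
}

query_string = urlencode(facet_params)
print(query_string)

# Round trip: the encoded gap decodes back to +1MONTH ...
decoded_gap = parse_qs(query_string)["facet.range.gap"]
# ... whereas a raw, unencoded + in a query string decodes as a space.
raw_gap = parse_qs("facet.range.gap=+1MONTH")["facet.range.gap"]
print(decoded_gap, raw_gap)
```

Letting a URL library do the encoding avoids hand-writing `%2B` and makes it harder to send a gap that the server reads as ` 1MONTH`.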

Re: Solr Date Range not returning results for last 1 month

2014-12-23 Thread Yavar Husain
of the plus sign... Best, Erick On Tue, Dec 23, 2014 at 9:55 PM, Yavar Husain yavarhus...@gmail.com wrote: So my Solr date range query is as follows: facet.range=date&facet.range.start=NOW/DAY-36MONTH&facet.range.end=NOW/DAY&facet.range.gap=%2B1MONTH I need facets for past 36 months or 3

Solr Clustering component different results than Carrot workbench

2014-08-18 Thread Yavar Husain
Though I am interacting with Dawid (creator of Carrot2) on the Carrot2 mailing list, I wanted to post my problem to a wider audience. I am using Solr 4.7 (on both Windows and Linux) and saved my lingo-attributes.xml file from the workbench, which I am using in Solr. Note that for testing I

Data Import Handler - resource not found - Jetty - Windows 7

2014-07-25 Thread Yavar Husain
Most of my experience is with Solr on Tomcat; however, I recently started with Jetty. I am using Solr 4.7.0 on Windows 7. I have configured Solr properly and am able to see the admin UI as well as the Velocity browse page. The DataImportHandler screen is also displayed. However when I do a full

Re: Solr Cassandra MySQL Best Practice Indexing

2014-07-22 Thread Yavar Husain
in that Solr-enabled Cassandra data center just the same as with normal Solr. -- Jack Krupansky -Original Message- From: Yavar Husain Sent: Monday, July 21, 2014 8:37 AM To: solr-user@lucene.apache.org Subject: Solr Cassandra MySQL Best Practice Indexing So my full text data lies

Re: Solr Cassandra MySQL Best Practice Indexing

2014-07-22 Thread Yavar Husain
rather than a robust architecture. -- Jack Krupansky -Original Message- From: Yavar Husain Sent: Tuesday, July 22, 2014 2:22 AM To: solr-user@lucene.apache.org Subject: Re: Solr Cassandra MySQL Best Practice Indexing Thanks Jack for your guidance on DSE. However it would be great

Solr Cassandra MySQL Best Practice Indexing

2014-07-21 Thread Yavar Husain
So my full-text data lies in Cassandra along with an ID. I also have a lot of structured data linked to that ID, which lies in an RDBMS (read MySQL). I need this structured data, as it would help me with my faceting and other needs. What is the best practice for indexing in this scenario?

Research Scientist - Information Retrieval at GE Global Research (Data Mining Lab)

2014-04-24 Thread Yavar Husain
to apply/connect. Cheers, Yavar Husain Lead Data Scientist - Text Mining Laboratory GE Research, Bangalore LinkedIn: http://www.linkedin.com/pub/yavar-husain/5/805/151 Text@ yavarhus...@gmail.com