Re: How fast indexing?

2016-03-21 Thread Amit Jha
AJ > On 22-Mar-2016, at 05:32, Shawn Heisey <apa...@elyograg.org> wrote: > >> On 3/20/2016 6:11 PM, Amit Jha wrote: >> In my case I am using DIH to index the data, and the query has 2 join >> statements. Indexing 70K documents takes 3-4 hours. Document size

Re: How fast indexing?

2016-03-21 Thread Amit Jha
Yes, I do have multiple nodes in my SolrCloud setup. Rgds AJ > On 21-Mar-2016, at 22:20, fabigol <fabien.stou...@vialtis.com> wrote: > > Amit Jha, > do you have several Solr servers with SolrCloud?

Re: How fast indexing?

2016-03-20 Thread Amit Jha
Hi All, In my case I am using DIH to index the data, and the query has 2 join statements. Indexing 70K documents takes 3-4 hours. Document size would be around 10-20KB. DB is MSSQL and I am using solr4.2.10 in cloud mode. Rgds AJ > On 21-Mar-2016, at 05:23, Erick Erickson
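For a sense of scale, the figures in the post above work out to only a few documents per second. The following is a rough back-of-the-envelope sketch in Python that uses only the numbers quoted in the post (70K documents, 3-4 hours, 10-20KB per document); the function name is purely illustrative, and 1 MB is taken as 1000 KB.

```python
def indexing_rate(docs, hours):
    """Return the average number of documents indexed per second."""
    return docs / (hours * 3600)

DOCS = 70_000

slow = indexing_rate(DOCS, 4)   # 4-hour run
fast = indexing_rate(DOCS, 3)   # 3-hour run

# Total payload, assuming 10-20 KB per document and 1 MB = 1000 KB.
total_mb_low = DOCS * 10 / 1000
total_mb_high = DOCS * 20 / 1000

print(f"{slow:.1f}-{fast:.1f} docs/sec")                  # 4.9-6.5 docs/sec
print(f"{total_mb_low:.0f}-{total_mb_high:.0f} MB total")  # 700-1400 MB total
```

A rate this low for roughly 1 GB of data suggests the time is probably going into the SQL side (the joins) rather than into indexing itself, though the post does not give enough detail to confirm that.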

SolrCloud Document Update Problem

2015-06-29 Thread Amit Jha
Hi, I set up a SolrCloud with 2 shards, each having 2 replicas, with a 3-node ZooKeeper ensemble. We add and update documents from a web app. While updating, we delete the document and add the same document with updated values and the same unique id. I am facing a very strange issue that sometimes 2 documents

Re: SolrCloud Document Update Problem

2015-06-29 Thread Amit Jha
It was because of the issues Rgds AJ On Jun 29, 2015, at 6:52 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: On Mon, Jun 29, 2015 at 4:37 PM, Amit Jha shanuu@gmail.com wrote: Hi, I set up a SolrCloud with 2 shards, each having 2 replicas, with a 3-node ZooKeeper ensemble. We

Real Time indexing and Scalability

2015-06-05 Thread Amit Jha
Hi, In my use case, I am adding a document to Solr through a Spring application using spring-data-solr. This setup works well with a single Solr instance. In the current setup it is a single point of failure. So we decided to use Solr replication because we also need centralized search. Therefore we set up two

Re: Real Time indexing and Scalability

2015-06-05 Thread Amit Jha
I want to have real-time indexing and real-time search. Rgds AJ On Jun 5, 2015, at 10:12 PM, Amit Jha shanuu@gmail.com wrote: Hi, In my use case, I am adding a document to Solr through a Spring application using spring-data-solr. This setup works well with a single Solr instance. In the current setup

Re: Real Time indexing and Scalability

2015-06-05 Thread Amit Jha
you won't see docs on the slave until after the polling interval is expired and the index is replicated. 2 In SolrCloud you aren't committing appropriately. You might review: http://wiki.apache.org/solr/UsingMailingLists Best, Erick On Fri, Jun 5, 2015 at 9:45 AM, Amit Jha shanuu

Re: Real Time indexing and Scalability

2015-06-05 Thread Amit Jha
on both. If I set up replication between 2 servers and configure both as repeaters, then both can act as master and slave for each other. Therefore writing can be done on both. Rgds AJ On Jun 6, 2015, at 1:26 AM, Shawn Heisey apa...@elyograg.org wrote: On 6/5/2015 1:38 PM, Amit Jha wrote: Thanks Eric

Re: Real Time indexing and Scalability

2015-06-05 Thread Amit Jha
are only ever indexing to the master on DC1. Best, Erick On Fri, Jun 5, 2015 at 1:20 PM, Amit Jha shanuu@gmail.com wrote: Thanks Shawn for the reminder about CloudSolrServer; yes, I have moved to SolrCloud. I agree that a repeater is a slave and acts as master for other slaves. But still it's

Re: Retrieving Phonetic Code as result

2015-01-23 Thread Amit Jha
/solr-core/org/apache/solr/handler/FieldAnalysisRequestHandler.html and in solrconfig.xml -- Jack Krupansky On Thu, Jan 22, 2015 at 8:42 AM, Amit Jha shanuu@gmail.com wrote: Hi, I need to know how I can retrieve phonetic codes. Does Solr provide them as part of the result? I need the codes

Retrieving Phonetic Code as result

2015-01-22 Thread Amit Jha
Hi, I need to know how I can retrieve phonetic codes. Does Solr provide them as part of the result? I need the codes for record matching. *The following is the schema fragment:* <fieldtype name="phonetic" stored="true" indexed="true" class="solr.TextField"> <analyzer type="index"> <tokenizer

Re: Retrieving Phonetic Code as result

2015-01-22 Thread Amit Jha
Hi, I need to know how I can retrieve phonetic codes. Does Solr provide them as part of the result? I need the codes for record matching. *The following is the schema fragment:* <fieldtype name="phonetic" stored="true" indexed="true" class="solr.TextField"> <analyzer type="index"> <tokenizer

Re: Retrieving Phonetic Code as result

2015-01-22 Thread Amit Jha
it, why can't solr On Thu, Jan 22, 2015 at 7:54 PM, Amit Jha shanuu@gmail.com wrote: Hi, I need to know how I can retrieve phonetic codes. Does Solr provide them as part of the result? I need the codes for record matching. *The following is the schema fragment:* <fieldtype name="phonetic" stored="true" indexed

Re: De Duplication using Solr

2015-01-03 Thread Amit Jha
/confluence/display/solr/De-Duplication -- Jack Krupansky On Sat, Jan 3, 2015 at 2:54 AM, Amit Jha shanuu@gmail.com wrote: I am trying to find duplicate records based on distance and phonetic algorithms. Can I use Solr for that? I have the following fields and conditions to identify

De Duplication using Solr

2015-01-02 Thread Amit Jha
I am trying to find duplicate records based on distance and phonetic algorithms. Can I use Solr for that? I have the following fields and conditions to identify exact or possible duplicates. 1. Fields: prefix, suffix, firstname, lastname, email (primary_email1, email2, email3), phone (primary_phone1,

Re: different fields for user-supplied phrases in edismax

2014-12-12 Thread Amit Jha
Hi Mike, What exactly is your use case? What do you mean by controlling the fields used for phrase queries? Rgds AJ On 12-Dec-2014, at 20:11, Michael Sokolov msoko...@safaribooksonline.com wrote: Doug - I believe pf controls the fields that are used for the phrase queries *generated by

Re: Fault Tolerant Technique of Solr Cloud

2014-02-18 Thread Amit Jha
Solr will complain only if you bring down both the replica and the leader of the same shard. It would be difficult to have a highly available environment if you have fewer physical servers. Rgds AJ On 18-Feb-2014, at 18:35, Vineet Mishra clearmido...@gmail.com wrote: Hi All, I want to have clear

SolrCloud Cluster Setup - Shard Replica

2014-01-18 Thread Amit Jha
Hi, I tried to create a 2-shard cluster with a replica of each shard for a collection. For this setup I used two physical machines. I installed 1 shard and 1 replica on Machine A and another 1 shard and 1 replica on Machine B. Now when I stop both the shard and the replica on Machine B, I was not

Index size - to determine storage

2014-01-09 Thread Amit Jha
Hi, I would like to know: if I index a file, e.g. a PDF of 100KB, what would be the size of the index? What factors should be considered to determine the disk size? Rgds AJ

Re: DateField - Invalid JSON String Exception - converting Query Response to JSON Object

2014-01-07 Thread Amit Jha
I am using it. But the timestamp, which has ':' in it, causes the issue. Please help. On Tue, Jan 7, 2014 at 11:46 AM, Ahmet Arslan iori...@yahoo.com wrote: Hi Amit, If you want a JSON response, why don't you use wt=json? Ahmet On Tuesday, January 7, 2014 7:34 AM, Amit Jha shanuu@gmail.com

Re: DateField - Invalid JSON String Exception - converting Query Response to JSON Object

2014-01-07 Thread Amit Jha
Hey Hoss, Thanks for replying. Here is the response generated by SolrJ. *SolrJ Response*: ignore the braces, as I have copied it from a big chunk. Response: {responseHeader={status=0,QTime=0,params={lowercaseOperators=true,sort=score

DateField - Invalid JSON String Exception - converting Query Response to JSON Object

2014-01-06 Thread Amit Jha
Hi, Wish You All a Very Happy New Year. We have an index where a date field has the default value 'NOW'. We are using SolrJ to query Solr, and when we try to convert the query response (response.getResponse) to a JSON object in Java, the JSON API (org.json) throws an 'invalid json string' exception. The API says

Re: DateField - Invalid JSON String Exception - converting Query Response to JSON Object

2014-01-06 Thread Amit Jha
AM, Amit Jha shanuu@gmail.com wrote: Hi, Wish You All a Very Happy New Year. We have an index where a date field has the default value 'NOW'. We are using SolrJ to query Solr, and when we try to convert the query response (response.getResponse) to a JSON object in Java, the JSON API (org.json

Re: Committing when indexing in parallel

2013-09-14 Thread Amit Jha
Hi, As far as I know, any number of requests can be issued in parallel to index documents. Any commit request will write them to the index. So if P1 issues a commit, then all documents of P2 that are eligible also get committed, and the remaining documents will get committed on another commit
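The behaviour described in the post above (a commit issued by one client also makes the other client's pending documents visible) can be pictured with a toy model. This is only an illustrative Python sketch: `ToyIndex`, its methods, and the client and document names are all made up for the example and are not Solr or SolrJ APIs.

```python
class ToyIndex:
    """Toy stand-in for a shared index; not Solr's API."""

    def __init__(self):
        self.pending = []      # documents added but not yet committed
        self.searchable = []   # documents visible to searches

    def add(self, client, doc_id):
        self.pending.append((client, doc_id))

    def commit(self):
        # A commit is index-wide: it flushes everything pending,
        # no matter which client added it.
        self.searchable.extend(self.pending)
        self.pending.clear()


index = ToyIndex()
index.add("P1", "a1")
index.add("P2", "b1")
index.commit()           # issued by P1
index.add("P2", "b2")    # arrives after the commit

visible = [doc_id for _, doc_id in index.searchable]
waiting = [doc_id for _, doc_id in index.pending]
print(visible)  # ['a1', 'b1']
print(waiting)  # ['b2']
```

Here P2's document b1 becomes searchable through P1's commit, while b2 waits for the next commit from any client.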

Re: MySQL Data import handler

2013-09-14 Thread Amit Jha
Hi Baskar, Just create a single schema.xml that contains the required fields from the 3 tables. Add a status column to the child table, e.g. 1 = add, 2 = update, 3 = delete, 4 = indexed, etc. Write a program using SolrJ that reads the status and acts accordingly. Rgds AJ On 15-Sep-2013,
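The status-column idea above can be sketched as a small dispatch routine. The post suggests SolrJ; the version below is plain Python to keep it self-contained, and it only decides which operation each row needs rather than talking to Solr. The status codes come from the post, while the row layout, the function name, and the action labels are assumptions made for illustration.

```python
# Status codes from the post; row layout and action names are hypothetical.
STATUS_ADD, STATUS_UPDATE, STATUS_DELETE, STATUS_INDEXED = 1, 2, 3, 4


def plan_actions(rows):
    """Map (doc_id, status) rows to the index operation each one needs."""
    actions = []
    for doc_id, status in rows:
        if status in (STATUS_ADD, STATUS_UPDATE):
            # Re-sending a document with the same unique id replaces the old one.
            actions.append(("index", doc_id))
        elif status == STATUS_DELETE:
            actions.append(("delete", doc_id))
        elif status == STATUS_INDEXED:
            continue  # already in sync, nothing to do
        else:
            raise ValueError(f"unknown status {status} for {doc_id}")
    return actions


rows = [("p1", 1), ("p2", 4), ("p3", 3), ("p4", 2)]
actions = plan_actions(rows)
print(actions)  # [('index', 'p1'), ('delete', 'p3'), ('index', 'p4')]
```

After each operation succeeds, the program would presumably set the row's status to 4 (indexed) so that the next run skips it.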

Re: Solr Java Client

2013-09-14 Thread Amit Jha
Add a field called source in schema.xml; its value would be your table name. Rgds AJ On 15-Sep-2013, at 5:38, Baskar Sikkayan baskar@gmail.com wrote: Hi, I am new to Solr and trying to use the Solr Java client instead of using the Data handler. Is there any configuration I need to do

Re: Solr Java Client

2013-09-14 Thread Amit Jha
The question is not clear to me. Please elaborate. Why do you want to store the index in DB tables? Rgds AJ On 15-Sep-2013, at 7:20, Baskar Sikkayan baskar@gmail.com wrote: How to add index to 3 diff tables from java ... On Sun, Sep 15, 2013 at 6:49 AM, Amit Jha shanuu

Re: Combining Solr score with customized user ratings for a document

2013-09-10 Thread Amit Jha
You can use a DB for storing user preferences, and later, if you want, you can flush them to Solr as an update along with the userid. Or you may add a result pipeline filter. Rgds AJ On 13-Feb-2013, at 17:50, Á_o chachime...@yahoo.es wrote: Hi: I am working on a project where we want to

Re: More on topic of Meta-search/Federated Search with Solr

2013-08-26 Thread Amit Jha
Hi, I would suggest the following. 1. Create custom search connectors for each individual source. 2. Each connector will be responsible for querying its source of any type (web, gateways, etc.), getting the results, and writing the top N results to Solr. 3. Query Solr with the same keyword and display the

Re: Benefits of Solr over Lucene?

2013-02-12 Thread Amit Jha
Adding to Jack's reply, Solr can also be embedded into the application and run in the same process. Solr is the server-ization of Lucene. The line is very blurred, and Solr is not a very thin wrapper around the Lucene library. Most Solr features are distinct from Lucene, like - detailed breakdown of

Re: Disable term frequency for some fields in solr

2013-01-16 Thread Amit Jha
Hi, How can I do this in Solr 4? Amit On Thu, Dec 6, 2012 at 1:40 PM, Markus Jelsma markus.jel...@openindex.io wrote: custom similarity for that field that returns 1 for

Re: Disable term frequency for some fields in solr

2013-01-16 Thread Amit Jha
I have done the same thing in Solr 3.6 and it works, but in Solr 3.6 field-level similarity is not available. Solr 4 has similarity factories, so I could not work out how to do it in Solr 4. Which class do I need to extend to move ahead? On Wed, Jan 16, 2013 at 4:44 PM, Upayavira u...@odoko.co.uk wrote:

Re: Search strategy - improving search quality for short search terms such as doll

2013-01-16 Thread Amit Jha
It's all about the data set, by which I mean the index. If you have documents containing toy and doll, it will return them in the result set. What I understood is that you are talking about the context of the query. For example, if you search books on MK Gandhi and books by MK Gandhi, both queries have

Re: Priorities on fields

2013-01-16 Thread Amit Jha
Boost query and boost function will suffice for your purpose. Rgds AJ On 16-Jan-2013, at 17:20, Dariusz Borowski darius...@gmail.com wrote: Hi, Is it possible to define priorities on fields? Let's say I have a product table which has the following fields: - id - title - description -

Re: Disable term frequency for some fields in solr

2013-01-16 Thread Amit Jha
Please correct my understanding: use one of the factories as the global similarity, extend org.apache.lucene.search.similarities.DefaultSimilarity to create a custom similarity, and add a similarity tag in the field type definition for the required fields. Or is there some other way to do that? Rgds AJ On

Re: Disable term frequency for some fields in solr

2013-01-16 Thread Amit Jha
It will affect the phrase queries. That is why I am not using the suggested configuration. On Thu, Jan 17, 2013 at 7:20 AM, Chris Hostetter hossman_luc...@fucit.org wrote: : Or is there some other way to do that? I'm late to this thread, but what was wrong with the simple suggestion of

Re: Incremental Update of index

2012-12-05 Thread Amit Jha
Thanks Sandeep, How can it be done when using a database, given that the database has all the records: old, new and updated? On Wed, Dec 5, 2012 at 11:47 PM, Sandeep Mestry sanmes...@gmail.com wrote: Hi Amit/Shanu, You can create the solr document for only the updated record and index it to ensure only

Re: distributed search

2012-06-22 Thread Amit Jha
Ashutosh, Do you want to import data into Solr? Please explain the use case. How are you performing a search in the current scenario? And what is expected from Solr? Rgds AJ On 22-Jun-2012, at 15:09, Ashutosh Puspwan ashu.pusp...@gmail.com wrote: Dear Sir/Mam I am a beginner in Apache Solr. I

Re: How can I optimize Sorting on multiple text fields

2012-06-22 Thread Amit Jha
On 22-Jun-2012, at 11:30, Alok Bhandari alokomprakashbhand...@gmail.com wrote: Hello, the requirement I have is that on the Solr side we have indexed data of multiple customers, and for each customer we have at least a million documents. After executing a search, the end user wants to sort on some

Re: Multicore solr

2012-05-23 Thread Amit Jha
is indexed (with a keyword tokenizer) and everything else only stored? Also, are you sure that Solr is the best option as a key-value store? Jens On 05/23/2012 04:34 AM, Amit Jha wrote: Hi, Thanks for your advice. It is basically a meta search application. Users can perform a search on N

Re: Multicore solr

2012-05-22 Thread Amit Jha
Hi, Thanks for your advice. It is basically a meta search application. Users can perform a search on N number of data sources at a time. We broadcast a parallel search to each selected data source and write the data to Solr using a custom-built API (the API and Solr are deployed on separate machines API