SolrCloud Zookeeper disconnection/reconnection

2014-02-13 Thread lboutros
Dear all, we are currently using Solr 4.3.1 in production (with SolrCloud). We encounter much the same problem as described in this other old post: http://lucene.472066.n3.nabble.com/SolrCloud-CloudSolrServer-Zookeeper-disconnects-and-re-connects-with-heavy-memory-usage-consumption-td4026421.html

Re: filtering/faceting by a big list of IDs

2014-02-13 Thread Tri Cao
Hi Joel, Thanks a lot for the suggestion. After thinking more about this, I think I could skip the faceting count for now, and so just provide a filtering option without display how many items that would be there after filtering. After all, even Google Shopping product search doesn't display the facet

Algorithm for retrieving documents

2014-02-13 Thread Harshvardhan Ojha
Hi All, I have a question regarding retrieval of documents by lucene. I know lucene uses many files on disk to keep documents, each comprising fields in it, and uses many IR algorithms, and inverted index to match documents. My question is : 1. How lucene stores these documents inside file

Re: Algorithm for retrieving documents

2014-02-13 Thread Mikhail Khludnev
Hello I think you can start from http://www.lucenerevolution.org/2013/What-is-in-a-lucene-index On Thu, Feb 13, 2014 at 12:56 PM, Harshvardhan Ojha ojha.harshvard...@gmail.com wrote: Hi All, I have a question regarding retrieval of documents by lucene. I know lucene uses many files on

Re: Algorithm for retrieving documents

2014-02-13 Thread Harshvardhan Ojha
Hi Mikhail, Thanks for sharing this nice link. I am pretty comfortable with searching of lucene and this is very beginner level question on storage, mainly Hashing part(storage and retrieval). Which DS(I don't know currently), is being used to keep and again calculate that hash to get document

Re: Algorithm for retrieving documents

2014-02-13 Thread Mikhail Khludnev
Harshvardhan, There is almost nothing like this in bare Lucene; the closest analogy is http://wiki.apache.org/solr/SolrCaching#documentCache On Thu, Feb 13, 2014 at 1:46 PM, Harshvardhan Ojha ojha.harshvard...@gmail.com wrote: Hi Mikhail, Thanks for sharing this nice link. I am pretty

Re: Facet optimization for facet.method=enum and exists case

2014-02-13 Thread Annette Newton
Hi Alexey, I would be very interested in your progress with this. Your use case seems to match ours, we found enum to be much quicker than fc particularly for multivalued fields. We found that fc caused memory issues and caused us to frequently lose nodes. We, like you, have no interest in the

Re: Join Scoring

2014-02-13 Thread Michael McCandless
I suspect (not certain) one reason for the performance difference with Solr vs Lucene joins is that Solr operates on a top-level reader? This results in fast joins, but it means whenever you open a new reader (NRT reader) there is a high cost to regenerate the top-level data structures. But if

Re: Algorithm for retrieving documents

2014-02-13 Thread Harshvardhan Ojha
Hi Mikhail, Don't you think the contains(BytesRef value) method in org.apache.lucene.codecs.bloom.FuzzySet.java returns the probability of having a field, and that it is a place where we are using hashing? Is there any other place in the source which, when given a document id, could determine by calculating

Re: Join Scoring

2014-02-13 Thread anand chandak
Thanks Mike, that surely helps to clarify the difference. On a related note, if we have to provide scoring support for the Solr join, instead of using the Lucene join, what would be the best way to do that? There's one suggestion that David gave below: build a custom QParser and call Lucene's JOIN

Re: Need help with delta import

2014-02-13 Thread thammegowda
I was having similar problem with delta import. I am using solr 4.6 and making use of ${dih.last_index_time}, ${dih.delta.xxx} shorter variable names. I think the issue in previously discussed posts in the thread lies in deltaQuery and deltaImportQuery. if deltaQuery=select *rowId* from

Re: APACHE SOLR: Pass a file as query parameter and then parse each line to form a criteria

2014-02-13 Thread Roman Chyla
Hi Rajeev, You can take this: http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201307.mbox/%3CCAEN8dyX_Am_v4f=5614eu35fnhb5h7dzkmkzdfwvrrm1xpq...@mail.gmail.com%3E I haven't created the jira yet, but I have improved the plugin. Recently, I have seen a use case of passing 90K identifiers

Re: filtering/faceting by a big list of IDs

2014-02-13 Thread Roman Chyla
Hi Tri, Look at this: http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201307.mbox/%3CCAEN8dyX_Am_v4f=5614eu35fnhb5h7dzkmkzdfwvrrm1xpq...@mail.gmail.com%3E Roman On 13 Feb 2014 03:39, Tri Cao tm...@me.com wrote: Hi Joel, Thanks a lot for the suggestion. After thinking more about

block join and atomic updates

2014-02-13 Thread mm
Hello, I'm using block join to store nested documents with a huge number of children. I want to update some fields in the parent document using atomic updates, because I don't want to re-index all the child documents again. So, as far as I understood atomic updates, solr is reindexing the

Re: block join and atomic updates

2014-02-13 Thread Yonik Seeley
On Thu, Feb 13, 2014 at 8:25 AM, m...@preselect-media.com wrote: Is there any workaround to perform atomic updates on blocks or do I have to re-index the parent document and all its children always again if I want to update a field? The latter, unfortunately. -Yonik http://heliosearch.org -

Re: block join and atomic updates

2014-02-13 Thread mm
Yonik Seeley yo...@heliosearch.com: On Thu, Feb 13, 2014 at 8:25 AM, m...@preselect-media.com wrote: Is there any workaround to perform atomic updates on blocks or do I have to re-index the parent document and all its children always again if I want to update a field? The latter,

Multiple Column Condition with Relevance/Rank

2014-02-13 Thread EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions)
Hello, Someone can help me on implementing the below query in Solr, I will using a rank in MS SQL and a return distinct Productid Select productid from products where SKU = 101 Select Productid from products where ManufactureSKU = 101 Select Productid from product where SKU Like 101% Select

Re: Multiple Column Condition with Relevance/Rank

2014-02-13 Thread Jack Krupansky
Use the OR operator between the specific clauses. -- Jack Krupansky -Original Message- From: EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions) Sent: Thursday, February 13, 2014 9:09 AM To: solr-user@lucene.apache.org Subject: Multiple Column Condition with Relevance/Rank
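The advice above (OR the specific clauses together) can be sketched in Python as a small query-string builder. This is only an illustration under stated assumptions: the field names SKU and ManufactureSKU are taken from the SQL in the question, the boost values are hypothetical stand-ins for the "rank", the helper name build_ranked_query is made up, and no escaping of special query characters is attempted.

```python
def build_ranked_query(term):
    """Join one clause per SQL condition with OR, boosting exact matches.

    Field names come from the question; the boost values (100, 50, 10)
    are hypothetical and only express a relative ranking.
    """
    clauses = [
        ("SKU", "{t}", 100),            # exact SKU match ranks highest
        ("ManufactureSKU", "{t}", 50),  # exact manufacturer SKU match
        ("SKU", "{t}*", 10),            # prefix match, like SQL "LIKE 101%"
    ]
    parts = []
    for field, pattern, boost in clauses:
        parts.append("{}:{}^{}".format(field, pattern.format(t=term), boost))
    return " OR ".join(parts)


print(build_ranked_query("101"))
# SKU:101^100 OR ManufactureSKU:101^50 OR SKU:101*^10
```

The resulting string would be passed as the q parameter. A document matching several clauses is returned once, which roughly corresponds to the "distinct Productid" requirement in the question.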

ORA-04030: out of process memory when trying to allocate 4032 bytes - Please advise

2014-02-13 Thread Murugan, Muniraja (CTR) Offshore
Dear All, I am getting ORA-04030: out of process memory when trying to allocate 4032 bytes error when I try to index xmltype data from Oracle DB. I have registered in Solr support the ID is SOLR-5723 I have xmltype data in my oracle DB. I am converting xmltype to clob and to xml string using

Reviewers required for Apache Solr Beginner’s Guide!!

2014-02-13 Thread leon715
Hi guys, I'm looking for reviewers for our newly published book titled, ‘Apache Solr Beginner’s Guide’ I'll be happy to provide a free eBook copy to anyone interested in writing a review for this book specifically on their Blog/Website, Amazon, Goodreads, Dzone within 2-3 weeks after receiving

Highlight span queries

2014-02-13 Thread Puneet Pawaia
Hi I am using SOLR 4.6. I need to highlight span queries. But when I fire a span query, there are no highlight snippets. Is there any highlighter out there that can be used? Regards Puneet

Re: ORA-04030: out of process memory when trying to allocate 4032 bytes - Please advise

2014-02-13 Thread Shawn Heisey
On 2/13/2014 9:20 AM, Murugan, Muniraja (CTR) Offshore wrote: I am getting ORA-04030: out of process memory when trying to allocate 4032 bytes error when I try to index xmltype data from Oracle DB. I have registered in Solr support the ID is SOLR-5723 The issue tracker is for bugs and

facet problem

2014-02-13 Thread Kishan Parmar
Hi, when I query status:verified on my core and add a facet, I get results, but the facet output is wrong: it shows verifi, not verified. Why is this happening? Any suggestions? Regards, Kishan Parmar Software Developer +91 95 100 77394 Jay Shree Krishnaa !!

Re: facet problem

2014-02-13 Thread Saumitra Srivastav
Change the type of the 'status' field to 'string'. At present it must be 'text_en', and hence it is getting tokenized while indexing. After changing the type in schema.xml you will have to re-index the documents to see the expected facet response. -- View this message in context:

Re: facet problem

2014-02-13 Thread Ahmet Arslan
Hi Kishan, Facets are generated from indexed values. You probably have a stemmer in your analysis chain, so 'verified' is indexed as 'verifi'. Remove the stemming filter factory from your field type definition. Ahmet On Thursday, February 13, 2014 8:53 PM, Kishan Parmar kishan@gmail.com
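The behaviour described in this thread can be illustrated with a toy model in Python. It is a sketch, not Solr code: toy_stem is a crude hypothetical stand-in for a real stemming filter (it only strips a trailing "ed"), and the "text" and "string" labels only mimic an analyzed versus a non-analyzed field type.

```python
from collections import Counter


def toy_stem(token):
    # Crude stand-in for a real stemmer: only strips a trailing "ed".
    return token[:-2] if token.endswith("ed") else token


def index_terms(value, field_type):
    # A "string" field keeps the whole value as one untouched term.
    if field_type == "string":
        return [value]
    # A "text"-like field is lowercased, split on whitespace, and stemmed.
    return [toy_stem(tok) for tok in value.lower().split()]


def facet_counts(values, field_type):
    # Facets count the indexed terms, not the original stored values.
    counts = Counter()
    for value in values:
        counts.update(index_terms(value, field_type))
    return dict(counts)


statuses = ["verified", "verified", "pending"]
print(facet_counts(statuses, "text"))    # {'verifi': 2, 'pending': 1}
print(facet_counts(statuses, "string"))  # {'verified': 2, 'pending': 1}
```

Because the counts are computed over indexed terms, the analyzed field reports the stemmed form, while the non-analyzed field reports the value exactly as it was sent.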

Re: facet problem

2014-02-13 Thread Kishan Parmar
thanks for help.. its working Regards, Kishan Parmar Software Developer +91 95 100 77394 Jay Shree Krishnaa !! On Fri, Feb 14, 2014 at 12:36 AM, Saumitra Srivastav saumitra.srivast...@gmail.com wrote: Change the type of 'status' field to 'string'. At present it must be 'text_en' and hence

SolrJ Socket Leak

2014-02-13 Thread Jared Rodriguez
I am using solr/solrj 4.6.1 along with the apache httpclient 4.3.2 as part of a web application which connects to the solr server via solrj using CloudSolrServer(); The web application is wired up with Guice, and there is a single instance of the CloudSolrServer class used by all inbound

Re: change character correspondence in icu lib

2014-02-13 Thread alxsss
I found out that the generated files are the same. I think this is because of these lines inside the build file: target name=gen-utr30-data-files depends=compile-tools java classname=org.apache.lucene.analysis.icu.GenerateUTR30DataFiles dir=${utr30.data.dir} fork=true

Re: SolrJ Socket Leak

2014-02-13 Thread Shawn Heisey
On 2/13/2014 1:38 PM, Jared Rodriguez wrote: I am using solr/solrj 4.6.1 along with the apache httpclient 4.3.2 as part of a web application which connects to the solr server via solrj using CloudSolrServer(); The web application is wired up with Guice, and there is a single instance of the

RE: Facet optimization for facet.method=enum and exists case

2014-02-13 Thread Alexey Kozhemiakin
Hi Annette, You might want to find initial version of patch attached https://issues.apache.org/jira/browse/SOLR-5725 I'd be happy to find out performance improvement on your setup, let me know if you need help with patching your version of solr. -- Alexey -Original Message- From:

Re: java.lang.IllegalArgumentException when using SolrJ CloudSolrServer

2014-02-13 Thread jfeist
That did fix my issue, thanks so much. -- View this message in context: http://lucene.472066.n3.nabble.com/java-lang-IllegalArgumentException-when-using-SolrJ-CloudSolrServer-tp4116585p4117279.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: SolrJ Socket Leak

2014-02-13 Thread Shawn Heisey
On 2/13/2014 3:17 PM, Jared Rodriguez wrote: I just regressed to Solrj 4.6.1 with http client 4.2.6 and am trying to reproduce the problem. Using YourKit to profile and even just manually simulating a few users at once, I see the same problem of open sockets. 6 sockets opened to the solr

Unloading a SolrCloud core in 4.6.0

2014-02-13 Thread Lajos
Hi all, I just want to verify that it is no longer possible to unload a Cloud core via the Core API UNLOAD command, correct? I had two situations: one where I wanted to remove old replicas in a node that I was deactivating (and I had already created new replicas) and one where I needed to

Solr dataimporthandler use storedProcedure in deltaImportQuery

2014-02-13 Thread Jay Potharaju
I was wondering if it's possible to call a storedProcedure in the deltaImportQuery. This is what I'm trying to do. entity name=entity1 transformer=RegexTransformer pk=id query=SELECT * FROM table1 INNER JOIN tabl2 ON table2.tbl1Id = table1.id

Re: Solr dataimporthandler use storedProcedure in deltaImportQuery

2014-02-13 Thread Ahmet Arslan
I think it is not possible to use stored procedures in DIH. Please see: https://issues.apache.org/jira/browse/SOLR-1262 On Friday, February 14, 2014 3:02 AM, Jay Potharaju jayasunde...@gmail.com wrote: I was wondering if it's possible to call a storedProcedure in the deltaImportQuery. This is

Re: Unloading a SolrCloud core in 4.6.0

2014-02-13 Thread Joel Bernstein
Lajos, Just did a quick test on 4.6.1 UNLOADing a SolrCloud replica using the core admin API and it worked as expected. I'm running with the core.properties setup. Are you running with an old style solr.xml? The org.apache.solr.core.CorePropertiesLocator.delete() method in trunk and 4x are

Re: Solr dataimporthandler use storedProcedure in deltaImportQuery

2014-02-13 Thread Shawn Heisey
On 2/13/2014 5:53 PM, Jay Potharaju wrote: I was wondering if it's possible to call a storedProcedure in the deltaImportQuery. This is what I'm trying to do. entity name=entity1 transformer=RegexTransformer pk=id query=SELECT * FROM table1 INNER JOIN tabl2 ON