Re: Solr design. Choose Cores or Shards?

2013-08-08 Thread Dhananjay Makwana
Thanks for the explanation! On 8/8/13 4:52 AM, Shawn Heisey wrote: On 8/6/2013 8:49 PM, manju16832003 wrote: My confusion is whether it is feasible to choose many cores or to use shards. I do not have much experience with how shards work or what they are used for. I would like to hear your suggestions :-)

Document generation from database and partially from other source for the same item

2013-08-08 Thread payalsharma
Hi all, We have a requirement on our e-commerce site: a keyword string is required for each item, but only for searching. Since the keywords will be long and used only for searching, we just want them indexed and do not need them persisted in the DB. The keywords will be in the spreadsheet

Re: Document generation from database and partially from other source for the same item

2013-08-08 Thread Raymond Wiker
Assuming that you're doing this in a Windows environment, you could define your spreadsheet as an ODBC data source and define a datasource for it in DIH. Then, you would extract the main documents from your database, and the keywords from the ODBC datasource layered on top of your spreadsheet. No

First Indexing a postgres database

2013-08-08 Thread geoport
Hi, I am using Solr for the first time and I have a lot of problems. First I installed Tomcat 7, then I copied solr.war (originally solr-4.2.1.war) into the webapps directory of Tomcat. The second step was to copy example\solr into a custom directory, which I called c:\solr. The

Re: [POLL] Who how does use admin-extra ?

2013-08-08 Thread Bernd Fehling
I have a table of links to all my servers running SOLR, so I can jump from one admin page to any other server's admin page. There is also a link to my monitoring server. Not very innovative, but better than an empty page. So, yes, I'm using it. Regards, Bernd Am 08.08.2013 00:24, schrieb Stefan Matheis:

JSON Update create different copies of the same document

2013-08-08 Thread Bruno René Santos
Hello, I thought that by adding a new document with the same id on Solr the document already on Solr would be updated with the new info. But both documents appear on the search results... How can I update a document? Regards Bruno Santos -- Bruno René Santos Lisboa - Portugal

Re: JSON Update create different copies of the same document

2013-08-08 Thread Rafał Kuć
Hello! Do you have the unique identifier specified in the schema.xml ? -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - ElasticSearch Hello, I thought that by adding a new document with the same id on Solr the document already on Solr would be updated with the

Problems installing Solr4 in Jetty9

2013-08-08 Thread Spadez
I've been unable to install SOLR into Jetty. Jetty seems to be running fine, and these are the steps I took to install Solr:
# SOLR
cd /opt
wget -O - $SOLR_URL | tar -xzf -
cp solr-4.4.0/dist/solr-4.4.0.war /opt/jetty/webapps/solr.war
cp -R solr-4.4.0/example/solr /opt/
cp -R /opt/solr-4.4.0/dist/

Re: Problems installing Solr4 in Jetty9

2013-08-08 Thread Rafał Kuć
Hello! Could you look at the logs and post what you find there? -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - ElasticSearch I've been unable to install SOLR into Jetty. Jetty seems to be running fine, and this is the steps I took to install solr: # SOLR cd

Re: JSON Update create different copies of the same document

2013-08-08 Thread Bruno René Santos
Yes, id. I was just reading, and maybe this happens because I do not send the _version_ with the document? I tried sending it with a value I chose myself (_version_=342342342) while Solr was empty of documents, and got an HTTP 409 response with the reason Conflict. Any ideas? Regards Bruno On Thu, Aug

Solr 4.4 Default shard

2013-08-08 Thread Prasi S
I have set up Solr 4.4 with cloud and have created two cores, mycore_shard1 and mycore_shard2. I have a few questions here: 1. Once the setup is ready, I can see a default collection collection with :shard1 in the admin - cloud page. How do I remove it? I have deleted the core.properties file in the

DocValues for byte[] ... or a common codec for selected fields

2013-08-08 Thread Mathias Lux
Hi all! First of all: Solr is an amazing project. Big thanks to the community! I really appreciate the stability, and especially the pre-configured jetty example ;) And now for the question: I'm currently on my way to writing a RequestHandler for Solr that deals with content based image search

Re: Problems installing Solr4 in Jetty9

2013-08-08 Thread Spadez
Apparently this is the error: 2013-08-08 09:35:19.994:WARN:oejw.WebAppContext:main: Failed startup of context o.e.j.w.WebAppContext@64a20878{/solr,file:/tmp/jetty-0.0.0.0-8080-solr.war-_solr-any-/webapp/,STARTING}{/solr.war} org.apache.solr.common.SolrException: Could not find necessary SLF4j

Re: Solr 4.4 Default shard

2013-08-08 Thread Anshum Gupta
Hi Prasi, I'd highly recommend you to go through the SolrCloud wiki here: http://wiki.apache.org/solr/SolrCloud . When it comes to SolrCloud, you need to read about collections before you go any further. I don't know anything about your use case so I'm guessing you just probably are trying to

Re: Solr 4.4 Default shard

2013-08-08 Thread Prasi S
Initially I created a single collection:
- java -classpath .;zoo-lib/* org.apache.solr.cloud.ZkCLI -cmd upconfig -zkhost localhost:2181 -confdir solr-conf -confname myconf1
- java -classpath .;zoo-lib/* org.apache.solr.cloud.ZkCLI -cmd linkconfig -zkhost 127.0.0.1:2181 -collection

Re: Transform data at index time: country - continent

2013-08-08 Thread Christian Köhler - ZFMK
Hi, I have thought about synonyms as well. But wouldn't this leave me with a field that contains both the original expression and additionally the continent? e.g. germany, continent-europe. I am not sure if this might get in the way at some point. On the other hand this would enable me to have

Re: Transform data at index time: country - continent

2013-08-08 Thread Christian Köhler - ZFMK
Hi, One interesting issue: These countries that span continents - Turkey and Russia and some of the former USSR Republics. I arbitrarily assigned them a single continent: // Note: Turkey is mapped to Asia, and Russia to Europe, // Azerbaijan to Asia, Armenia to Asia, Cyprus to Asia, //

RE: SOLR USING 100% percent CPU and not responding after a while

2013-08-08 Thread nitin4php
Hi Biva, Any luck on this? We are facing the same issue with exactly the same configuration and setup. Any input will help a lot. -- View this message in context: http://lucene.472066.n3.nabble.com/SOLR-USING-100-percent-CPU-and-not-responding-after-a-while-tp4021359p4083234.html Sent from

Handling categories( level one and two) based navigation

2013-08-08 Thread payalsharma
Hi All, Our web application (e-commerce) requires primary and secondary categories for items. Based on this requirement I have the following questions: 1) How are category and subcategory handled in Solr version 4.4? I have used apache-solr-1.3.0 previously, but facets have undergone many big

Solr - how do I index barcode

2013-08-08 Thread Mysurf Mail
I have a document that contains the following data: car { id: guid name: string sku: list<barcode> } Now, the barcodes don't have a pattern. A barcode can be either of the following: ABCD-EF34GD-JOHN ABCD-C08-YUVF I want to index my documents so that search for 1. ABCD will return

Storing A Taxonomy In A Separate Core For Query Expansion

2013-08-08 Thread Michael Delaney
Hi Guys, Our application contains two data sets: - Items - Item taxonomy What I'm trying to do is allow our admin team to maintain a taxonomy which allows items to be found more easily. For example: if someone searched for 'dog', and the taxonomy contained 'Spaniel' as a narrower form of dog,

Re: Solr 4.4 Default shard

2013-08-08 Thread Prasi S
This was only with Solr 4.4. I didn't face the issue in any other version. On Thu, Aug 8, 2013 at 4:23 PM, Prasi S prasi1...@gmail.com wrote: Initially i created a single collection, -java -classpath .;zoo-lib/* org.apache.solr.cloud.ZkCLI -cmd upconfig -zkhost localhost:2181 -confdir

Re: Solr - how do I index barcode

2013-08-08 Thread Mysurf Mail
Two notes: 1. My current query is similar to this: http://127.0.0.1:8983/solr/vault/select?q=ABCD&qf=Name+SKU&defType=edismax 2. I want it to be case insensitive. On Thu, Aug 8, 2013 at 2:52 PM, Mysurf Mail stammail...@gmail.com wrote: I have a document that contains the following data car {

RE: new field type - enum field

2013-08-08 Thread Elran Dvir
Hi all, Did anyone have a chance to look at the code? It's attached here: https://issues.apache.org/jira/browse/SOLR-5084 Thank you very much. -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Monday, July 29, 2013 2:15 PM To: solr-user@lucene.apache.org

Enabling DIH breaks Solr4.4

2013-08-08 Thread Spadez
Hi, I'm a bit stuck here. I had Solr 4.4 working without too many issues. I wanted to enable the DIH, so I first added these lines to solrconfig.xml: <lib dir="../contrib/dataimporthandler/lib" regex=".*\.jar" /> <lib dir="../dist/" regex="apache-solr-dataimporthandler-.*\.jar" /> Restarted and

Re: Enabling DIH breaks Solr4.4

2013-08-08 Thread Rafał Kuć
Hello! Try changing the lib directives to absolute paths. Looking at the exception: Caused by: java.lang.ClassNotFoundException: org.apache.solr.handler.dataimport.DataImportHandler it seems that the DataImportHandler class is not seen by the Solr class loader. -- Regards, Rafał Kuć Sematext

Re: Enabling DIH breaks Solr4.4

2013-08-08 Thread Raymond Wiker
I think the problem is that you have the wrong name for the jar file: you have apache-solr-dataimporthandler instead of simply solr-dataimporthandler. In my solrconfig.xml, I have <lib dir="../../../dist/" regex="solr-dataimporthandler-.*\.jar" /> --- which may or may not work for you.
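The jar-name mismatch described in this reply can be checked outside Solr. The following is a minimal sketch, assuming the jar in dist/ is named solr-dataimporthandler-4.4.0.jar (a hypothetical file name inferred from the 4.4.0 naming used elsewhere in this thread) and that the regex is matched against the whole file name:

```python
import re

# Hypothetical jar file name, assuming the Solr 4.4.0 dist/ naming scheme.
jar_name = "solr-dataimporthandler-4.4.0.jar"

# Pattern from the original solrconfig.xml and the corrected pattern.
old_pattern = r"apache-solr-dataimporthandler-.*\.jar"
new_pattern = r"solr-dataimporthandler-.*\.jar"


def matches(pattern: str, file_name: str) -> bool:
    """Return True when the whole file name matches the pattern."""
    return re.fullmatch(pattern, file_name) is not None


print("old pattern matches:", matches(old_pattern, jar_name))
print("new pattern matches:", matches(new_pattern, jar_name))
```

If the old pattern matches nothing, no jar is added to the class path, which would be consistent with the ClassNotFoundException reported earlier in the thread.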

Re: Enabling DIH breaks Solr4.4

2013-08-08 Thread Spadez
Thank you both so much for your help. The regex was indeed outdated. Everything works perfectly now! :) -- View this message in context: http://lucene.472066.n3.nabble.com/Enabling-DIH-breaks-Solr4-4-tp4083282p4083286.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Error loading class 'solr.ISOLatin1AccentFilterFactory'

2013-08-08 Thread Parul Gupta(Knimbus)
Hey ... It works for me! Thanks a lot!!! -- View this message in context: http://lucene.472066.n3.nabble.com/Error-loading-class-solr-ISOLatin1AccentFilterFactory-tp4083012p4083289.html Sent from the Solr - User mailing list archive at Nabble.com.

how to sort by frequency of values on a specific field?

2013-08-08 Thread Luca Incrocci
I'm working with Java and SolrJ on Eclipse. How can I sort the results of a SolrQuery by the frequency of values in a certain field? For example, when I search for the top n articles (docType=0) of a particular author I want to sort query results by frequency of values in the journal_facet field (type

Solr4.4 DIH Headache

2013-08-08 Thread Spadez
Hi, QUESTION 1 First things first, for the dataimport handler. Is it correct that when I visit it from the admin panel it takes me to this URL: http://x.com:8080/solr/#/collection1/dataimport//dataimport When I visit it on this page, it seems to load my config correctly in the right panel.

Re: how to sort by frequency of values on a specific field?

2013-08-08 Thread Jack Krupansky
Sounds like faceting on a field. Or, how is it not like faceting? -- Jack Krupansky -Original Message- From: Luca Incrocci Sent: Thursday, August 08, 2013 8:40 AM To: solr-user@lucene.apache.org Subject: how to sort by frequency of values on a specific field? I'm working with Java
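What "faceting on a field" computes can be sketched in a few lines: count how often each value of the field occurs among the matching documents and list the values by descending count. The documents and the helper name facet_counts below are hypothetical and only stand in for a result set; this is not SolrJ code.

```python
from collections import Counter

# Hypothetical result set: each dict stands in for one matching document.
docs = [
    {"id": "1", "journal_facet": "Nature"},
    {"id": "2", "journal_facet": "Science"},
    {"id": "3", "journal_facet": "Nature"},
    {"id": "4", "journal_facet": "Nature"},
    {"id": "5", "journal_facet": "Science"},
    {"id": "6", "journal_facet": "Cell"},
]


def facet_counts(documents, field):
    """Return (value, count) pairs for one field, most frequent first."""
    counter = Counter(doc[field] for doc in documents if field in doc)
    return counter.most_common()


for value, count in facet_counts(docs, "journal_facet"):
    print(value, count)
```

Note that this yields a ranked list of field values, not a re-ordering of the documents themselves, which may be the distinction the reply is asking about.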

Re: Solr4.4 DIH Headache

2013-08-08 Thread Raymond Wiker
On Aug 8, 2013, at 15:57 , Spadez james_will...@hotmail.com wrote: Hi, QUESTION 1 First things first, for the dataimport handler. Is it correct that when I visit it from the admin panel it takes me to this URL: http://x.com:8080/solr/#/collection1/dataimport//dataimport When I

Re: How to parse multivalued data into single valued fields?

2013-08-08 Thread eShard
OK, I have one index called Communities, built from an RSS feed. Each item in the feed has multiple titles (which are all the same for this feed). So the title needs to be cleaned up before it is put into the community index; let's call the field community_title. And then an UpdateProcessorChain needs to

Re: Suggest aka autocomplete request handler with solr 4.4

2013-08-08 Thread Vinícius
If correctlySpelled is true, then appl was found in the Solr index. In this case, maybe the EnglishMinimalStemFilterFactory filter in the text_general fieldType is messing up your suggestions. On 6 August 2013 15:33, Utkarsh Sengar utkarsh2...@gmail.com wrote: Jack/Chris, 1. This is my complete

Re: JSON Update create different copies of the same document

2013-08-08 Thread Yonik Seeley
On Thu, Aug 8, 2013 at 4:58 AM, Bruno René Santos brunor...@gmail.com wrote: I thought that by adding a new document with the same id on Solr the document already on Solr would be updated with the new info. Yes, this should be the case. But both documents appear on the search results... How

Re: JSON Update create different copies of the same document

2013-08-08 Thread Jack Krupansky
Either your uniqueKey field values are in fact unique for those separate documents, or you have overwrite=false on the input documents. -- Jack Krupansky -Original Message- From: Bruno René Santos Sent: Thursday, August 08, 2013 4:58 AM To: solr-user@lucene.apache.org Subject: JSON
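The two explanations offered in this reply can be illustrated with a toy model. This is a sketch only: ToyIndex is a hypothetical class that mimics the described behaviour (a new document replaces an existing one with the same uniqueKey unless overwrite is disabled); it is not how Solr stores documents.

```python
class ToyIndex:
    """Toy stand-in for an index with a uniqueKey field (assumed to be "id")."""

    def __init__(self, unique_key="id"):
        self.unique_key = unique_key
        self.docs = []

    def add(self, doc, overwrite=True):
        """Add a document, replacing any existing one with the same key."""
        if overwrite:
            key = doc[self.unique_key]
            self.docs = [d for d in self.docs if d[self.unique_key] != key]
        self.docs.append(doc)


index = ToyIndex()
index.add({"id": "42", "title": "first version"})
index.add({"id": "42", "title": "second version"})
print(len(index.docs), index.docs[0]["title"])

# Case 1: the key values are not really identical (note the trailing space).
index.add({"id": "42 ", "title": "looks like a duplicate"})
print(len(index.docs))

# Case 2: overwrite is disabled, so the old copy is kept.
index.add({"id": "42", "title": "third version"}, overwrite=False)
print(len(index.docs))
```

Either case leaves more than one visible copy, which matches the symptom in the original question.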

Re: Solr4.4 DIH Headache

2013-08-08 Thread Stefan Matheis
First things first, for the dataimport handler. Is it correct that when I visit it from the admin panel it takes me to this URL: http://x.com:8080/solr/#/collection1/dataimport//dataimport That one is correct. The trailing /dataimport is the name of the handler you've defined.

Re: Transform data at index time: country - continent

2013-08-08 Thread Walter Underwood
SynonymFilter may have a keepOrig flag. If so, that would map countries to continents and not keep the country names. <filter class="solr.SynonymFilterFactory" synonyms="continents.txt" keepOrig="false" /> wunder On Aug 8, 2013, at 4:10 AM, Christian Köhler - ZFMK wrote: Hi, I have thought

Re: Solr - how do I index barcode

2013-08-08 Thread Andre Bois-Crettez
I would go with a tokenizer to split each character as a separate token. (maybe https://cwiki.apache.org/confluence/display/solr/Tokenizers#Tokenizers-RegularExpressionPatternTokenizer can do) Add a LowerCaseFilterFactory so that casing is ignored. Untested :

Solr search on a large text field is very slow

2013-08-08 Thread meena.sri...@mathworks.com
Index size is around 150 GB and there are around 6.5 million documents in the index. Search on a specific text field is very slow: it takes 1 to 2 minutes for wildcard queries like *test* with no highlighting and no facets. This field contributes 90% of the index size. This is my schema.xml

Re: Filtering suggestion results

2013-08-08 Thread Erick Erickson
Suggester just looks at the terms in the field you point it to, there's no way that I know of to do what you're asking Best Erick On Wed, Aug 7, 2013 at 4:40 PM, rohitrmd rohitmdeshpa...@gmail.com wrote: Hi, I have question regarding suggester component. Can we filter suggestion

Re: Solr search on a large text field is very slow

2013-08-08 Thread Erick Erickson
Sometimes bolding comes through my e-mail as *, so is *test* with the asterisk on each end really what you're doing? Assuming so, this will inevitably be slow. It must iterate through all the terms in the field to see if any of them match. This is generally a bad practice. You can go to n-grams

Re: Solr search on a large text field is very slow

2013-08-08 Thread Rafał Kuć
Hello! In general, wildcard queries can be expensive. If you need wildcard queries, you can try the EdgeNGram - http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.EdgeNGramFilterFactory -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - ElasticSearch

Re: Solr search on a large text field is very slow

2013-08-08 Thread Aloke Ghoshal
Compare timings in the following cases: - Without the wildcard - With suffix wild card only - test* - With reverse wild card filter factory and two separate terms - *test OR test* On Thu, Aug 8, 2013 at 8:15 PM, meena.sri...@mathworks.com meena.sri...@mathworks.com wrote: Index size is around
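The reason the reversed-token case is worth timing can be sketched as follows: a trailing wildcard (test*) is a prefix lookup in a sorted term list, and indexing each term reversed turns a leading wildcard (*test) into a prefix lookup as well. The term list and function names are hypothetical, and the sketch illustrates only the idea, not Solr's implementation.

```python
from bisect import bisect_left


def prefix_matches(sorted_terms, prefix):
    """Return all terms starting with prefix, using a sorted term list."""
    start = bisect_left(sorted_terms, prefix)
    found = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break  # sorted order: no later term can match
        found.append(term)
    return found


# Hypothetical indexed terms, plus the same terms stored reversed.
terms = sorted(["contest", "latest", "test", "testing"])
reversed_terms = sorted(term[::-1] for term in terms)


def trailing_wildcard(prefix):
    """Answer prefix* directly from the forward term list."""
    return prefix_matches(terms, prefix)


def leading_wildcard(suffix):
    """Answer *suffix as a prefix lookup on the reversed term list."""
    hits = prefix_matches(reversed_terms, suffix[::-1])
    return sorted(term[::-1] for term in hits)


print(trailing_wildcard("test"))
print(leading_wildcard("test"))
```

A query with wildcards on both ends (*test*) is covered by neither lookup, which is why it is listed here as a separate case to measure.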

Re: First Indexing a postgres database

2013-08-08 Thread Erick Erickson
What errors come out in the log (catalina.out) when you try to create cores? What is the full stack trace when you get your error? Is the error from the log or from the client? Details matter. Best Erick On Thu, Aug 8, 2013 at 4:00 AM, geoport tb.rost...@gmail.com wrote: Hi, i am using solr

Re: Question about soft commit and updateRequestProcessorChain

2013-08-08 Thread Erick Erickson
Well, in the Solr4 world, there's another option. Do a hard commit with openSearcher=false. That will guarantee that the documents are durably written because it'll close the segment. What it will NOT do is make the documents searchable, you need to do a soft commit to make that happen. The other

SolrCloud deletedDocs

2013-08-08 Thread Rasmussen, Chris
I'm running a 4.2 SOLRCloud instance with multiple servers/shards. As I'm indexing data, I review the results of the STATUS commands and note an extremely high number of deletedDocs. I've combed through the source data to verify whether I'm sending duplicate document ids, but haven't been

Re: SolrCloud deletedDocs

2013-08-08 Thread Shawn Heisey
On 8/8/2013 10:47 AM, Rasmussen, Chris wrote: I'm running a 4.2 SOLRCloud instance with multiple servers/shards. As I'm indexing data, I review the results of the STATUS commands and note an extremely high number of deletedDocs. I've combed through the source data to verify whether I'm

Re: Transform data at index time: country - continent

2013-08-08 Thread Jack Krupansky
(I think you're better off with an update processor script, but...) The synonym filter supports 2.5 modes: 1. Replace mode country => continent 2. Expand mode country, continent - results in both terms if either is used 2.5) The expand=false attribute that means treat expand mode as replace
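The difference between the replace and expand modes listed above can be sketched with a toy token mapper. The mapping table and the apply_synonyms function are hypothetical; a real synonym filter places expanded terms at the same token position, which this list-based sketch does not model.

```python
# Hypothetical country-to-continent table.
CONTINENTS = {"germany": "europe", "japan": "asia"}


def apply_synonyms(tokens, mapping, mode="replace"):
    """Map tokens through a synonym table in replace or expand mode."""
    if mode not in ("replace", "expand"):
        raise ValueError("mode must be 'replace' or 'expand'")
    output = []
    for token in tokens:
        target = mapping.get(token)
        if target is None:
            output.append(token)            # no rule: keep the token as is
        elif mode == "replace":
            output.append(target)           # country => continent
        else:
            output.extend([token, target])  # keep both terms
    return output


print(apply_synonyms(["germany"], CONTINENTS, mode="replace"))
print(apply_synonyms(["germany"], CONTINENTS, mode="expand"))
```

Replace mode gives a continent-only field, while expand mode gives the mixed field that the original poster was concerned about.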

Re: Solr search on a large text field is very slow

2013-08-08 Thread meena.sri...@mathworks.com
Thanks for your responses; they helped me understand the issue. I dug through the documentation and am now implementing EdgeNGramFilterFactory to see how much I can speed up wildcard searches. <fieldType name="type_wildcard" class="solr.TextField"> <analyzer type="index">

Re: Problems with distributed MoreLikeThis

2013-08-08 Thread Shawn Heisey
On 8/6/2013 9:43 PM, manju16832003 wrote: I'm not sure about the root cause in your case. However, one thing to remember while using MLT is that *MLT does not work with integer fields*. In your case, if 'catchall' is a copyField and you are trying to copy any integer values, verify it again :-). The

Re: Problems with distributed MoreLikeThis

2013-08-08 Thread Shawn Heisey
On 8/6/2013 1:18 PM, Shawn Heisey wrote: I'm having some problems with distributed MLT. On 4.4, it seems completely broken. Searches that work on 4.2.1 return an exception on 4.4.0. This stackoverflow post shows the EarlyTerminatingCollectorException I'm getting:

Re: Suggest aka autocomplete request handler with solr 4.4

2013-08-08 Thread Utkarsh Sengar
Hi Chris, You were right, appl was matched to application. So, I created a new type without the stemmer. New type: <fieldType name="text_spell" class="solr.TextField" positionIncrementGap="100"> <analyzer type="index"> <tokenizer class="solr.StandardTokenizerFactory"/>

Re: Handling categories( level one and two) based navigation

2013-08-08 Thread Erick Erickson
You really haven't told us much about _how_ you want to use category and subcategory. Can a document belong to one or more subcategories? What have you tried? What use-case do you want to support? Etc. You might review: http://wiki.apache.org/solr/UsingMailingLists Best Erick On Thu, Aug 8,

Re: Storing A Taxonomy In A Separate Core For Query Expansion

2013-08-08 Thread Erick Erickson
No reason why not. You don't even have to keep it in a separate core: if you have some field in your docs that has the values "items" or "taxonomy", you could always use the appropriate fq when querying... You could also do this from the app: query the taxonomy core and do the substitutions in the app

Re: Solr search on a large text field is very slow

2013-08-08 Thread Erick Erickson
If you really need wildcards on both ends, I don't think edgengram will work, you need plain ngrams. The trick is that the query side needs to make the 2-grams into phrases. BTW, I think you'd be fine with bigrams. Best Erick On Thu, Aug 8, 2013 at 1:47 PM, meena.sri...@mathworks.com
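The bigram-as-phrase idea in this reply can be sketched as follows: index every term as its sequence of 2-grams, turn the query fragment into 2-grams as well, and require them to appear consecutively and in order. The function names are hypothetical, and this is a toy model of the matching logic rather than a Solr analyzer configuration.

```python
def bigrams(term):
    """Split a term into overlapping, lower-cased 2-grams."""
    term = term.lower()
    return [term[i:i + 2] for i in range(len(term) - 1)]


def contains_phrase(indexed, query):
    """True when the query grams occur consecutively inside the indexed grams."""
    n = len(query)
    if n == 0:
        return False  # fragments shorter than two characters produce no grams
    return any(indexed[i:i + n] == query for i in range(len(indexed) - n + 1))


def infix_match(indexed_term, fragment):
    """Emulate *fragment* by phrase-matching the fragment's 2-grams."""
    return contains_phrase(bigrams(indexed_term), bigrams(fragment))


for word in ["contested", "latest", "testing", "estate"]:
    print(word, infix_match(word, "test"))
```

The word estate contains the grams "te", "es" and "st", but not consecutively in that order, so it is rejected; that is the case the phrase requirement exists to exclude.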

RE: external zookeeper with SolrCloud

2013-08-08 Thread Joshi, Shital
We did quite a bit of testing and we think bug https://issues.apache.org/jira/browse/SOLR-4899 is not resolved in Solr 4.4 -Original Message- From: Joshi, Shital [Tech] Sent: Wednesday, August 07, 2013 2:48 PM To: 'solr-user@lucene.apache.org' Subject: RE: external zookeeper with

Re: [POLL] Who how does use admin-extra ?

2013-08-08 Thread Michael Sokolov
On 8/7/13 10:56 PM, Chris Hostetter wrote: : Didn't somebody once say this is used for customization of admin pages? It can be, yes; that's why it originally existed -- Stefan's question was whether anyone was actually using it for that. I used it quite a bit back in the day at CNET as a way to

Re: external zookeeper with SolrCloud

2013-08-08 Thread Shawn Heisey
On 8/8/2013 3:03 PM, Joshi, Shital wrote: We did quite a bit of testing and we think bug https://issues.apache.org/jira/browse/SOLR-4899 is not resolved in Solr 4.4 The commit for SOLR-4899 was made to branch_4x on June 10th. lucene_solr_4_4 code branch was created from branch_4x on July

Re: Percolate feature?

2013-08-08 Thread Mark
Ok forget the mention of percolate. We have a large list of known keywords we would like to match against. Product keyword: Sony Product keyword: Samsung Galaxy We would like to be able to detect given a product title whether or not it matches any known keywords. For a keyword to be

Problem with SolrCloud + Zookeeper + DataImportHandler

2013-08-08 Thread 兴涛孙
Hello guys, I've encountered a problem configuring many clustered nodes with Solr, so I want to ask for your help. Thanks in advance! The problem is as follows: 1. Installation platform: Solr 4.3.1, ZooKeeper 3.4.5 and Tomcat 7 with JDK 1.7. 2. When I configured a single node with DIH to build