Re: Solr french search optimisation

2013-05-22 Thread It-forum
Hello again, Could anyone help me, please? David On 22/05/2013 18:09, It-forum wrote: Hello to all, I'm trying to set up Solr 4.2 to index and search French content. I defined a special fieldtype for French content: positionIncrementGap="100">

OPENNLP current patch compiling problem for 4.x branch

2013-05-22 Thread Patrick Mi
Hi, I checked out from here http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_4_3_0 and downloaded the latest patch LUCENE-2899-current.patch. Applied the patch ok but when I did 'ant compile' I got the following error: == [javac] /home/lucene_solr_4_3_0/lucene/analysis/opennlp/sr

Re: search filter

2013-05-22 Thread Kamal Palei
It looks like I am getting the exception below: May 22, 2013 10:52:11 PM org.apache.solr.common.SolrException log SEVERE: java.lang.NumberFormatException: For input string: "[3 TO 9] OR salary:0" at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) at java.lang.L

Re: Sorting solr search results using multiple fields

2013-05-22 Thread Rohan Thakur
k thanks gora that what I was thinking but thought I should ask as there might be some solution in solr it self...the last option fine I would go with that only. thanks rohan On Thu, May 23, 2013 at 9:13 AM, Gora Mohanty wrote: > On 22 May 2013 19:37, Rohan Thakur wrote: > [...] > > this in

Re: search filter

2013-05-22 Thread Kamal Palei
Hi Rafał Kuć, I tried both fq=Salary:[5+TO+10]+OR+Salary:0 and fq=Salary:[5 TO 10] OR Salary:0; in both cases I retrieved 0 results. I use Drupal along with Solr; my code looks as below. * if($include_0_salary == 1) { $conditions['fq'][0]

Re: [ANNOUNCE] Web Crawler

2013-05-22 Thread Rajesh Nikam
Hi, Crawl Anywhere seems to be using old versions of Java, Tomcat, etc. http://www.crawl-anywhere.com/installation-v300/ Will it work with new versions of the required software? Is there an updated installation guide available? Thanks Rajesh On Wed, May 22, 2013 at 6:48 PM, Dominique Bejean

Re: Hard Commit giving OOM Error on Index Writer in Solr 4.2.1

2013-05-22 Thread Umesh Prasad
Hi Shawn, Thanks for the advice :). The JVM heap usage on the indexer machine has been consistently about 95% (both total and old gen) for the past 3 days. It might have nothing to do with Solr 3.6 vs Solr 4.2, because the Solr 3.6 indexer gets restarted once every 2-3 days. Will investigate why

Capturing document processing in solr4.2

2013-05-22 Thread gpssolr2020
Hi, We are creating a unique id from 4-5 fields using SignatureUpdateProcessorFactory, and we are getting a count mismatch between the data source and Solr even though we don't have any duplicate records in the data source. So we want to capture the id in the log file during its generation to track the mismatch. I

Re: Sorting solr search results using multiple fields

2013-05-22 Thread Gora Mohanty
On 22 May 2013 19:37, Rohan Thakur wrote: [...] > this inital_boost is basically copy field of autosug but saved using > different analysers taking whole sentence as single token and generating > edge ngrams so that what I search on this field only term matching from > first will match...and for

Re: Fast faceting over large number of distinct terms

2013-05-22 Thread Walter Underwood
I would fetch the term vectors for the top N documents and add them up myself. You could even scale the term counts by the relevance score for the document. That would avoid problems with analyzing ten documents where only the first three were really good matches. I did something similar in a d
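The suggestion above (sum the term vectors of the top N documents, optionally scaling each count by the document's relevance score) could be sketched roughly as follows. This is only an illustration: it assumes the term vectors have already been fetched and parsed into `(score, {term: frequency})` pairs, and the function name and sample data are hypothetical, not part of any Solr client API.

```python
from collections import Counter


def weighted_term_counts(docs, top_k=10):
    """Sum term frequencies across documents, scaling each by its score.

    docs is an iterable of (score, {term: frequency}) pairs, i.e. term
    vectors that were already fetched and parsed (hypothetical shape).
    """
    totals = Counter()
    for score, term_freqs in docs:
        for term, freq in term_freqs.items():
            # A weak match contributes less than a strong one.
            totals[term] += freq * score
    return totals.most_common(top_k)


# Hypothetical top-3 results for some query.
sample_docs = [
    (2.0, {"iraq": 3, "election": 1}),
    (1.0, {"iraq": 1, "oil": 4}),
    (0.5, {"oil": 2}),
]
top_terms = weighted_term_counts(sample_docs, top_k=2)
```

Scaling by score means a low-ranked document cannot dominate the totals just because it is long.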

Re: Storing and retrieving json

2013-05-22 Thread William Bell
I solved this: https://issues.apache.org/jira/browse/SOLR-4685 To get the field in there from XMl to JSON: https://issues.apache.org/jira/browse/SOLR-4692 EnjoY! On Wed, May 22, 2013 at 6:03 PM, Karthick Duraisamy Soundararaj < karthick.soundara...@gmail.com> wrote: > Hello all, >

Approach to apply full index from master to slaves?

2013-05-22 Thread William Bell
We have a 3GB index. We index on the master and then replicate to the slaves. But the issue is that after the slaves switch over - we get deadlocking, # of threads increase to 500, and most times the SOLR instance just plain locks up. We tried adding a bunch of warming queries, but we still have

Invitation to use Google Talk

2013-05-22 Thread Google Talk
--- You've been invited by William Bell to use Google Talk. If you already have a Google account, login to Gmail and accept this chat invitation: http://mail.google.com/mail/b-b903d6c361-e2c748e395-QRdDLp11StDJD0VtxfBDfdnCO6w To

Re: Fast faceting over large number of distinct terms

2013-05-22 Thread Otis Gospodnetic
Here's a possibility: At index time extract important terms (and/or phrases) from this story_text and store top N of them in a separate field (which will be much smaller/shorter). Then facet on that. Or just retrieve it and manually parse and count in the client if that turns out to be faster. I

Re: Low Priority: Lucene Facets in Solr?

2013-05-22 Thread William Bell
It would be beneficial. Lucene facets are really fast without caching and are what I call v2 since the drill sideways also adds capabilities. On Wed, May 22, 2013 at 8:41 PM, Brendan Grainger < brendan.grain...@gmail.com> wrote: > Thanks Jack, no urgency here. I'm unsure that it would even be

Re: Fast faceting over large number of distinct terms

2013-05-22 Thread David Larochelle
The goal of the system is to obtain data that can be used to generate word clouds so that users can quickly get a sense of the aggregate contents of all documents matching a particular query. For example, a user might want to see a word cloud of all documents discussing 'Iraq' in a particular new p

Question about Coord factor

2013-05-22 Thread Kazuaki Hiraga
Hello Folks, Sorry, my last email was a bit messy, so I am sending it again. I have a question about coordination factor to ensure my understanding of this value is correct. If I have documents that contain some keywords like the following:   Doc1: A, B, C   Doc2: A, C   Doc3: B, C And my query

Re: Low Priority: Lucene Facets in Solr?

2013-05-22 Thread Brendan Grainger
Thanks Jack, no urgency here. I'm unsure that it would even be easy/beneficial to integrate into solr, but I'm definitely interested in it. Brendan On Wed, May 22, 2013 at 7:00 PM, Jack Krupansky wrote: > The topic has come up, but nobody has expressed a sense of urgency. > > It actually has a

Re: Fast faceting over large number of distinct terms

2013-05-22 Thread Brendan Grainger
Hi David, Out of interest, what are you trying to accomplish by faceting over the story_text field? Is it generally the case that the story_text field will contain values that are repeated or categorize your documents somehow? From your description: "story_text is used to store free form text obt

How to query docs with an indexed polygon field in java?

2013-05-22 Thread kevenz
hi, I'm using solr 4.3, I have indexed docs with a polygon field, and I'd like to search the polygon docs according to the given point. I've put the jts-1.13.jar into the WEB-INF/lib directory, and I've added the doc to solr successfully. I'm new to lucene and solr, I'm not sure how to query index

Question about Coordination factor

2013-05-22 Thread Kazuaki Hiraga
Hello Folks, I have a question about coordination factor to ensure my understanding of this value is correct. If I have documents that contain some keywords like the following: Doc1: A, B, C Doc2: A, C Doc3: B, C And my query is "A OR B OR C OR D". In this case, Coord factor value for each document
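As a rough worked example for the documents and query in this question: assuming Lucene's DefaultSimilarity of that era, where the coordination factor is the number of matching query clauses divided by the total number of query clauses, the values can be computed as below. The helper name is hypothetical and the sketch ignores every other part of the score.

```python
def coord(doc_terms, query_terms):
    """Coordination factor as overlap / maxOverlap (DefaultSimilarity-style)."""
    query = set(query_terms)
    overlap = len(set(doc_terms) & query)
    return overlap / len(query)


query = ["A", "B", "C", "D"]
docs = {
    "Doc1": ["A", "B", "C"],
    "Doc2": ["A", "C"],
    "Doc3": ["B", "C"],
}
coords = {name: coord(terms, query) for name, terms in docs.items()}
```

Under that assumption Doc1 matches three of the four clauses and Doc2 and Doc3 match two each.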

Re: List of Solr Query Parsers

2013-05-22 Thread Roman Chyla
Hello, I have just created a new JIRA issue, if you are interested in trying out the new query parser, please visit: https://issues.apache.org/jira/browse/LUCENE-5014 Thanks, roman On Mon, May 6, 2013 at 5:36 PM, Jan Høydahl wrote: > Added. Please try editing the page now. > > -- > Jan Høydahl,

Re: Storing and retrieving json

2013-05-22 Thread Jack Krupansky
Yes, the quotes need to be escaped - since they are contained within a quoted string, which you didn't show. That is the proper convention for representing strings in JSON. Are you familiar with the JSON format? If not, try XML - it won't have to represent a string as a quoted JSON string. If

Storing and retrieving json

2013-05-22 Thread Karthick Duraisamy Soundararaj
Hello all, I am facing a need to store and retrieve json string in a field. eg. Imagine a schema like below. [Please note that this is just an example but not actual specification.] carDescription is a json string . An example would be { "model":1988 "type":"manual"} I d

Using alternate Solr index location for SolrCloud

2013-05-22 Thread Kevin Osborn
Our prod environment is going to be on Azure. As such, I want our index to live on the Azure VM's local storage rather than the default VM disk (blob storage). Normally, I just use /var/opt/tomcat7/PORT/solr/collection1/data, but I want to use something else. I am also using the Collections API t

Re: Low Priority: Lucene Facets in Solr?

2013-05-22 Thread Jack Krupansky
The topic has come up, but nobody has expressed a sense of urgency. It actually has a placeholder Jira: https://issues.apache.org/jira/browse/SOLR-4774 Feel free to add your encouragement there. -- Jack Krupansky -Original Message- From: Brendan Grainger Sent: Wednesday, May 22, 2013

Re: Scheduling DataImports

2013-05-22 Thread Alexandre Rafalovitch
On first, the cron job that hits the DIH trigger URL will probably be the easiest way. Not sure I understood the second question. How do you store/know that the entries expire. And how do you pull for those specific entries? Regards, Alex. Personal blog: http://blog.outerthoughts.com/ LinkedIn
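A minimal sketch of the cron approach mentioned here, assuming the DataImportHandler is registered at `/dataimport` on the `collection1` core and Solr listens on `localhost:8983` (the host, port, handler path, and script path are all assumptions, not taken from the thread):

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def dih_url(base="http://localhost:8983/solr", core="collection1",
            handler="dataimport", command="delta-import"):
    """Build the URL that triggers a DataImportHandler command."""
    return f"{base}/{core}/{handler}?{urlencode({'command': command})}"


def trigger_import(url, timeout=30):
    """Hit the trigger URL and return the HTTP status code."""
    with urlopen(url, timeout=timeout) as response:
        return response.status


# Hypothetical crontab entry to run a script calling trigger_import every 10 minutes:
# */10 * * * * python3 /path/to/trigger_dih.py
```

Whether `delta-import` or `full-import` is appropriate depends on how the data-config is written.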

Low Priority: Lucene Facets in Solr?

2013-05-22 Thread Brendan Grainger
Hi All, Not really a pressing need for this at all, but having worked through a few tutorials, I was wondering if there was any work being done to incorporate Lucene Facets into solr: http://lucene.apache.org/core/4_3_0/facet/org/apache/lucene/facet/doc-files/userguide.html Brendan

fq & facet on double and non-indexed field

2013-05-22 Thread gpssolr2020
Hi, I am trying to apply filtering on a non-indexed double field, but it's not returning any results. So can't we do fq on a non-indexed field? can not use FieldCache on a field which is neither indexed nor has doc values: EXCH_RT_AMT 400 We are using Solr 4.2. Thanks. -- View this message in cont

Re: Tool to read Solr4.2 index

2013-05-22 Thread gpssolr2020
Thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/Tool-to-read-Solr4-2-index-tp4065448p4065453.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Tool to read Solr4.2 index

2013-05-22 Thread Shreejay
This might help http://wiki.apache.org/solr/LukeRequestHandler -- Shreejay Nair Sent from my mobile device. Please excuse brevity and typos. On Wednesday, May 22, 2013 at 13:47, gpssolr2020 wrote: > Hi All, > > We can use lukeall4.0 for reading Solr3.x index . Like that do we have > anything

AW: Date Field

2013-05-22 Thread Benjamin Kern
What is the format of the UTC string? Example? Thanks. -Original Message- From: Chris Hostetter [mailto:hossman_luc...@fucit.org] Sent: Wednesday, 22 May 2013 00:03 To: solr-user@lucene.apache.org Subject: Re: Date Field : 2) Chain TemplateTransformer either by itself or before the : Da

Tool to read Solr4.2 index

2013-05-22 Thread gpssolr2020
Hi All, We can use lukeall4.0 for reading Solr3.x index . Like that do we have anything to read solr 4.x index. Please help. Thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/Tool-to-read-Solr4-2-index-tp4065448.html Sent from the Solr - User mailing list archive at

Re: Regular expression in solr

2013-05-22 Thread Lance Norskog
If the indexed data includes positions, it should be possible to implement ^ and $ as the first and last positions. On 05/22/2013 04:08 AM, Oussama Jilal wrote: There is no ^ or $ in the solr regex since the regular expression will match tokens (not the complete indexed text). So the results yo

Re: Regular expression in solr

2013-05-22 Thread Furkan KAMACI
API doc says that: Lucene supports regular expression searches matching a pattern between forward slashes "/". The syntax may change across releases, but the current supported syntax is documented in the RegExp class. For example to find documents containing "moat" or "boat": /[mb]oat/ I think th

Re: Russian stopwords

2013-05-22 Thread igiguere
I'm encountering the same issue, but, my Russian stopwords.txt IS encoded in UTF-8. I verified the encoding using EmEditor (I've used it for years, and I use it for the existing English, French, Spanish, Portuguese and German Solr configurations, without issues). Just to make extra sure, I downloa

RE: solr starting time takes too long

2013-05-22 Thread Zhang, Lisheng
Very sorry about hijacking existing thread (I thought it would be OK if I just change the title and content, but still wrong). It will never happen again. Lisheng -Original Message- From: Chris Hostetter [mailto:hossman_luc...@fucit.org] Sent: Wednesday, May 22, 2013 11:58 AM To: solr-us

Scheduling DataImports

2013-05-22 Thread smanad
Hi, I am new to Solr and recently started exploring it for search/sort needs in our webapp. I have couple of questions as below, (I am using solr 4.2.1 with default core named collection1) 1. We have a use case where we would like to index data every 10 mins (avg). Whats the best way to schedule

Re: Boosting Documents

2013-05-22 Thread Chris Hostetter
: NOTE: make sure norms are enabled (omitNorms="false" in the schema.xml) for : any fields where the index-time boost should be stored. : : In my case where I only need to boost the whole document (not a specific : field), do I have to activate the << omitNorms="false" >> for all the fields : in

Re: MoreLikeThis - No Results

2013-05-22 Thread Andy Pickler
Answered my own question... mlt.mintf: Minimum Term Frequency - the frequency below which terms will be ignored in the source doc Our "source doc" is a set of limited terms...not a large content field. So in our case I need to set that value to 1 (rather than the default of 2). Now I'm getting

RE: Slow Highlighter Performance Even Using FastVectorHighlighter

2013-05-22 Thread Andy Brown
After taking your advice on profiling, I didn't see any memory issues. I wanted to verify this with a small set of data. So I created a new sandbox core with the exact same schema and config file settings. I indexed only 25 PDF documents with an average size of 2.8 MB, the largest is approx 5 MB (3

Re: hostname -> ipaddress change in solr4.0 to solr4.1+

2013-05-22 Thread Shawn Heisey
On 5/22/2013 12:53 PM, Anirudha Jadhav wrote: Logging/UI used to show hostname in 4.0 in 4.1+ it switched to ip addresses is this by design or a bug/side effect ? If you are talking about SolrCloud, this was an intentional change. By including a host property either on the Solr startup comma

Re: solr starting time takes too long

2013-05-22 Thread Chris Hostetter
: Subject: solr starting time takes too long : In-Reply-To: <519c6cd6.90...@smartbit.be> : Thread-Topic: shard splitting https://people.apache.org/~hossman/#threadhijack -Hoss

hostname -> ipaddress change in solr4.0 to solr4.1+

2013-05-22 Thread Anirudha Jadhav
Logging/UI used to show hostname in 4.0 in 4.1+ it switched to ip addresses is this by design or a bug/side effect ? its pretty painful to look at ip addresses, I am planning to change. let me know if you have any concerns -- Anirudha

MoreLikeThis - No Results

2013-05-22 Thread Andy Pickler
I'm developing a recommendation feature in our app using the MoreLikeThisHandler, and so far it is doing a great job. We're using a user's "competency keywords" as the MLT field list and the user's corresponding document in Solr as the "compariso

Re: Large-scale Solr publish - hanging at blockUntilFinished indefinitely - stuck on SocketInputStream.socketRead0

2013-05-22 Thread Shawn Heisey
On 5/22/2013 11:25 AM, Justin Babuscio wrote: On your overflow theory, why would this impact the client? Is it possible that a write attempt to Solr would block indefinitely while the Solr server is running wild or in a bad state due to the overflow? That's the general notion. I could be comp

Re: Solr Faceting doesn't return values.

2013-05-22 Thread samabhiK
When I use your query, I get : 400 12 true mm_state_code true *mm_state_code:(**TX)* 1369244078714 all sa_site_city xml org.apache.solr.search.SyntaxError: Cannot parse '*mm_state_code:(**TX)*': Encountered " ":" ": "" at line 1, column 14. Was exp

Re: Large-scale Solr publish - hanging at blockUntilFinished indefinitely - stuck on SocketInputStream.socketRead0

2013-05-22 Thread Justin Babuscio
Shawn, Thank you! Just some quick responses: On your overflow theory, why would this impact the client? Is it possible that a write attempt to Solr would block indefinitely while the Solr server is running wild or in a bad state due to the overflow? We attempt to set the BinaryRequestWriter b

RE: Speed up import of Hierarchical Data

2013-05-22 Thread O. Olson
Just an update for others reading this thread: I had some CachedSqlEntityProcessor and had it addressed in the thread How do I use CachedSqlEntityProcessor? (http://lucene.472066.n3.nabble.com/How-do-I-use-CachedSqlEntityProcessor-td4064919.html) I basically had to declare the child entities in th

Re: shard splitting

2013-05-22 Thread Yago Riveiro
You will need to edit it manually and upload using a zookeeper client, you can use kazoo, it's very easy to use. -- Yago Riveiro Sent with Sparrow (http://www.sparrowmailapp.com/?sig) On Wednesday, May 22, 2013 at 10:04 AM, Arkadi Colson wrote: > clusterstate.json is now reporting shard3 as

RE: How do I use CachedSqlEntityProcessor?

2013-05-22 Thread O. Olson
Thank you guys, particularly James, very much. I just imported 200K documents in a little more than 2 mins – which is great for me :-). Thank you Stefan. I did not realize that it was not a syntax error and hence no error. Thank you for clearing that up. O. O. -- View this message in context:

Re: Solr Faceting doesn't return values.

2013-05-22 Thread Sandeep Mestry
>From the response you've mentioned it appears to me that the query term TX is searched against sa_site_city instead of mm_state_code. Can you try your query like below: http://xx.xx.xx.xx/solr/collection1/select?q=*mm_state_code:(**TX)* &wt=xml&indent=true&facet=true&facet.field=sa_site_city&debu

Re: filter query by string length or word count?

2013-05-22 Thread Jason Hellman
Sam, I would highly suggest counting the words in your external pipeline and sending that value in as a specific field. It can then be queried quite simply with a: wordcount:{80 TO *] (Note the { next to 80, excluding the value of 80) Jason On May 22, 2013, at 11:37 AM, Sam Lee wrote: > I
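The approach suggested here (count the words before indexing, store the count in its own field, then filter with a range query) might look like the sketch below. The field names `body` and `wordcount` follow the thread; the helper names are hypothetical and the word count is a plain whitespace split, which is only an approximation of how an analyzer would tokenize.

```python
def with_wordcount(doc, text_field="body", count_field="wordcount"):
    """Return a copy of doc with a whitespace-based word count added."""
    enriched = dict(doc)
    enriched[count_field] = len(doc.get(text_field, "").split())
    return enriched


def more_than(field, n):
    """Range query with an exclusive lower bound: field:{n TO *]."""
    return f"{field}:{{{n} TO *]"


doc = with_wordcount({"id": "1", "body": "a short example body"})
filter_query = more_than("wordcount", 80)
```

The curly brace on the lower bound excludes the value 80 itself, so only documents with 81 or more words match.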

Can anyone explain this Solr query behavior?

2013-05-22 Thread Shankar Sundararaju
This query returns 0 documents: *q=(+Title:() +Classification:() +Contributors:() +text:())* This returns 1 document: *q=doc-id:3000* And this returns 631580 documents when I was expecting 0: *q=doc-id:3000 AND (+Title:() +Classification:() +Contributors:() +text:())* Am I missing something here

Re: filter query by string length or word count?

2013-05-22 Thread Sandeep Mestry
I doubt if there is any straight out of the box feature that supports this requirement, you will probably need to handle this at the index time. You can play around with Function Queries http://wiki.apache.org/solr/FunctionQuery for any such feature. On 22 May 2013 16:37, Sam Lee wrote: > I ha

Re: Solr Faceting doesn't return values.

2013-05-22 Thread samabhiK
Thanks for your reply. I have my request url modified like this: http://xx.xx.xx.xx/solr/collection1/select?q=TX&df=mm_state_code&wt=xml&indent=true&facet=true&facet.field=sa_site_city&debug=all Facet Filed = sa_site_city ( city wise facet) Default Filed = mm_state_code Query= TX When I run this

Solr french search optimisation

2013-05-22 Thread It-forum
Hello to all, I'm trying to setup solr 4.2 to index and search into french content. I defined a special fieldtype for french content : positionIncrementGap="100"> mapping="mapping-ISOLatin1Accent.txt"/> gene

RE: How do I use CachedSqlEntityProcessor?

2013-05-22 Thread Dyer, James
That would be a worthy enhancement to do. Always nice to give the user a warning when something is going to fail so they can troubleshoot better... James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: Stefan Matheis [mailto:matheis.ste...@gmail.com] Sent: Wednesday,

Re: Large-scale Solr publish - hanging at blockUntilFinished indefinitely - stuck on SocketInputStream.socketRead0

2013-05-22 Thread Shawn Heisey
On 5/22/2013 9:08 AM, Justin Babuscio wrote: We periodically rebuild our Solr index from scratch. We have built a custom publisher that horizontally scales to increase write throughput. On a given rebuild, we will have ~60 JVMs running with 5 threads that are actively publishing to all Solr mas

Re: Solr Faceting doesn't return values.

2013-05-22 Thread Sandeep Mestry
Hi There, Not sure I understand your problem correctly, but is 'mm_state_code' a real value or is it field name? Also, as Erick pointed out above, the facets are not calculated if there are no results. Hence you get no facets. You have mentioned which facets you want but you haven't mentioned whi

filter query by string length or word count?

2013-05-22 Thread Sam Lee
I have schema.xml ... how can I query docs whose body has more than 80 words (or 80 characters) ?

Re: [custom data structure] aligned dynamic fields

2013-05-22 Thread Jack Krupansky
Although we are entering the era of "Big Data", that does not mean there are no limits or restrictions on what a given technology can do. Maybe you need to consider either a smaller scope for your project, or more limited features, or some other form of simplification. Solr can do "billions"

Re: How do I use CachedSqlEntityProcessor?

2013-05-22 Thread Stefan Matheis
> I am curious why I did not get any errors before. Because there was no (syntax) error before - the fact that you didn't include a SKU (but using it as cacheKey) just doesn't match anything .. therefore you got nothing added to your documents. Perhaps we should add an ticket as improvement for

RE: How do I use CachedSqlEntityProcessor?

2013-05-22 Thread O. Olson
Thank you very much James. Your suggestion worked exactly! I am curious why I did not get any errors before. For others, the following worked for me: Similarly for other Categories i.e. Category2, Category3, etc. I am now going to try

Large-scale Solr publish - hanging at blockUntilFinished indefinitely - stuck on SocketInputStream.socketRead0

2013-05-22 Thread Justin Babuscio
*Problem:* We periodically rebuild our Solr index from scratch. We have built a custom publisher that horizontally scales to increase write throughput. On a given rebuild, we will have ~60 JVMs running with 5 threads that are actively publishing to all Solr masters. For each thread, we instanti

Re: too many boolean clauses

2013-05-22 Thread Shawn Heisey
> Now regarding the maxBooleanClauses - how it effects performance (response > times, memory usage) when increasing it? Changing maxBooleanClauses doesn't make any difference at all. Having thousands of clauses is what makes things run slower and take more memory. The setting just causes large que

Re: too many boolean clauses

2013-05-22 Thread adm1n
first of all thanks for response! Regarding two tokenizers - it's ok. switching to NGramFilterFactory didn't help (though I didn't reindex but don't think it was needed since switched it into 'query' section). Now regarding the maxBooleanClauses - how it effects performance (response times, memor

RE: How do I use CachedSqlEntityProcessor?

2013-05-22 Thread Dyer, James
There was a mistake in my last reply. Your child entities need to SELECT on the join key so DIH has it to do the join. So use "SELECT SKU, CategoryName..." James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: O. Olson [mailto:olson_...@yahoo.it] Sent: Tuesday, May

RE: How do I use CachedSqlEntityProcessor?

2013-05-22 Thread O. Olson
Thank you bbarani. Unfortunately, this does not work. I do not get any exception, and the documents import OK. However there is no Category1, Category2 … etc. when I retrieve the documents. I don’t think I am using the Alpha or Beta of 4.0. I think I downloaded the plain vanilla release version.

Re: Sorting solr search results using multiple fields

2013-05-22 Thread Rohan Thakur
thanks gora I got that one more thing what actually I have done is made document consisting of fields: { "autosug":"galaxy", "query_id":1414, "pop":168, "initial_boost":"galaxy" "_version_":1435669695565922305, "score":1.8908522} this inital_boost i

Re: too many boolean clauses

2013-05-22 Thread Shawn Heisey
On 5/22/2013 6:43 AM, adm1n wrote: > SyntaxError: Cannot parse > 'name:Bbbbm' The subject mentions one error, the message says another. If you are getting too many boolean clauses, then you need to increase the maxBooleanC

Re: search filter

2013-05-22 Thread Rafał Kuć
Hello! You can try sending a filter like this fq=Salary:[5+TO+10]+OR+Salary:0 It should work -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch > Dear All > Can I write a search filter for a field having a value in a range or a > specific value.
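For reference, the `+` signs in this filter are just URL-encoded spaces. A small sketch of building the same request parameters with the Python standard library (the `q=*:*` main query is an assumption added for illustration):

```python
from urllib.parse import parse_qs, urlencode

# The filter as it would be typed in the Solr query syntax.
raw_filter = "Salary:[5 TO 10] OR Salary:0"

# urlencode turns spaces into '+' and escapes brackets and colons.
query_string = urlencode({"q": "*:*", "fq": raw_filter})

# Decoding the query string gives back the original filter text.
decoded = parse_qs(query_string)
```

If a client library already encodes parameters, the filter should be passed with plain spaces; pre-encoding it with `+` would otherwise send literal plus signs.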

Re: Solr Faceting doesn't return values.

2013-05-22 Thread samabhiK
Ok my bad. I do have a default field defined in the /select handler in the config file. explicit 10 sa_property_id But then how do I change my query now? -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Faceting-doesn-t-return-values-tp4065276p

Re: Crawl Anywhere -

2013-05-22 Thread Dominique Bejean
Hi, Crawl-Anywhere includes a customizable document processing pipeline. Crawl-Anywhere can also cache original crawled pages and documents in a MongoDB database. Best regards. Dominique On 11/02/13 06:16, SivaKarthik wrote: Dear Erick, Thanks for your reply.. yes..Nutch can meet m

search filter

2013-05-22 Thread Kamal Palei
Dear All Can I write a search filter for a field having a value in a range or a specific value. Say if I want to have a filter like 1. Select profiles with salary 5 to 10 or Salary 0. So I expect profiles having salary either 0 , 5, 6, 7, 8, 9, 10 etc. It should be possible, can somebody help m

Re: [ANNOUNCE] Web Crawler

2013-05-22 Thread Dominique Bejean
Hi, I did see this message (again). Please use the new dedicated Crawl-Anywhere forum for your next questions. https://groups.google.com/forum/#!forum/crawl-anywhere Did you solve your problem? Thank you Dominique On 29/01/13 09:28, SivaKarthik wrote: Hi, i resolved the issue "Acc

Re: Solr Faceting doesn't return values.

2013-05-22 Thread samabhiK
Ok after I added debug=all to the query, I get: { "responseHeader":{ "status":0, "QTime":11, "params":{ "facet":"true", "indent":"true", "q":"mm_state_code", "debug":"all", "facet.field":"sa_site_city", "wt":"json"}}, "response":{"numFound":0,"st

Re: [ANNOUNCE] Web Crawler

2013-05-22 Thread Dominique Bejean
Hi, Crawl-Anywhere is now open-source - https://github.com/bejean/crawl-anywhere Best regards. On 02/03/11 10:02, findbestopensource wrote: Hello Dominique Bejean, Good job. We identified almost 8 open source web crawlers http://www.findbestopensource.com/tagged/webcrawler I don't kno

Re: Crawl Anywhere -

2013-05-22 Thread Dominique Bejean
Hi, I didn't see this question. Yes, I confirm Crawl-Anywhere can crawl in distributed environment. If you have several huge web sites to crawl, you can dispatch crawling across several crawler engines. However, one single web site can only be crawled by one crawler engine at a time. This lim

Re: Sorting solr search results using multiple fields

2013-05-22 Thread Gora Mohanty
On 22 May 2013 18:26, Rohan Thakur wrote: > hi all > > I wanted to know is there a way I can sort the my documents based on 3 > fields > I have fields like pop(which is basically frequency of the term searched > history) and autosug(auto suggested words) and initial_boost(copy field of > autosug s

Re: setting the collection in cloudsolrserver without using setdefaultcollection.

2013-05-22 Thread Shawn Heisey
On 5/21/2013 11:20 PM, mike st. john wrote: > Is there any way to set the collection without passing setDefaultCollection > in cloudsolrserver? > > I'm using cloudsolrserver with spring, and would like to autowire it. It's a query parameter: http://wiki.apache.org/solr/SolrCloud#Distributed_Requ

Sorting solr search results using multiple fields

2013-05-22 Thread Rohan Thakur
hi all I wanted to know is there a way I can sort the my documents based on 3 fields I have fields like pop(which is basically frequency of the term searched history) and autosug(auto suggested words) and initial_boost(copy field of autosug such that only match with initial term match having whole

too many boolean clauses

2013-05-22 Thread adm1n
I got: SyntaxError: Cannot parse 'name:Bbbbm' Using solr 4.21 name field type def:

Re: Solr Faceting doesn't return values.

2013-05-22 Thread Erick Erickson
Probably you're not querying the field you think you are. Try adding &debug=all to the URL and I think you'll see something like default_search_field:mm_state_code Which means you're searching for the literal phrase "mm_state_code" in your default search field (defined in solrconfig.xml for the h

Re: synonym indexing in solr

2013-05-22 Thread Erick Erickson
Look at the "text_general" type (solr 4.x) in the example schema.xml. That has an example of including synonyms at index time (although it is commented out, you can get the idea). So to substitute synonyms at index time, just uncomment the index time analyzer mention of synonyms and comment out

Re: solr starting time takes too long

2013-05-22 Thread Erick Erickson
Zhang: In 3.6, there's really no choice except to load all the cores on startup. 10 minutes still seems excessive, do you perhaps have a heavy-weight firstSearcher query? Yes, soft commits are 4.x only, so that's not your problem. There's a shareSchema option that tries to only load 1 copy of th

Re: ShingleFilterFactory

2013-05-22 Thread Erick Erickson
Seems to me like shingles will work for you. To your questions: 1> not really, phrases are just how you get the single token through the parser. Escaping the spaces would work, as term1\ term2 2> This is just a standard negation, i.e. q=-field:term1\ term2 3> This works if you specify minShingleSiz

Solr Faceting doesn't return values.

2013-05-22 Thread samabhiK
Hello, I have a field defined in my schema.xml like so: string is a type : When I run the query for faceting data by the city: http://XX.XX.XX.XX/solr/collection1/select?q=mm_state_code&wt=json&indent=true&facet=true&facet.field=sa_site_city I get empty result like so: { "responseHeade

Re: Solr 4.0 war startup issue - apache-solr-core.jar Vs solr-core

2013-05-22 Thread Sandeep Mestry
Thanks Erick for your suggestion. Turns out I won't be going that route after all as the highlighter component is quite complicated - to follow and to override - and not much time left in hand so did it the manual (dirty) way. Best Regards, Sandeep On 22 May 2013 12:21, Erick Erickson wrote:

Re: [Solr 4.2.1] LotsOfCores - Can't query cores with loadOnStartup="true" and transient="true"

2013-05-22 Thread Erick Erickson
Thanks, I saw that and assigned it to myself. On the original form when you create the issue, there's an "assign to" entry field, but I don't know whether you see the same thing Best Erick On Wed, May 22, 2013 at 5:36 AM, Lyuba Romanchuk wrote: > Hi Erick, > > I opened an issue in JIRA: SOLR

Re: Upgrade Solr index from 4.0 to 4.2.1

2013-05-22 Thread Erick Erickson
LUCENE_40 since your original index was built with 4.0. As for the other, I'll defer to people who actually know what they're talking about. Best Erick On Wed, May 22, 2013 at 5:19 AM, Elran Dvir wrote: > My index is originally of version 4.0. My methods failed with this > configuration. >

Re: Solr 4.0 war startup issue - apache-solr-core.jar Vs solr-core

2013-05-22 Thread Erick Erickson
Sandeep: You need to be a little careful here, I second Shawn's comment that you are mixing versions. You say you are using solr 4.0. But the jar that ships with that is apache-solr-core-4.0.0.jar. Then you talk about using solr-core, which is called solr-core-4.1.jar. Maven is not officially sup

Re: Regular expression in solr

2013-05-22 Thread Oussama Jilal
There is no ^ or $ in the solr regex since the regular expression will match tokens (not the complete indexed text). So the results you get will basically depend on your way of indexing. If you use the regex on a tokenized field and that is not what you want, try to use a copy field which is not t

RE: synonym indexing in solr

2013-05-22 Thread Sagar Chaturvedi
Thanks. Already used it. Quite easy to setup. But it tells how to setup Synonym search. I am asking about synonym indexing. -Original Message- From: Oussama Jilal [mailto:jilal.ouss...@gmail.com] Sent: Wednesday, May 22, 2013 4:18 PM To: solr-user@lucene.apache.org Subject: Re: synonym i

Re: Boosting Documents

2013-05-22 Thread Oussama Jilal
Ok thank you for your help, I think I will have to treat the problem in another way even if it will complicate things for me. thanks again On 05/22/2013 11:51 AM, Sandeep Mestry wrote: I'm running out of options now, can't really see the issue you're facing unless the debug analysis is posted.

Re: Regular expression in solr

2013-05-22 Thread Stéphane Habett Roux
I just can't get the $ endpoint to work. > I am not sure but I heard it works with the Java Regex engine (a little > obvious if it is true ...), so any Java regex tutorial would help you. > > On 05/22/2013 11:42 AM, Sagar Chaturvedi wrote: >> Yes, it works for me too. But many times result is no

Re: Boosting Documents

2013-05-22 Thread Sandeep Mestry
I'm running out of options now, can't really see the issue you're facing unless the debug analysis is posted. I think a thorough debugging is required from both application and solr level. If you want a customize scoring from Solr, you can also consider overriding DefaultSimilarity implementation

Re: [custom data structure] aligned dynamic fields

2013-05-22 Thread Dmitry Kan
Jack, Thanks for your response. 1. Flattening could be an option, although our scale and required functionality (runtime non DocValues backed facets) is beyond what solr3 can handle (billions of docs). We have flattened the meta data at the expense of "over"-generating solr documents. But to solv

Re: synonym indexing in solr

2013-05-22 Thread Oussama Jilal
Hello, I think that what is written about the SynonymFilterFactory in the wiki is well explained, so I will direct you there : http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory On 05/22/2013 11:44 AM, Sagar Chaturvedi wrote: Hi, Since synonym searching ha

Re: Regular expression in solr

2013-05-22 Thread Oussama Jilal
I am not sure but I heard it works with the Java Regex engine (a little obvious if it is true ...), so any Java regex tutorial would help you. On 05/22/2013 11:42 AM, Sagar Chaturvedi wrote: Yes, it works for me too. But many times result is not as expected. Is there some guide on use of regex
