date:20110616

about the SolrServer server = new CommonsHttpSolrServer(URL);

2011-06-16 Thread Ranveer

Dear all, I am using SolrServer server = new CommonsHttpSolrServer(URL); throughout the class. How can I improve the connection handling? In my case, do I need to close the server after fetching the result, or will CommonsHttpSolrServer(URL); maintain the connection at its end? There is another way: I can make

Re: SOlR -- Out of Memory exception

2011-06-16 Thread pravesh

If you are sending whole CSV in a single HTTP request using curl, why not consider sending it in smaller chunks?
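
A minimal sketch of the chunking idea suggested above, in Python. The helper name `chunk_csv_lines` and the chunk size are illustrative assumptions, not part of Solr or curl. It assumes the first line is a header row and that no quoted field contains an embedded newline.

```python
from typing import Iterator, List


def chunk_csv_lines(lines: List[str], rows_per_chunk: int) -> Iterator[List[str]]:
    """Yield lists of CSV lines, each one starting with the header row.

    Hypothetical helper: every chunk repeats the header so that it can be
    uploaded as an independent, smaller request.
    """
    if rows_per_chunk < 1:
        raise ValueError("rows_per_chunk must be at least 1")
    if not lines:
        return
    header, rows = lines[0], lines[1:]
    for start in range(0, len(rows), rows_per_chunk):
        # Slice out the next block of data rows and prepend the header.
        yield [header] + rows[start:start + rows_per_chunk]
```

Each yielded chunk could then be joined with newlines and sent as its own HTTP request, so that no single request carries the full 20 million records.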

Re: Document Level Security (SOLR-1872 ,SOLR,SOLR-1834)

2011-06-16 Thread Sujatha Arun

Alexey, do you mean that we keep the current index as it is and have a separate core which has only the user-id, product-id relation and, while querying, do a join between the two cores based on the user-id? This would require us to index/delete the product as and when the user subscription for

omitTermFreqAndPositions in a TextField fieldType

2011-06-16 Thread Michael Ryan

Is it possible to use omitTermFreqAndPositions="true" in a <fieldType> declaration that uses class="solr.TextField"? I've tried doing this and it does not seem to work (i.e., the prx file size does not change). Using it in a <field> declaration does work, but I'd rather set it in the <fieldType> so I don't have to repeat i

Re: SOlR -- Out of Memory exception

2011-06-16 Thread jyn7

Yes Eric, after changing the lock type to Single, I got an OOM after loading 5.5 million records. I am using the curl command to upload the csv.

Re: SOlR -- Out of Memory exception

2011-06-16 Thread Erick Erickson

H, are you still getting your OOM after 7M records? Or some larger number? And how are you using the CSV uploader? Best Erick On Thu, Jun 16, 2011 at 9:14 PM, jyn7 wrote: > We just started using SOLR. I am trying to load a single file with 20 million > records into SOLR using the CSV uploade

Re: Document Scoring

2011-06-16 Thread Erick Erickson

I really wouldn't go there, it sounds like there are endless opportunities for errors! How "real-time" is "real-time"? Could you fix this entirely by 1> adjusting expectations for, say, 5 minutes. 2> adjusting your commit (on the master) and poll (on the slave) appropriately? Best Erick On Thu,

Re: Boost Strangeness

2011-06-16 Thread Erick Erickson

Right, if you've only changed WordDelimiterFilterFactory in the query, then the tokens you're analyzing may be split up. Try running some of the terms through the admin/analysis page. Unless you have "catenateAll=1" in the definition, the whole term won't be there. It becomes a question of

Re: fieldCache problem OOM exception

2011-06-16 Thread Erick Erickson

Well, if my theory is right, you should be able to generate OOMs at will by sorting and faceting on all your fields in one query. But Lucene's cache should be garbage collected, can you take some memory snapshots during the week? It should hit a point and stay steady there. How much memory are yo

SOlR -- Out of Memory exception

2011-06-16 Thread jyn7

We just started using SOLR. I am trying to load a single file with 20 million records into SOLR using the CSV uploader. I keep getting an out-of-memory error after loading 7 million records. Here is the config: 1 6 I also encountered a LockObtainFailedException org.apache.

Re: Encoding of alternate fields in highlighting

2011-06-16 Thread Koji Sekiguchi

(11/06/17 0:15), Massimo Schiavon wrote: I have an index with various fields and I want to highlight query matchings on "title" and "content" fields. These fields could contain html tags so I've configured HtmlFormatter for highlighting. The problem is that if the query doesn't match the text o

sending results of function query to range query

2011-06-16 Thread Kevin Osborn

I am not sure if I can use function queries this way. I have a query like this"attributeX:[* TO ?]" in my DB. I replace the ? with input from the front end. Obviously, this works fine. However, what I really want to do is "attributeX:[* TO (3 * ?)]" Is there anyway to embed the results of a func
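
One client-side workaround, sketched in Python under the assumption that the multiplier is known when the query string is built: scale the front-end value first and substitute the product for the placeholder. `range_query` is a hypothetical helper, and this sketch does not answer whether a function query can be embedded in the range itself.

```python
def range_query(field: str, user_value: float, factor: float = 3) -> str:
    """Build an open-ended range clause whose upper bound is factor * user_value.

    Hypothetical helper: the multiplication happens on the client, so the
    server only ever sees a plain range query.
    """
    upper = factor * user_value
    # Print whole numbers without a trailing ".0" so the clause stays tidy.
    text = str(int(upper)) if float(upper).is_integer() else repr(float(upper))
    return f"{field}:[* TO {text}]"
```

With this approach the stored template can stay as "attributeX:[* TO ?]"; only the value substituted for the placeholder changes.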

Re: getting started

2011-06-16 Thread Sascha SZOTT

Hi Mari, it depends ... * How many records are stored in your MySQL databases? * How often will updates occur? * How many db records / index documents are changed per update? I would suggest to start with a single Solr core first. Thereby, you can concentrate on the basics and do not need to d

Re: getting started

2011-06-16 Thread Jonathan Rochkind

On 6/16/2011 4:41 PM, Mari Masuda wrote: One reservation I have is that eventually we would like to be able to type in "Iraq" and find records across all of the collections at once instead of having to search each collection separately. Although I don't know anything about it at this stage, I

Re: It's not possible to decide at run-time which similarity class to use, right?

2011-06-16 Thread Robert Muir

On Thu, Jun 16, 2011 at 3:23 PM, Gabriele Kahlout wrote: >> I'm trying to assess the impact of coord (search-time) on Qtime. In one > implementation coord returns 1, while in another it's actually computed. On query time? coord should be really cheap (unless your impl does something like calcula

getting started

2011-06-16 Thread Mari Masuda

Hello, I am new to Solr and am in the beginning planning stage of a large project and could use some advice so as not to make a huge design blunder that I will regret down the road. Currently I have about 10 MySQL databases that store information about different archival collections. For exam

Re: Strange behavior

2011-06-16 Thread Denis Kuzmenok

Of course, i did stop the solr before copying the index. Deleting index and reindexing on production server did solve an issue. Strange, but working.. > Have you stopped Solr before manually copying the data? This way you > can be sure that index is the same and you didn't have any new docs

RE: How to index correctly a text save with tinyMCE

2011-06-16 Thread Steven A Rowe

Hi Ariel, As Shawn says, char filters come before tokenizers. You need to use a <charFilter> tag instead of a <filter> tag. I've updated the HTMLStripCharFilter documentation on the Solr wiki to include this information: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.HTMLStripCharFilterFactory St

Re: It's not possible to decide at run-time which similarity class to use, right?

2011-06-16 Thread Gabriele Kahlout

On Thu, Jun 16, 2011 at 9:14 PM, Erik Hatcher wrote: > No, there's not a way to control Similarity on a per-request basis. > > Some factors from Similarity are computed at index-time though. > You got me on this. > > What factors are you trying to tweak that way and why? Maybe doing > boosting

Re: Performance loss - querying more than 64 cores (randomly)

2011-06-16 Thread Andrzej Bialecki

On 6/16/11 5:31 PM, Mark Schoy wrote: Thanks for your answers. Andrzej was right with his assumption. Solr only needs about 9GB memory but the system needs the rest of it for disc IO: 64 Cores: 64*100MB index size = 6,4GB + 9 GB Solr Cache + about 600 MB OS = 16GB Conclusion: My system can ex

Re: It's not possible to decide at run-time which similarity class to use, right?

2011-06-16 Thread Erik Hatcher

No, there's not a way to control Similarity on a per-request basis. Some factors from Similarity are computed at index-time though. What factors are you trying to tweak that way and why? Maybe doing boosting using some other mechanism (boosting functions, boosting clauses) would be a better

RE: HTMLStripTransformer will remove the content in XML??

2011-06-16 Thread Chris Hostetter

FYI: There's a new patch specifically for dealing with xml tags and entities that handles the CDATA case... https://issues.apache.org/jira/browse/SOLR-2597 : Date: Fri, 27 May 2011 17:01:26 +0800 : From: Ellery Leung : Reply-To: solr-user@lucene.apache.org, elleryle...@be-o.com : To: solr-user@l

Re: Minimum Should Match + External Field + Function Query with boost

2011-06-16 Thread Chris Hostetter

: Seem to have a solution but I am still trying to figure out how/why it works. : : Addition of "defType=edismax" in the boost query seem to honor "MM" and : correct boosting based on external file source. You didn't post enough details in your original question to be 100% certain (would have

It's not possible to decide at run-time which similarity class to use, right?

2011-06-16 Thread Gabriele Kahlout

Hello, I'm testing out different Similarity implementations, and to do that I restart Solr each time I want to try a different similarity class: I change the class attribute of the similarity element in schema.xml. Besides running multiple cores, each with its own schema, is there a way to tell the

Re: Updating only one indexed field for all documents quickly.

2011-06-16 Thread Alexey Serba

>> with the integer field. If you just want to influence the >> score, then just plain external field fields should work for >> you. > > Is this an appropriate solution, give our use case? > Yes, check out ExternalFileField * http://search.lucidimagination.com/search/document/CDRG_ch04_4.4.4 * ht

RE: getFieldValue always returns an ArrayList?

2011-06-16 Thread Simon, Richard T

Ah! That was the problem. The version was 1.0. I'll change it to 1.2. Thanks! -Rich -Original Message- From: Chris Hostetter [mailto:hossman_luc...@fucit.org] Sent: Thursday, June 16, 2011 2:33 PM To: Simon, Richard T Cc: solr-user@lucene.apache.org Subject: RE: getFieldValue always retu

RE: getFieldValue always returns an ArrayList?

2011-06-16 Thread Chris Hostetter

: We haven't changed Solr versions. We've been using 3.1.0 all along. but that's not what i'm talking about. I'm talking about the "schema version" ... a specific property declared in your schema.xml file. did you check it? (even when people start with Solr X, they sometimes are using schema.

Re: Document Level Security (SOLR-1872 ,SOLR,SOLR-1834)

2011-06-16 Thread Sujatha Arun

Peter, thanks for the clarification. The reason I specifically asked is that we have many search instances (200+) on a single JVM. Each of these instances could have users, and each user can subscribe to products. Now, according to your suggestion, I need to maintain an in-memory list of all

Re: Strange behavior

2011-06-16 Thread Alexey Serba

Have you stopped Solr before manually copying the data? This way you can be sure that index is the same and you didn't have any new docs on the fly. 2011/6/14 Denis Kuzmenok : > What should i provide, OS is the same, environment is the same, solr > is completely copied, searches work, excep

RE: getFieldValue always returns an ArrayList?

2011-06-16 Thread Simon, Richard T

We haven't changed Solr versions. We've been using 3.1.0 all along. Plus, I have some code that runs during indexing and retrieves the fields from a SolrInputDocument, rather than a SolrDocument. That code gets Strings without any problem, and always has, even without saying multiValued="false".

RE: getFieldValue always returns an ArrayList?

2011-06-16 Thread Chris Hostetter

: and all of a sudden I get Strings. But, doesn't multivalued default to : false? In my schema, I originally did not set multivalued. I only put in : multivalued="false" after I experienced this issue. That's dependent on the version of Solr, and this is where the "version" property of the sch

Re: How to index correctly a text save with tinyMCE

2011-06-16 Thread Shawn Heisey

On 6/16/2011 11:12 AM, Ariel wrote: Thanks for your answer, I have just put the filter in my schema.xml but it doesn't work I am using solr 1.4 and my conf is: But it doesn't work in tomcat 6 logs I get this error: java.lang.ClassCastException: org.

Re: How to index correctly a text save with tinyMCE

2011-06-16 Thread Ariel

Thanks for your answer, I have just put the filter in my schema.xml but it doesn't work I am using solr 1.4 and my conf is: But it doesn't work in tomcat 6 logs I get this error: java.lang.ClassCastException: org.apache.solr.analysis.HTMLStripCharFilterFactor

RE: How to index correctly a text save with tinyMCE

2011-06-16 Thread Steven A Rowe

Hi Ariel, On 6/16/2011 at 10:45 AM, Ariel wrote: > I have the following problem: I am using the spanish analyzer to index > and query, but due to I am using tinymce some characters of the text are > changed codified in html, for example the text: "En españa ... " it is > changed to "En españa" so I

Re: Document Level Security (SOLR-1872 ,SOLR,SOLR-1834)

2011-06-16 Thread Alexey Serba

> So a search for a product once the user logs in and searches for only the > products that he has access to Will translate to something like this . ,the > product ids are obtained form the db for a particular user and can run > into n number. > > &fq=product_id(100 10001 ..n number) > > b
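
As a sketch of how such a filter could be assembled on the client, here is a short Python example. `product_filter` and `filter_param` are hypothetical helper names; the field name `product_id` comes from the message above, and the IDs are assumed to be plain integers that need no query escaping.

```python
from typing import Iterable
from urllib.parse import urlencode


def product_filter(product_ids: Iterable[int], field: str = "product_id") -> str:
    """Return a filter clause such as 'product_id:(100 10001)'.

    Hypothetical helper: IDs are coerced to int so that nothing other than
    digits (and a possible sign) can end up inside the clause.
    """
    ids = " ".join(str(int(pid)) for pid in product_ids)
    if not ids:
        raise ValueError("at least one product id is required")
    return f"{field}:({ids})"


def filter_param(product_ids: Iterable[int]) -> str:
    """URL-encode the clause as an fq request parameter."""
    return urlencode({"fq": product_filter(product_ids)})
```

The space-separated IDs rely on the query parser's default operator to combine them, so whether they act as alternatives depends on the schema or request settings. A very long ID list also makes the request itself large, which is the scaling concern raised in this thread.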

Document Scoring

2011-06-16 Thread zarni aung

Hi, I am designing my indexes to have 1 write-only master core, 2 read-only slave cores. That means the read-only cores will only have snapshots pulled from the master and will not have near real time changes. I was thinking about adding a hybrid read and write master core that will have the mos

Re: Performance loss - querying more than 64 cores (randomly)

2011-06-16 Thread Mark Schoy

Thanks for your answers. Andrzej was right with his assumption. Solr only needs about 9GB memory but the system needs the rest of it for disc IO: 64 Cores: 64*100MB index size = 6,4GB + 9 GB Solr Cache + about 600 MB OS = 16GB Conclusion: My system can exactly buffer the data of 64 Cores. Every
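
The arithmetic above can be checked with a short Python sketch. The function names are illustrative assumptions, and the figures use decimal units (1 GB = 1000 MB) as in the message.

```python
def total_memory_mb(cores: int, index_mb_per_core: int, solr_mb: int, os_mb: int) -> int:
    """Memory needed to keep every core's index in the OS disk cache."""
    return cores * index_mb_per_core + solr_mb + os_mb


def cores_that_fit(ram_mb: int, index_mb_per_core: int, solr_mb: int, os_mb: int) -> int:
    """How many per-core indexes fit into the RAM left after Solr and the OS."""
    return max(0, (ram_mb - solr_mb - os_mb) // index_mb_per_core)


# Figures from the message: 100 MB per core, 9 GB for Solr, about 600 MB for the OS.
needed = total_memory_mb(cores=64, index_mb_per_core=100, solr_mb=9000, os_mb=600)
fitting = cores_that_fit(ram_mb=16000, index_mb_per_core=100, solr_mb=9000, os_mb=600)
```

Under these assumptions, 64 cores need exactly the 16 GB available, which matches the point at which the benchmark results start to degrade.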

Re: Showing facet of first N docs

2011-06-16 Thread karsten-solr

Hi Tommaso, the FacetComponent works with the DocListAndSet#docSet. It should be easy to switch to DocListAndSet#docList (which contains all documents for the result list; default: top 10, but possibly 15-25 if start=15, rows=11). This would mean changing the source code. Instead of changing the sour

Re: Complex situation

2011-06-16 Thread Alexey Serba

Am I right that you are only interested in results / facets for current season? If it's so then you can index start/end dates as a separate number fields and build your search filters like this "fq=+start_date_month:[* TO 6] +start_date_day:[* TO 17] +end_date_month:[* TO 6] +end_date_day:[16 TO *]

Encoding of alternate fields in highlighting

2011-06-16 Thread Massimo Schiavon

I have an index with various fields and I want to highlight query matchings on "title" and "content" fields. These fields could contain html tags so I've configured HtmlFormatter for highlighting. The problem is that if the query doesn't match the text of the field, solr returns the value of con

Re: query routing with shards

2011-06-16 Thread Dmitry Kan

Hi Otis, I have fixed it by assigning the value to rb same as assigned to sreq: rb.shards = shards.toString().split(","); not tested that fully yet, but distributed faceting works at least on my pc _3 shards 1 router_ setup. Dmitry On Thu, Jun 16, 2011 at 4:53 PM, Dmitry Kan wrote: > Hi Ot

Re: How to index correctly a text save with tinyMCE

2011-06-16 Thread Ariel

I have the following problem: I am using the Spanish analyzer to index and query, but because I am using TinyMCE, some characters of the text are encoded as HTML; for example, the text "En españa ... " is changed to "En españa", so I need a way to re-encode that text to make queries correctl

Re: Showing facet of first N docs

2011-06-16 Thread Tommaso Teofili

Thanks Dmitry, but maybe I didn't explain correctly as I am not sure facet.offset is the right solution, I'd like not to page but to filter facets. I'll try to explain better with an example. Imagine I make a query and first 2 docs in results have both 'xyz' and 'abc' as values for field 'lemmas' w

RE: getFieldValue always returns an ArrayList?

2011-06-16 Thread Simon, Richard T

FYI: Using multiValued="false" for all string fields results in the following output: ### Field uri is an instance of String. ### Field entity_label is an instance of String. ### Field institution_uri is an instance of String. ### Field asserted_type_uri is an instance of String.

Re: Performance loss - querying more than 64 cores (randomly)

2011-06-16 Thread François Schiettecatte

I am assuming that you are running on Linux here; I have found atop to be very useful to see what is going on. http://freshmeat.net/projects/atop/ dstat is also very useful but needs a little more work to 'decode'. Obviously there is contention going on, you just need to figure out

RE: getFieldValue always returns an ArrayList?

2011-06-16 Thread Simon, Richard T

Interesting. You guessed right. I changed "multivalued" to "multiValued" and all of a sudden I get Strings. But, doesn't multivalued default to false? In my schema, I originally did not set multivalued. I only put in multivalued="false" after I experienced this issue. -Rich For the record, I

Re: Performance loss - querying more than 64 cores (randomly)

2011-06-16 Thread Andrzej Bialecki

On 6/16/11 3:22 PM, Mark Schoy wrote: Hi, I set up a Solr instance with 512 cores. Each core has 100k documents and 15 fields. Solr is running on a CPU with 4 cores (2.7Ghz) and 16GB RAM. Now I've done some benchmarks with JMeter. On each thread iteration JMeter queries another core at random.

Re: Showing facet of first N docs

2011-06-16 Thread Dmitry Kan

http://wiki.apache.org/solr/SimpleFacetParameters facet.offset This param indicates an offset into the list of constraints to allow paging. The default value is 0. This parameter can be specified on a per field basis. Dmitry On Thu, Jun 16, 2011 at 1:39 PM, Tommaso Teofili wrote: > Hi all,
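
A small Python sketch of how facet.offset and facet.limit combine to page through facet constraints. `facet_page_params` is a hypothetical helper; the field name `lemmas` is taken from elsewhere in this thread, and the page size is an arbitrary assumption.

```python
from urllib.parse import urlencode


def facet_page_params(field: str, page: int, page_size: int) -> str:
    """Build request parameters for one page of facet constraints.

    Hypothetical helper: page is zero-based, so page 0 starts at offset 0,
    which is also the default value of facet.offset.
    """
    if page < 0 or page_size < 1:
        raise ValueError("page must be >= 0 and page_size >= 1")
    return urlencode([
        ("facet", "true"),
        ("facet.field", field),
        ("facet.limit", page_size),
        ("facet.offset", page * page_size),
    ])
```

Note that this pages through the constraint list for the whole result set; it does not restrict faceting to the first N documents, which is the distinction drawn in the follow-up message.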

Re: Boost Strangeness

2011-06-16 Thread Judioo

Fascinating. Thank you so much Erik, I'm slowly beginning to understand. So I've discovered that by defining 'splitOnNumerics="0"' on the filter class 'solr.WordDelimiterFilterFactory' (for ONLY the query analyzer) I can get *closer* to my required goal! Now something else odd is occurring.

Re: query routing with shards

2011-06-16 Thread Dmitry Kan

Hi Otis, I followed your recommendation and decided to implement the SearchComponent::modifyRequest(ResponseBuilder rb, SearchComponent who, ShardRequest sreq) method, where the query routing happens. So far it is working OK for the non-facet search, this is good news. The bad news is that it fail

Performance loss - querying more than 64 cores (randomly)

2011-06-16 Thread Mark Schoy

Hi, I set up a Solr instance with 512 cores. Each core has 100k documents and 15 fields. Solr is running on a CPU with 4 cores (2.7Ghz) and 16GB RAM. Now I've done some benchmarks with JMeter. On each thread iteration JMeter queries another core at random. Here are the results (Duration: each w

Complex situation

2011-06-16 Thread roySolr

Hello, First I will try to explain the situation: I have some companies with opening hours. Some companies have multiple seasons with different opening hours. I will show some example data: Companyid Startdate(d-m) Enddate(d-m) Openinghours_end 101-01

Re: Mahout & Solr

2011-06-16 Thread Adam Estrada

You're right...It would be nice to be able to see the cluster results coming from Solr though... Adam On Thu, Jun 16, 2011 at 3:21 AM, Andrew Clegg wrote: > Well, it does have the ability to pull TermVectors from an index: > > > https://cwiki.apache.org/MAHOUT/creating-vectors-from-text.html#Cr

Re: DIH abort doesn't close datasources

2011-06-16 Thread Shalin Shekhar Mangar

On Thu, Jun 16, 2011 at 3:46 PM, Frank Wesemann wrote: > Shalin, > thank you for the answer. > I indeed didn't look into clearCache(). > I thought it would just do that ( clear caches ). :) > Yeah, it is not the most aptly named method :) Thanks for reviewing the code though! -- Regards, Shali

RE: Field Collapsing and Grouping in Solr 3.2

2011-06-16 Thread Sergio Martín

Mike, thanks a lot for your quick and precise answer!

Re: Field Collapsing and Grouping in Solr 3.2

2011-06-16 Thread Michael McCandless

Alas, no, not yet.. grouping/field collapse has had a long history with Solr. There were many iterations on SOLR-236, but that impl was never committed. Instead, SOLR-1682 was committed, but committed only to trunk (never backported to 3.x despite requests). Then, a new grouping module was facto

Showing facet of first N docs

2011-06-16 Thread Tommaso Teofili

Hi all, Do you know if it is possible to show the facets for a particular field related only to the first N docs of the total number of results? It seems facet.limit doesn't help with it as it defines a window in the facet constraints returned. Thanks in advance, Tommaso

Re: DIH abort doesn't close datasources

2011-06-16 Thread Frank Wesemann

Shalin, thank you for the answer. I indeed didn't look into clearCache(). I thought it would just do that ( clear caches ). :) Shalin Shekhar Mangar schrieb: The abort command just sets an atomic boolean flag which is checked frequently by the import threads to see if they should stop. If you loo

Field Collapsing and Grouping in Solr 3.2

2011-06-16 Thread Sergio Martín

Hello. Does anybody know if Field Collapsing and Grouping is available in Solr 3.2? I mean directly available, not as a patch. I have read conflicting statements about it... Thanks a lot!

RE: Multiple indexes

2011-06-16 Thread Kai Gülzau

Are there any plans to support a kind of federated search in a future solr version? I think there are reasons to use separate indexes for each document type but do combined searches on these indexes (for example if you need separate TFs for each document type). I am aware of http://wiki.apache.or

Re: DIH abort doesn't close datasources

2011-06-16 Thread Shalin Shekhar Mangar

On Wed, Jun 15, 2011 at 8:10 PM, Frank Wesemann wrote: > Hi, > I just came across this: > If I abort an import via /dataimport/?command=abort the connections to the > (in my case) database stay open. > Shouldn't DocBuilder#rollback() call something like cleanup() which in turn > tries to close Ent

Re: Copying few field using copyField to non multiValued field

2011-06-16 Thread Michael Kuhlmann

Hi Omri, there are two limitations: 1. You can't sort on a multiValued field. (Anyway, on which of the copied fields would you want to sort first?) 2. You can't make the multiValued field the unique key. Both are no real limitations: 1. Better sort on at_country, at_state, at_city instead. 2. Sim

Re: fieldCache problem OOM exception

2011-06-16 Thread Bernd Fehling

Hi Erik, yes I'm sorting and faceting. 1) Fields for sorting: sort=f_dccreator_sort, sort=f_dctitle, sort=f_dcyear The parameter "facet.sort=" is empty, only using parameter "sort=". 2) Fields for faceting: f_dcperson, f_dcsubject, f_dcyear, f_dccollection, f_dclang, f_dctypenorm, f_d

63 matches
