Re: how to deal with virtual collection in solr?

2010-09-03 Thread Jan Høydahl / Cominvent
You did not supply your actual query. Try adding a q=foobar parameter. Also, you don't need an & before shards, since you already have the ?. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Training in Europe - www.solrtraining.com On 1. sep. 2010, at 20.14, Ma, Xiaohui

Re: In Need of Direction; Phrase-Context Tracking / Injection (Child Indexes) / Dismissal

2010-09-03 Thread Jan Høydahl / Cominvent
Hi, This smells like a job for Hadoop and perhaps Mahout, unless your use cases are totally ad-hoc research. After Nutch has fetched the sites, kick off some MapReduce jobs for each case you wish to study: 1. Extract phrases/contexts 2. For each context, perform detection and whitelisting 3. In

Re: shingles work in analyzer but not real data

2010-09-03 Thread Jeff Rose
Thanks Steven and Jonathan, we got it working by using a combination of quoting and the PositionFilterFactory, as shown below. The documentation for the position filter doesn't make much sense without understanding more about how positioning of tokens is taken into account, but it appears to

Re: Hardware Specs Question

2010-09-03 Thread Dennis Gearon
If you really want to see performance, try external DRAM disks. Whew! 800X faster than a disk. Dennis Gearon Signature Warning EARTH has a Right To Life, otherwise we all die. Read 'Hot, Flat, and Crowded' Laugh at http://www.yert.com/film.php --- On Thu, 9/2/10, Shawn

Index time boosting

2010-09-03 Thread phoey
Hi there, I'm having some issues with the relevancy of some results. I have 5 fields with varying boost values, all copied into a copyField named text, which is the field that gets searched: <field name="i_title" type="alphaOnlySort" indexed="true" stored="true" omitNorms="false"/> <field name="i_authors"

Re: shingles work in analyzer but not real data

2010-09-03 Thread Jeff Rose
I don't have any fancy links, but from the documentation shingles make pretty good sense. You typically tokenize an input string so that "the best apple pie" becomes "the" "best" "apple" "pie", so that each term can then be filtered to remove stop words, take off plurals and suffixes like "ing", etc. The
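
For readers following the tokenization description above, here is a minimal sketch of what shingling does to that same token stream. It is plain Java and does not use Lucene or Solr classes; the class and method names (`ShingleSketch`, `shingles`) are made up for illustration, and the whitespace split is only a stand-in for a real tokenizer. As far as I know, Solr's ShingleFilterFactory also emits the single tokens by default, which this sketch leaves out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene/Solr: builds word n-grams ("shingles").
final class ShingleSketch {

    // Split on whitespace, then join every run of `size` adjacent tokens.
    static List<String> shingles(String text, int size) {
        String[] tokens = text.trim().split("\\s+");
        List<String> out = new ArrayList<>();
        for (int i = 0; i + size <= tokens.length; i++) {
            out.add(String.join(" ", Arrays.copyOfRange(tokens, i, i + size)));
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [the best, best apple, apple pie]
        System.out.println(shingles("the best apple pie", 2));
    }
}
```

The point of the sketch is only that each shingle is itself a single indexed term, so a match on "apple pie" depends on the query side producing the same combined token.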

Re: Hardware Specs Question

2010-09-03 Thread Toke Eskildsen
On Fri, 2010-09-03 at 03:45 +0200, Shawn Heisey wrote: On 9/2/2010 2:54 AM, Toke Eskildsen wrote: We've done a fair amount of experimentation in this area (1997-era SSDs vs. two 15.000 RPM harddisks in RAID 1 vs. two 10.000 RPM harddisks in RAID 0). The harddisk setups never stood a chance

spellcheck distance measure algorithms error ?

2010-09-03 Thread Xavier Schepler
Hi, When I take the two letters from the middle of a word and put the first in place of the second and the second in place of the first, e.g. jospin -> jopsin, I don't get any suggestion from the spellchecker component. I tried the default algorithm and the Jaro Winkler Distance, with a
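
As background for the swap described above, here is a small sketch of how a plain Levenshtein edit distance scores an adjacent-letter transposition. It is a textbook implementation written for this note, not Lucene's spellchecker code, and it is not meant as a diagnosis of the problem in this thread; the class name `TranspositionSketch` is invented.

```java
// Textbook Levenshtein distance (insert, delete, substitute), for illustration only.
final class TranspositionSketch {

    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] cur = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            cur[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                cur[j] = Math.min(Math.min(cur[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; // reuse the two rows
            prev = cur;
            cur = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        // A swap of two adjacent letters costs two edits here, not one.
        System.out.println(levenshtein("jospin", "jopsin")); // prints 2
    }
}
```

So under this measure a transposed pair looks as far away as two unrelated typos, which is worth keeping in mind when comparing distance measures and accuracy thresholds.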

SolrJ and Multi Core Set up

2010-09-03 Thread Shaun Campbell
I'm writing a client using SolrJ and was wondering how to handle a multi core installation. We want to use the facility to rebuild the index on one of the cores at a scheduled time and then use the SWAP facility to switch the live core to the newly rebuilt core. I think I can do the SWAP with

Re: Purpose of SolrDocument.java

2010-09-03 Thread stockii
Aaah, okay. So SolrDocument is never used in a normal search? It's only for other Solr plugins? -- View this message in context: http://lucene.472066.n3.nabble.com/Purpose-of-SolrDocument-java-tp1408443p1411276.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Hardware Specs Question

2010-09-03 Thread Toke Eskildsen
On Fri, 2010-09-03 at 11:07 +0200, Dennis Gearon wrote: If you really want to see performance, try external DRAM disks. Whew! 800X faster than a disk. As sexy as they are, the DRAM drives do not buy much extra performance. At least not at the search stage. For searching, SSDs are not

Re: SolrJ and Multi Core Set up

2010-09-03 Thread Chantal Ackermann
Hi Shaun, you create the SolrServer using multicore by just adding the core to the URL. You don't need to add anything with SolrQuery. URL url = new URL(new URL(solrBaseUrl), coreName); CommonsHttpSolrServer server = new CommonsHttpSolrServer(url); Concerning the default core thing - I wouldn't
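
One detail worth checking with the two-argument URL construction quoted above is the trailing slash on the base URL. The sketch below uses only java.net.URI, whose resolve method follows the same relative-reference rules; the host, port, and core name are placeholders, and `CoreUrlSketch` / `coreUrl` are names invented for this note rather than anything from SolrJ.

```java
import java.net.URI;

// Illustration of relative URL resolution when appending a core name.
final class CoreUrlSketch {

    // Ensure the base ends with "/" so the core name is appended, not substituted.
    static String coreUrl(String solrBaseUrl, String coreName) {
        String base = solrBaseUrl.endsWith("/") ? solrBaseUrl : solrBaseUrl + "/";
        return URI.create(base).resolve(coreName).toString();
    }

    public static void main(String[] args) {
        // Without a trailing slash, the last path segment is replaced:
        // prints http://localhost:8983/core0
        System.out.println(URI.create("http://localhost:8983/solr").resolve("core0"));

        // With the slash added, the core name is appended:
        // prints http://localhost:8983/solr/core0
        System.out.println(coreUrl("http://localhost:8983/solr", "core0"));
    }
}
```

If solrBaseUrl in your configuration might lack the trailing slash, normalizing it first (as coreUrl does here) avoids silently pointing the client at the wrong path.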

Re: stream.url

2010-09-03 Thread satya swaroop
Hi all, I am unable to index files on a remote system that have escaped characters in their file names. I think there is a problem in Solr with indexing remote files whose names contain escaped characters... Has anybody tried to index files on a remote system that contain the

Re: SolrJ and Multi Core Set up

2010-09-03 Thread Shaun Campbell
Thanks Chantal I hadn't spotted that that's a big help. Thank you. Shaun On 3 September 2010 12:31, Chantal Ackermann chantal.ackerm...@btelligent.de wrote: Hi Shaun, you create the SolrServer using multicore by just adding the core to the URL. You don't need to add anything with

Re: Index time boosting

2010-09-03 Thread Erick Erickson
Have you tried running the queries through with debugQuery=true? It may well be that, for certain documents, the lower-boosted fields are still overwhelming the contribution to scoring from the higher-boosted fields for the documents in question. The problem is that index-time boosting is fairly

Index with ItalianStemmer

2010-09-03 Thread Tommaso Teofili
Hi all, I am experiencing a strange behavior while indexing Italian text (an indexed, not stored, text field) when stemming with the Italian language: <fieldType name="text" class="solr.TextField" positionIncrementGap="100"> <analyzer type="index"> <tokenizer class="solr.WhitespaceTokenizerFactory"/>

Re: Purpose of SolrDocument.java

2010-09-03 Thread Peter Karich
Aaah, okay. So SolrDocument is never used in a normal search? It's only for other Solr plugins? SolrDocument is under org.apache.solr.common, which is for the solr-solrj.jar and not available for the solr-core.jar. See e.g.:

Re: spellcheck distance measure algorithms error ?

2010-09-03 Thread Grant Ingersoll
On Sep 3, 2010, at 6:02 AM, Xavier Schepler wrote: Hi, When I take the two letters from the middle of a word and put the first in place of the second and the second in place of the first, e.g. jospin -> jopsin, I don't get any suggestion from the spellchecker component. I tried the

Re: how/why would I use LiteralValueSource and can I create a custom string function?

2010-09-03 Thread Grant Ingersoll
On Sep 2, 2010, at 7:40 PM, Gerald wrote: Thanks Grant Am looking forward to the day when I can create a SOLR URL that looks something like this: http://mysolrserver:8080/solr/select?q=*:* AND mycustomstrfunction(mysolrstrfield):'somestringvalue' AND

Re: Auto Suggest

2010-09-03 Thread Jason Rutherglen
Analysis returns app mou. On Thu, Sep 2, 2010 at 6:12 PM, Lance Norskog goks...@gmail.com wrote: What does analysis.jsp show? On Thu, Sep 2, 2010 at 5:53 AM, Jason Rutherglen jason.rutherg...@gmail.com wrote: I'm having a different issue with the EdgeNGram technique described here:

RE: shingles work in analyzer but not real data

2010-09-03 Thread Steven A Rowe
Hi Dennis, I took a stab at answering this question in the following java-user mailing list post: http://www.lucidimagination.com/search/document/6cb7b54cce6872b3/lucene_indexes Steve -Original Message- From: Dennis Gearon [mailto:gear...@sbcglobal.net] Sent: Friday, September 03,

Re: spellcheck distance measure algorithms error ?

2010-09-03 Thread Grant Ingersoll
On Sep 3, 2010, at 9:14 AM, Xavier Schepler wrote: On 03/09/2010 14:47, Grant Ingersoll wrote: On Sep 3, 2010, at 6:02 AM, Xavier Schepler wrote: no, jopsin isn't in the index. I tried this with other words and I had the same error. Thx for your reply. And what happens if you drop the

Re: Index with ItalianStemmer

2010-09-03 Thread Robert Muir
On Fri, Sep 3, 2010 at 8:04 AM, Tommaso Teofili tommaso.teof...@gmail.com wrote: Does anyone know what could be the root cause or if I am missing something? Thanks in advance for any help, Tommaso I didn't see a definition of your 'query' analyzer, only 'index'. Can you ensure you specify

RE: Does SolrNet support indexing of Database tables and XML files

2010-09-03 Thread Michael Griffiths
First of all, I suggest you ask in the SolrNET group: http://groups.google.com/group/solrnet Second, Solr supports both database tables and XML files through the Data Import Handler (DIH). You may wish to configure indexing in Solr, then query via SolrNET. -Original Message- From:

Re: spellcheck distance measure algorithms error ?

2010-09-03 Thread Xavier Schepler
On 03/09/2010 15:31, Grant Ingersoll wrote: On Sep 3, 2010, at 9:14 AM, Xavier Schepler wrote: On 03/09/2010 14:47, Grant Ingersoll wrote: On Sep 3, 2010, at 6:02 AM, Xavier Schepler wrote: no, jopsin isn't in the index. I tried this with other words and I had the same

Re: Does SolrNet support indexing of Database tables and XML files

2010-09-03 Thread kenf_nc
Alok, I noticed you also posted to the SolrNet forum, and that's a better place for this question. But basically, SolrNet is a wrapper around Solr functionality. It lets you build your Solr interactions (Queries, Stats, Facets, etc) and Inserts/Deletes using .Net objects. The reading of a data

Re: Index time boosting

2010-09-03 Thread phoey
thanks for replying erick, We are currently using dismax but for this particular client we have coupled their implementation to standard parser and will be difficult to switch, although i might just have to bite the bullet for this. which you can do without dismax BTW, although you have to

Re: Auto Suggest

2010-09-03 Thread Jason Rutherglen
To clarify, the query analyzer returns that. Variations such as apple mou also do not return anything. Maybe Jay can comment and then we can amend the article? On Fri, Sep 3, 2010 at 6:12 AM, Jason Rutherglen jason.rutherg...@gmail.com wrote: Analysis returns app mou. On Thu, Sep 2, 2010 at

Re: Localsolr with Dismax = workaround using spatial solr

2010-09-03 Thread Luke Tebbs
I finally managed to get spatial searching working in combination with dismax so I'm sending this should anyone else have the same problem. I gave up using localsolr in the end - one of the resultsets of the two it returned was correct (dismax+spatial) but I don't trust this enough to depend

Re: Auto Suggest

2010-09-03 Thread Luke Tebbs
What about if you do something like this? - facet=true&facet.mincount=1&q=apple&facet.limit=10&facet.prefix=mou&facet.field=term_suggest&qt=basic&wt=javabin&rows=0&version=1 Jason Rutherglen wrote: To clarify, the query analyzer returns that. Variations such as apple mou also do not return anything.

RE: how to deal with virtual collection in solr?

2010-09-03 Thread Ma, Xiaohui (NIH/NLM/LHC) [C]
Thanks so much for your help, Jan Høydahl. Have a great weekend! Xiaohui -Original Message- From: Jan Høydahl / Cominvent [mailto:jan@cominvent.com] Sent: Friday, September 03, 2010 3:46 AM To: solr-user@lucene.apache.org Subject: Re: how to deal with virtual collection in solr? You

Re: Auto Suggest

2010-09-03 Thread dan sutton
I set this up a few years ago with something like the following: <fieldType name="autocomplete" class="solr.TextField"> <analyzer type="index"> <tokenizer class="solr.KeywordTokenizerFactory"/> <filter class="solr.LowerCaseFilterFactory" />

Re: Hardware Specs Question

2010-09-03 Thread scott chu
well balanced system = Agree. Here we'll start a performance load test this month. I've defined a test criteria of 'qps', 'RTpQ' worse case according to our use case past experience. Our goal is pursuing this criteria adjust hardware system configuration to find a well

Re: how/why would I use LiteralValueSource and can I create a custom string function?

2010-09-03 Thread Gerald
I don't really have any specific use case in mind; I was just wondering what I could (or couldn't) do with custom functions. Possible reasons for allowing that type of syntax include: 1. in general, to simplify queries, and make them more readable, by eliminating the need for the _val_ hack

Patch to pass a file to the first and new index searchers

2010-09-03 Thread Papiya Misra
I do not want to make the solrconfig.xml huge. I think I saw a patch a couple of weeks back that allowed passing a csv file as a parameter. Can anyone help ? Thanks Papiya Pink OTC Markets Inc. provides the leading inter-dealer quotation and trading system in the over-the-counter (OTC)

Re: Hardware Specs Question

2010-09-03 Thread Dennis Gearon
I wouldn't have thought that CPU was a big deal, with the speed/cores of CPUs continuously growing according to Moore's law and disk speed barely changing 50% in 15 years. Must have a lot to do with caching. What size indexes are you working with? Are you saying you can get the

Re: false matches with ReversedWildcardFilterFactory

2010-09-03 Thread Yonik Seeley
On Thu, Sep 2, 2010 at 1:10 PM, Landon Kuhn landon9...@gmail.com wrote: Hello, I am using the ReversedWildcardFilterFactory, and I am wondering if there is a way to prevent false matches when a query token matches the reversed indexed token. For instance, the query *zemog* matches documents

Re: false matches with ReversedWildcardFilterFactory

2010-09-03 Thread Robert Muir
On Fri, Sep 3, 2010 at 11:55 AM, Yonik Seeley yo...@lucidimagination.com wrote: Off the top of my head, I'm not sure of an easy way to prevent this. we could fix this in trunk easily for these queries, with intersection/subtraction (e.g. minus \u0001.* from the DFA) -- Robert Muir

Re: Patch to pass a file to the first and new index searchers

2010-09-03 Thread Papiya Misra
Ok - found it - https://issues.apache.org/jira/browse/SOLR-784 . On 09/03/2010 11:51 AM, Papiya Misra wrote: I do not want to make the solrconfig.xml huge. I think I saw a patch a couple of weeks back that allowed passing a csv file as a parameter. Can anyone help ? Thanks Papiya Pink OTC

RE: Do commits block updates in SOLR 1.4?

2010-09-03 Thread Robert Petersen
So you are saying we definitely do not need to pause ADD activity on other threads while we send the COMMIT? And the same goes with AUTOCOMMIT right? We are using SOLR 1.4 now. We were on 1.3 previously. We pretty much just assumed pausing ADDs during COMMITs was required by SOLR when we

Re: Do commits block updates in SOLR 1.4?

2010-09-03 Thread Mark Miller
Solr handles all of this concurrency for you - it's actually even a little too aggressive about that these days, as Lucene has changed a lot - but yes - you can add while committing and commit while adding - Solr will block itself as needed. - Mark On 9/3/10 1:27 PM, Robert Petersen wrote: So

Solr + Katta ... benefits?

2010-09-03 Thread thiseye
I'm investigating using Lucene for a project to index a massive HBase database. I was looking at using Katta to distribute the index because people have said that becomes a limitation with simply using Lucene as the index grows. Then I came across Solr which seems like it would also help this

Re: Throttling replication

2010-09-03 Thread Brandon Evans
On 9/2/10 3:20 PM, Koji Sekiguchi wrote: (10/09/03 5:42), Brandon Evans wrote: On 9/2/10 11:16 AM, Mark wrote: I am using the built in replication. Can you send me a link to the patch so I can give it a try? Thanks This patch looks great! Can you open a jira issue and contribute the

Re: Solr crawls during replication

2010-09-03 Thread Shawn Heisey
On 9/2/2010 9:31 AM, Mark wrote: Thanks for the suggestions. Our slaves have 12G with 10G dedicated to the JVM.. too much? Are the rsync snappuller features still available in 1.4.1? I may try that to see if it helps. Configuration of the switches may also be possible. Also, would you mind

Re: In Need of Direction; Phrase-Context Tracking / Injection (Child Indexes) / Dismissal

2010-09-03 Thread Scott Gonyea
I've been considering the use of Hadoop, since that's what Nutch uses. Unless I piggy-back onto Nutch's MR job, when creating a Solr index, I'm wondering if it's overkill. I can see ways of working it into a MapReduce workflow, but it would involve dumping the database onto HDFS beforehand. I'm

Re: Hardware Specs Question

2010-09-03 Thread Shawn Heisey
On 9/3/2010 3:39 AM, Toke Eskildsen wrote: I'll have to extrapolate a lot here (also known as guessing). You don't mention what kind of harddrives you're using, so let's say 15.000 RPM to err on the high-end side. Compared to the 2 drives @ 15.000 RPM in RAID 1 we've experimented with, the

RE: Solr crawls during replication

2010-09-03 Thread Jonathan Rochkind
Is the OS disk cache something you configure, or something the OS just does automatically based on available free RAM? Or does it depend on the exact OS? Thinking about the OS disk cache is new to me. Thanks for any tips. From: Shawn Heisey

Re: Solr crawls during replication

2010-09-03 Thread Mark
On 9/3/10 11:37 AM, Jonathan Rochkind wrote: Is the OS disk cache something you configure, or something the OS just does automatically based on available free RAM? Or does it depend on the exact OS? Thinking about the OS disk cache is new to me. Thanks for any tips.

solr

2010-09-03 Thread ankita shinde
hello, I have done all the suggested changes. My table name is 'info' having columns id,name,city and skill. I am able to index them all successfully. But I am able to search the data only using name and not other column names. Where did I go wrong? *My data-config.xml file is as below:*

Boost, weight, proximity, ranking which one?

2010-09-03 Thread javaxmlsoapdev
I am using Solr 1.4. I have a requirement to show first all documents that match the most words from the free-text search string. E.g., if the user searches for two words without quotes, my search results should display all documents where both words

RE: Do commits block updates in SOLR 1.4?

2010-09-03 Thread Robert Petersen
Thanks guys! I will be quite happy to remove the unnecessary complexity from our code. -Original Message- From: Mark Miller [mailto:markrmil...@gmail.com] Sent: Friday, September 03, 2010 10:28 AM To: solr-user@lucene.apache.org Subject: Re: Do commits block updates in SOLR 1.4? Solr

Re: solr

2010-09-03 Thread Papiya Misra
What is the query that you are using? Try something like q=city:Chicago. Look at the solrconfig file and you will see <defaultSearchField>name</defaultSearchField>. This is the reason that, unless you specify the search field in the query, Solr will always search the field name. On 09/03/2010

Re: solr

2010-09-03 Thread ankita shinde
I didn't find <defaultSearchField>name</defaultSearchField> in my solrconfig.xml. Do I need to include this tag? On Sat, Sep 4, 2010 at 3:04 AM, Papiya Misra pmi...@pinkotc.com wrote: What is the query that you are using ? Try something like q=city:Chicago . Look at the solrconfig file and you

Re: Solr crawls during replication

2010-09-03 Thread Shawn Heisey
On 9/3/2010 12:37 PM, Jonathan Rochkind wrote: Is the OS disk cache something you configure, or something the OS just does automatically based on available free RAM? Or does it depend on the exact OS? Thinking about the OS disk cache is new to me. Thanks for any tips. Depends on what you

Re: solr user

2010-09-03 Thread Lance Norskog
Naming fields something_t but declaring them string will either not work, or cause confusion. On Thu, Sep 2, 2010 at 6:49 AM, kenf_nc ken.fos...@realestate.com wrote: You are querying for 'branch' and trying to place it in 'skill'. Also, you have Name and Column backwards, it should be:

Re: shingles work in analyzer but not real data

2010-09-03 Thread Lance Norskog
http://en.wikipedia.org/wiki/W-shingling On Fri, Sep 3, 2010 at 6:19 AM, Steven A Rowe sar...@syr.edu wrote: Hi Dennis, I took a stab at answering this question in the following java-user mailing list post: http://www.lucidimagination.com/search/document/6cb7b54cce6872b3/lucene_indexes

Re: shingles work in analyzer but not real data

2010-09-03 Thread Dennis Gearon
Thank you mucho much, Lance. Dennis Gearon --- On Fri, 9/3/10, Lance Norskog goks...@gmail.com wrote: From: Lance Norskog