Re: solr 1.4 highlighting issue

2011-09-15 Thread Dmitry Kan
Koji, this looks strange to me, because I would assume that the highlighter applies boolean logic the same way the query parser does. By that reasoning, drilling should be highlighted if ships occurred together with it in the same document, which wasn't the case in the example. Dmitry On Wed, Sep 14,

Re: solr 1.4 highlighting issue

2011-09-15 Thread Dmitry Kan
Hi Mike, Actually, the example I gave is the document in this case. So there were no ships, only drilling. Dmitry On Wed, Sep 14, 2011 at 1:59 PM, Michael Sokolov soko...@ifactory.com wrote: The highlighter gives you snippets of text surrounding words (terms) drawn from the query. The whole

Re: math with date and modulo

2011-09-15 Thread stockii
Okay, thanks a lot. I thought that it isn't possible to get the month in my case =( I will try another way. - --- System One Server, 12 GB RAM, 2 Solr Instances, 8 Cores, 1 Core with 45 Million Documents other Cores

Re: Norms - scoring issue

2011-09-15 Thread Ahmet Arslan
It seems that fieldNorm difference is coming from the field named 'text'. And you didn't include the definition of text field. Did you omit norms for that field too? By the way I see that you have store=true in some places but it should be store*d*=true. --- On Wed, 9/14/11, Adolfo Castro

Re: Out of memory

2011-09-15 Thread Dmitry Kan
Hello, Since you use caching, you can monitor the eviction parameter on the Solr admin page (http://localhost:port/solr/admin/stats.jsp#cache). If it is non-zero, the cache can be made bigger, for example. queryResultWindowSize=50 in my case. Not sure if Solr 3.1 supports it, but in 1.4 I have: HashDocSet

Re: Terms.regex performance issue

2011-09-15 Thread tbarbugli
Hi, I have the same problem: I am looking for infix autocomplete. Could you elaborate a bit on your QueryConverter - Suggester solution? Thank you! -- View this message in context: http://lucene.472066.n3.nabble.com/Terms-regex-performance-issue-tp3268994p3338273.html Sent from the Solr -

RE: Out of memory

2011-09-15 Thread Rohit
It's happening more in search and search has become very slow particularly on the core with 69GB index data. Regards, Rohit -Original Message- From: Dmitry Kan [mailto:dmitry@gmail.com] Sent: 15 September 2011 07:51 To: solr-user@lucene.apache.org Subject: Re: Out of memory Hello,

Re: why we need the index information in a database ?

2011-09-15 Thread Gora Mohanty
On Thu, Sep 15, 2011 at 2:53 PM, kiran.bodigam kiran.bodi...@gmail.com wrote: why we need the index information in a database is because it is clusterable. In other words, we may have/need more than one instance of the SOLR engine running. [...] Not sure if you are after multiple instances

Re: Out of memory

2011-09-15 Thread Dmitry Kan
If you have many users you could scale vertically, i.e. do replication. But before that you could do sharding, for example by indexing entries based on a hash function. Let's say split 69GB to two shards first and experiment with it. Regards, Dmitry On Thu, Sep 15, 2011 at 12:22 PM, Rohit
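A minimal sketch of the hash-based sharding idea mentioned above, assuming each document has a string id. The function name `shard_for` and the id format are hypothetical and not from the thread; MD5 is used only because it gives a hash that is stable across runs and machines.

```python
import hashlib


def shard_for(doc_id: str, num_shards: int = 2) -> int:
    """Map a document id to a shard index using a stable hash.

    Python's built-in hash() is randomized per process for strings,
    so a digest is used to keep the assignment reproducible.
    """
    digest = hashlib.md5(doc_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


if __name__ == "__main__":
    # Hypothetical ids; in practice these would be the unique keys of the index.
    for doc_id in ("doc-1", "doc-2", "doc-3"):
        print(doc_id, "-> shard", shard_for(doc_id))
```

The same function has to be used at indexing time for every document, so that a given id always lands on the same shard; changing `num_shards` later means re-indexing.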

Re: Count rows with tokens

2011-09-15 Thread tom135
Facet Indexing is a good solution for me :) Thanks for your help! -- View this message in context: http://lucene.472066.n3.nabble.com/Count-rows-with-tokens-tp3274643p3338556.html Sent from the Solr - User mailing list archive at Nabble.com.

can we share the same index directory for multiple cores?

2011-09-15 Thread kiran.bodigam
If we implement the multi-core functionality in Solr, is there any possibility that the same index information is shared by two different cores (redundancy)? Can we share the same index directory for multiple cores? If I query it on admin, which core will respond, because they are suggesting to query on

Distinct elements in a field

2011-09-15 Thread swiss knife
Simple question: I want to know how many distinct elements I have in a field among the documents that satisfy a query. Do you know if there's a way to do it today in 3.4? I saw SOLR-1814 and SOLR-2242. SOLR-1814 seems fairly easy to use. What do you think? Thank you

Delete documents with empty fields

2011-09-15 Thread Massimo Schiavon
I want to delete all documents with an empty title field. If I run the query -title:[* TO *] I obtain the correct list of documents, but when I submit the delete command to Solr: curl http://localhost:8080/solr/web/update\?commit=true -H 'Content-Type: text/xml' --data-binary \

Re: Solandra - select query error

2011-09-15 Thread tom135
Hi Jake, I have reproduced an example of my error (commit release 3408a30): 1. I have used schema.xml from reuters-demo, with my fields definition: . <fields> <field name="id" type="long" indexed="true" stored="true" required="true" /> <field name="text" type="text" indexed="true" stored="true"

Re: Delete documents with empty fields

2011-09-15 Thread Ahmet Arslan
I want to delete all documents with an empty title field. If I run the query -title:[* TO *] I obtain the correct list of documents, but when I submit the delete command to Solr: curl http://localhost:8080/solr/web/update\?commit=true -H 'Content-Type: text/xml' --data-binary \

Re: indexing data from rich documents - Tika with solr3.1

2011-09-15 Thread Erik Hatcher
Maybe this quick script will get you running? http://www.lucidimagination.com/blog/2011/08/31/indexing-rich-files-into-solr-quickly-and-easily/ On Sep 15, 2011, at 00:44 , scorpking wrote: Hi Erick Erickson, Now, we have many files format(doc, ppt, pdf, ...), File's purpose serve to

How to write core's name in log

2011-09-15 Thread Joan
Hi, I have multiple cores in Solr and I want to write the core name in the log through log4j. I've found in SolrException a method called log(Logger log, Throwable e), but when it tries to build an Exception it doesn't have the core's name. The Exception is built in the toStr() method in the SolrException class, so I want

RE: Out of memory

2011-09-15 Thread Rohit
Thanks Dmitry, let me look into sharding concepts. Regards, Rohit Mobile: +91-9901768202 About Me: http://about.me/rohitg -Original Message- From: Dmitry Kan [mailto:dmitry@gmail.com] Sent: 15 September 2011 10:15 To: solr-user@lucene.apache.org Subject: Re: Out of memory If you

Replication and ExternalFileField

2011-09-15 Thread Per Osbeck
Hi all, I'm trying to find some good information regarding replication, especially for the ExternalFileField. As I understand it; - the external files must be in data dir. - replication only replicates data/indexes and possibly confFiles from the conf dir. Does anyone have suggestions or

Re: Replication and ExternalFileField

2011-09-15 Thread Markus Jelsma
Perhaps a symlink will do the trick. On Thursday 15 September 2011 14:04:47 Per Osbeck wrote: Hi all, I'm trying to find some good information regarding replication, especially for the ExternalFileField. As I understand it; - the external files must be in data dir. - replication only

RE: Replication and ExternalFileField

2011-09-15 Thread Per Osbeck
Probably would have worked on *nix but unfortunately running Windows. Best regards, Per -Original Message- From: Markus Jelsma [mailto:markus.jel...@openindex.io] Sent: den 15 september 2011 14:07 To: solr-user@lucene.apache.org Subject: Re: Replication and ExternalFileField Perhaps a

Re: Delete documents with empty fields

2011-09-15 Thread Massimo Schiavon
On 15/09/2011 13:01, Ahmet Arslan wrote: +*:* -title:[* TO *] Worked fine. Thanks a lot! Massimo
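For reference, a small sketch of how the query that worked in this thread could be wrapped into a Solr XML delete-by-query message. The helper name `delete_by_query_xml` is hypothetical; the sketch only builds the payload and does not contact any server. The resulting string is what would go in the body of the POST to the update URL quoted earlier in the thread.

```python
from xml.sax.saxutils import escape


def delete_by_query_xml(query: str) -> str:
    """Build a Solr XML update message that deletes all documents matching query.

    escape() protects the XML special characters &, < and > in the query.
    """
    return "<delete><query>{}</query></delete>".format(escape(query))


if __name__ == "__main__":
    # "+*:*" matches every document, "-title:[* TO *]" then excludes
    # those that have any value in the title field.
    print(delete_by_query_xml("+*:* -title:[* TO *]"))
```

Prefixing the negative clause with `+*:*` gives the query a positive set to subtract from, which is the form reported as working above.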

Re: Index not getting refreshed

2011-09-15 Thread Mike Sokolov
Is it possible you have two solr instances running off the same index folder? This was a mistake I stumbled into early on - I was writing with one, and reading with the other, so I didn't see updates. -Mike On 09/15/2011 12:37 AM, Pawan Darira wrote: I am committing but not doing replication

Re: Terms.regex performance issue

2011-09-15 Thread O. Klein
Read http://lucene.472066.n3.nabble.com/suggester-issues-td3262718.html for more info about the QueryConverter. IMO Suggester should make it easier to choose between QueryConverters. As for the infix, the wiki says it's planned

Re: Norms - scoring issue

2011-09-15 Thread Adolfo Castro Menna
Hi Ahmet, You're right. It was related to the text field, which is the defaultSearch field. I also added omitNorms=true in the fieldtype definition and it's now working as expected. Thanks, Adolfo.

Re: OOM issue

2011-09-15 Thread abhijit bashetti
Hi Eric, Thanks for the reply. It is very useful for me. For point 1: I do need 10 cores, and the number will go on increasing in the future. I have documents that belong to different workspaces, so 1 workspace = 1 core; I can't go with one core. Currently I have 10 cores, but in future the count may

Re: Performance troubles with solr

2011-09-15 Thread Yusuf Karakaya
Thank you all for your fast replies. Changing photo_id:* to a boolean has_photo field via a transformer when importing data *fixed my problems*, reducing query times to *~30 ms*. I'll try to optimize further following your advice on filter query usage and the int=tint (will search it first) transform.

Re: Schema fieldType y-m-d ?!?!

2011-09-15 Thread stockii
thx =) I think I will save this as a string if ranges really work =) - --- System One Server, 12 GB RAM, 2 Solr Instances, 8 Cores, 1 Core with 45 Million Documents other Cores 200.000 - Solr1 for Search-Requests -

Multiple shards on same machine find matches but return 0 results.

2011-09-15 Thread Aliya Virani
Hi, Recently we have been trying to scale up our Solandra setup to make use of a more powerful server. To improve query speeds we tried reducing the index size, and thus increasing the number of shards on a single machine. While we had no trouble searching and returning results when we had a single

Re: glassfish, solrconfig.xml and SolrException: Error loading DataImportHandler

2011-09-15 Thread Xue-Feng Yang
Thanks for telling me this issue. However, I would think this is a bug. ^=^ From: Chris Hostetter hossman_luc...@fucit.org To: solr-user@lucene.apache.org solr-user@lucene.apache.org; Xue-Feng Yang just4l...@yahoo.com Sent: Wednesday, September 14, 2011

location of solr folder when deploy to servlet container

2011-09-15 Thread Kiwi de coder
Hi, how do I configure the solr folder to be a specific directory when deploying to a servlet container? Regards, kiwi

Re: location of solr folder when deploy to servlet container

2011-09-15 Thread Markus Jelsma
In Tomcat you can set an environment var in Solr's context and set your home directory: <Environment name="solr/home" type="java.lang.String" value="/opt/solr/"/> hi, how do i configure the solr folder to specific directory when deploy to servlet container. regards, kiwi

Re: how would I use the new join feature given my schema.

2011-09-15 Thread Jason Toy
Anyone know the query I would do to get the join to work? I'm unable to get it to work. On Wed, Sep 14, 2011 at 10:49 AM, Jason Toy jason...@gmail.com wrote: I've been reading the information on the new join feature and am not quite sure how I would use it given my schema structure. I have

query for point in time

2011-09-15 Thread gary tam
Hi I have a scenario that I am not sure how to write the query for. Here is the scenario - have an employee record with multi value for project, started date, end date. looks something like John Smith web site bug fix 2010-01-01 2010-01-03

Re: query for point in time

2011-09-15 Thread Jonathan Rochkind
You didn't tell us what your schema looks like, what fields with what types are involved. But similar to how you'd do it in your database, you need to find 'documents' that have a start date before your date in question, and an end date after your date in question, to find the ones whose
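The "start before, end after" condition described above can be sketched as a pair of Solr range clauses. This is only an illustration: the field names `start_date` and `end_date` are assumptions (the thread does not give the schema), and both are assumed to be Solr date fields.

```python
def point_in_time_query(date_iso: str,
                        start_field: str = "start_date",
                        end_field: str = "end_date") -> str:
    """Build a Solr query for documents whose interval contains date_iso.

    date_iso is expected in Solr's date format, e.g. 2010-01-02T00:00:00Z.
    Square brackets make both range ends inclusive.
    """
    return "{}:[* TO {}] AND {}:[{} TO *]".format(
        start_field, date_iso, end_field, date_iso)


if __name__ == "__main__":
    print(point_in_time_query("2010-01-02T00:00:00Z"))
```

One caveat: if the start and end dates are stored as separate multi-valued fields on a single employee document, Solr does not pair the values by position, so the two clauses can be satisfied by dates from different projects. Indexing one document per employee/project assignment avoids that.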

Re: Sorting on multiValued fields via function query

2011-09-15 Thread boneill42
Was there a solution here? Is there a ticket related to the sort=max(FIELD) solution? -brian -- View this message in context: http://lucene.472066.n3.nabble.com/Sorting-on-multiValued-fields-via-function-query-tp2681833p3340145.html Sent from the Solr - User mailing list archive at

[DIH] How to use combine Regex and HTML transformers

2011-09-15 Thread Pulkit Singhal
Hello, I need to pull out the price and imageURL for products in an Amazon RSS feed. PROBLEM STATEMENT: The following: <field column="description" xpath="/rss/channel/item/description" /> <field column="price"

Re: Can index size increase when no updates/optimizes are happening?

2011-09-15 Thread Yury Kats
On 9/14/2011 2:36 PM, Erick Erickson wrote: What is the machine used for? Was your user looking at a master? Slave? Something used for both? Stand-alone machine with multiple Solr cores. No replication. Measuring the size of all the files in the index? Or looking at memory? Disk space.

Generating large datasets for Solr proof-of-concept

2011-09-15 Thread Pulkit Singhal
Hello Everyone, I have a goal of populating Solr with a million unique products in order to create a test environment for a proof of concept. I started out by using DIH with Amazon RSS feeds but I've quickly realized that there's no way I can glean a million products from one RSS feed. And I'd go

Re: Generating large datasets for Solr proof-of-concept

2011-09-15 Thread Daniel Skiles
I've done it using SolrJ and a *lot* of parallel processes feeding dummy data into the server. On Thu, Sep 15, 2011 at 4:54 PM, Pulkit Singhal pulkitsing...@gmail.com wrote: Hello Everyone, I have a goal of populating Solr with a million unique products in order to create a test
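A rough sketch of the dummy-data approach, written in Python rather than SolrJ. Everything here is illustrative: the field names (`id`, `name`, `price`), the word lists and the helper names are assumptions, and the schema would need matching fields. The sketch only generates documents and serializes them as Solr XML add messages; posting them (in parallel batches, for instance) is left out.

```python
import random
from xml.sax.saxutils import escape

ADJECTIVES = ["Red", "Compact", "Wireless", "Deluxe", "Portable"]
NOUNS = ["Camera", "Lamp", "Keyboard", "Backpack", "Kettle"]


def make_products(n, seed=0):
    """Yield n synthetic product documents with unique ids."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    for i in range(n):
        yield {
            "id": "prod-{}".format(i),
            "name": "{} {} {}".format(rng.choice(ADJECTIVES), rng.choice(NOUNS), i),
            "price": "{:.2f}".format(rng.uniform(1, 500)),
        }


def to_add_xml(docs):
    """Serialize documents as one Solr XML <add> message."""
    parts = ["<add>"]
    for doc in docs:
        fields = "".join(
            '<field name="{}">{}</field>'.format(key, escape(str(value)))
            for key, value in doc.items())
        parts.append("<doc>{}</doc>".format(fields))
    parts.append("</add>")
    return "".join(parts)


if __name__ == "__main__":
    batch = list(make_products(3))
    print(to_add_xml(batch))
```

Generating in batches of a few thousand documents per `<add>` message keeps memory use flat even for a million products.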

RE: Replication and ExternalFileField

2011-09-15 Thread Jaeger, Jay - DOT
Actually, Windoze also has symbolic links. You have to manipulate them from the command line, but they do exist. http://en.wikipedia.org/wiki/NTFS_symbolic_link -Original Message- From: Per Osbeck [mailto:per.osb...@lbi.com] Sent: Thursday, September 15, 2011 7:15 AM To:

Re: Generating large datasets for Solr proof-of-concept

2011-09-15 Thread Markus Jelsma
If we want to test with huge amounts of data we feed portions of the internet. The problem is it takes a lot of bandwidth and lots of computing power to get to a `reasonable` size. On the positive side, you deal with real text so it's easier to tune for relevance. I think it's easier to create

Re: query for point in time

2011-09-15 Thread gary tam
Thanks for the reply. We had the search within the database initially, but it proved to be too slow. With Solr we have much better performance. One more question: how could I find the most current job for each employee? My data looks like John Smith department A web site bug fix

Re: query for point in time

2011-09-15 Thread Jonathan Rochkind
I think there's something wrong with your database then, but okay. You still haven't said what your Solr schema looks like -- that list of values doesn't say what the solr field names or types are. I think this is maybe because you don't actually have a Solr database and have no idea how Solr

Lucene-SOLR transition

2011-09-15 Thread Scott Smith
I've been using lucene for a number of years. We've now decided to move to SOLR. I have a couple of questions. 1. I'm used to creating Boolean queries, filter queries, term queries, etc. for lucene. Am I right in thinking that for SOLR my only option is creating string queries (with

Re: Generating large datasets for Solr proof-of-concept

2011-09-15 Thread Pulkit Singhal
Ah missing } doh! BTW I still welcome any ideas on how to build an e-commerce test base. It doesn't have to be Amazon, that was just my approach. Anyone? - Pulkit On Thu, Sep 15, 2011 at 8:52 PM, Pulkit Singhal pulkitsing...@gmail.com wrote: Thanks for all the feedback thus far. Now to get

Re: Generating large datasets for Solr proof-of-concept

2011-09-15 Thread Pulkit Singhal
Thanks for all the feedback thus far. Now to get a little technical about it :) I was thinking of feeding a file with all the tags of Amazon that yield close to roughly 5 results each into a file and then running my rss DIH off of that. I came up with the following config but something is

ClassCastException: SmartChineseWordTokenFilterFactory to TokenizerFactory

2011-09-15 Thread Xue-Feng Yang
Hi all, I am trying to use SmartChineseWordTokenFilterFactory in Solr 3.4.0, but I get the error SEVERE: java.lang.ClassCastException: org.apache.solr.analysis.SmartChineseWordTokenFilterFactory cannot be cast to org.apache.solr.analysis.TokenizerFactory Any thoughts?

Re: hi. allowLeadingWildcard is it possible or not yet?

2011-09-15 Thread deniz
I wonder the same thing... so I want to re-animate the topic. Is it possible? - Smart, but doesn't work... If he worked, he'd do it... -- View this message in context: http://lucene.472066.n3.nabble.com/hi-allowLeadingWildcard-is-it-possible-or-not-yet-tp495457p3340838.html Sent from the Solr - User mailing

Re: Generating large datasets for Solr proof-of-concept

2011-09-15 Thread Lance Norskog
http://aws.amazon.com/datasets DBPedia might be the easiest to work with: http://aws.amazon.com/datasets/2319 Amazon has a lot of these things. Infochimps.com is a marketplace for free and paid versions. Lance On Thu, Sep 15, 2011 at 6:55 PM, Pulkit Singhal pulkitsing...@gmail.com wrote: Ah

Re: ClassCastException: SmartChineseWordTokenFilterFactory to TokenizerFactory

2011-09-15 Thread Lance Norskog
Tokenizers and TokenFilters are different. Look in the schema for how other TokenFilterFactory classes are used. On Thu, Sep 15, 2011 at 8:05 PM, Xue-Feng Yang just4l...@yahoo.com wrote: Hi all, I am trying to use SmartChineseWordTokenFilterFactory in solr 3.4.0, but come to the error