Re: Lock timed out 2 worker running

2009-08-11 Thread Chris Hostetter
: 5) are these errors appearing after Solr crashes and you restart it? : : : Yep, I can't find the logs but it's something like can't obtain lock for : somefile.lck Need to delete that file in order to start the solr properly wait ... either you misunderstood my question, or you just

Re: Querying Dynamic Fields.. simple query not working

2009-08-11 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Tue, Aug 11, 2009 at 11:16 AM, Avlesh Singh avl...@gmail.com wrote: Ah! I guessed you were using it this way. I would need to reconfirm this, but there seems to be an inconsistency in fetching data versus adding data via SolrJ w.r.t. dynamic fields.

Re: Querying Dynamic Fields.. simple query not working

2009-08-11 Thread Ninad Raut
Hi, SOLR-1129 (https://issues.apache.org/jira/browse/SOLR-1129) seems to have been solved. Can I apply the patch? 2009/8/11 Noble Paul നോബിള്‍ नोब्ळ् noble.p...@corp.aol.com On Tue, Aug 11, 2009 at 11:16 AM, Avlesh Singh avl...@gmail.com wrote: Ah! I guessed you were using it this way. I

build.xml errors

2009-08-11 Thread viorelhojda
Hello. I've downloaded SOLR using SVN and the Eclipse IDE. After setting up the classpath and everything, I've managed to have no errors in the code files. The problem is that the build XML files (build.xml, common-build.xml etc.) are full of errors (a couple of hundred), such as: Attribute

Re: How can i get lucene index format version information?

2009-08-11 Thread Licinio Fernández Maurelo
Thanks all for your responses. What I expect to get is the index format version as it appears in Luke's overview tab (index format: -9 (UNKNOWN)) 2009/7/31 Jay Hill jayallenh...@gmail.com: Check the system request handler: http://localhost:8983/solr/admin/system Should look something like

Re: Querying Dynamic Fields.. simple query not working

2009-08-11 Thread Avlesh Singh
SOLR-1129 was for a different use case, Ninad. I have created an issue for this enhancement - https://issues.apache.org/jira/browse/SOLR-1357 Cheers Avlesh On Tue, Aug 11, 2009 at 12:09 PM, Ninad Raut hbase.user.ni...@gmail.com wrote: Hi,

Re: Retrieving the boost factor using Solrj CommonsHttpSolrServer

2009-08-11 Thread Avlesh Singh
The boost factor is available in the SolrInputDocument, but not in the SolrDocument returned by the SolrServer 'query' method Yes, you are right. There seems to be an inconsistency. And there is no relationship between the SolrInputDocument and the SolrDocument (... which in itself is pretty

Searching for reservations/availability with Solr

2009-08-11 Thread Constantijn Visinescu
Hello, I have a problem I'm trying to solve where I want to check whether objects are reserved or not. (By reservation I mean something like making a reservation at a hotel, because you would like to stay there on certain dates.) I have the following in my schema.xml: <field name="name" type="text" indexed="true"

faceting/searching on multi-valued fields

2009-08-11 Thread AHMET ARSLAN
I have two parallel multivalued fields for holding key-value pairs for each document. <doc> <arr name="value"> <str>red</str> <str>other</str> <str>VS</str> <str>10 cm.</str> <str>50 GB</str> ... </arr> <arr name="key"> <str>Color</str> <str>Type</str>

Re: Building documents using content residing both in database tables and text files

2009-08-11 Thread Noble Paul നോബിള്‍ नोब्ळ्
Isn't it possible to do this by having two datasources (one Jdbc and another File) and two entities? The outer entity can read from a DB and the inner entity can read from a file. On Tue, Aug 11, 2009 at 8:05 PM, Sascha Szott sz...@zib.de wrote: Hello, is it possible (and if it is, how can

Re: Newbie problem ordering results

2009-08-11 Thread Germán Biozzoli
Sure: <fieldtype name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/> The strange thing is that I can sort by other fields that are defined using string, but not by one that is defined as a tokenized field and then copied to a string field. I attach the schema.xml for the case is

Building documents using content residing both in database tables and text files

2009-08-11 Thread Sascha Szott
Hello, is it possible (and if it is, how can I accomplish it) to configure DIH to build up index documents by using content that resides in different data sources? Here is an example scenario: Let's assume we have a table T with two columns, ID (which is the primary key of T) and TITLE.

Re: Multiple Unique Ids

2009-08-11 Thread Shalin Shekhar Mangar
On Mon, Aug 10, 2009 at 2:05 PM, Ninad Raut hbase.user.ni...@gmail.com wrote: Hi, I have two Ids, DocumentId and AuthorId. I want both of them unique. Can I have two uniqueKey in my document? <uniqueKey>id</uniqueKey> <uniqueKey>authorId</uniqueKey> No. You can have only one uniqueKey in

Re: faceting/searching on multi-valued fields

2009-08-11 Thread Avlesh Singh
Let's say I have a dynamic field defined as <dynamicField name="*" type="string"/> Can I use those fields at query time, although they are not defined in schema.xml? Yes. Though I am not sure whether you can create a dynamic field without a prefix or suffix with the wild-card. I would rather

Re: build.xml errors

2009-08-11 Thread Shalin Shekhar Mangar
On Tue, Aug 11, 2009 at 1:31 PM, viorelhojda viorelho...@yahoo.com wrote: Hello. I've downloaded SOLR using SVN and the Eclipse IDE. After setting up the classpath and everything, I've managed to have no errors in the code files. The problem is that the build XML files (build.xml,

Re: Searching for reservations/availability with Solr

2009-08-11 Thread Avlesh Singh
From what I understood, you need a day level granularity (i.e. booked on 15th, 16th and 17th of August) in your indexes. If this is true, then why even store a date? For your use case, I think this should suffice - <dynamicField name="reserved_dates_*" type="integer" indexed="true" stored="true"
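The day-level granularity suggested above can be illustrated with a minimal sketch. It assumes each booked day is encoded as one integer in yyyymmdd form; that encoding and the helper name `booked_days` are hypothetical and are not taken from the thread.

```python
from datetime import date, timedelta


def booked_days(first_day, last_day):
    """Expand an inclusive date range into one yyyymmdd integer per booked day.

    The yyyymmdd encoding is an assumption made for this sketch only.
    """
    if last_day < first_day:
        raise ValueError("last_day must not precede first_day")
    days = []
    current = first_day
    while current <= last_day:
        days.append(current.year * 10000 + current.month * 100 + current.day)
        current += timedelta(days=1)
    return days


# Booked on the 15th, 16th and 17th of August 2009.
print(booked_days(date(2009, 8, 15), date(2009, 8, 17)))
# prints [20090815, 20090816, 20090817]
```

Each integer could then be indexed as a separate value, so an availability check becomes a simple term lookup for the requested days rather than a date range comparison.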

Trouble with Shingle filter and query parsing / expansion

2009-08-11 Thread Mark Bennett
I've got an index building with the shingle filter and I can see the compound terms with Luke, etc. So far so good. One detail, I did tell it to not emit unigrams - I've got single words covered in a normal field. And a bit of poking around the other day explained why shingle queries weren't

Re: Searching for reservations/availability with Solr

2009-08-11 Thread Shalin Shekhar Mangar
On Tue, Aug 11, 2009 at 7:08 PM, Constantijn Visinescu baeli...@gmail.com wrote: doc str name=nameRoom1/str date name=reserved_from_112000-08-01T00:00:00Z/date date name=reserved_to_112000-08-31T23:59:59Z/date /doc doc str name=nameRoom2/str date

Re: Newbie problem ordering results

2009-08-11 Thread Avlesh Singh
Ahhh, I should have seen this first. Your contributororder field is multi-valued, you cannot sort on that field. However the RuntimeException that Solr throws has a misleading error message - ... but it's impossible to sort on tokenized fields. The field in this case is untokenized. Cheers

Solr 1.4 Clustering / mlt AS search?

2009-08-11 Thread Mark Bennett
I'm going somewhere with this... be patient. :-) I had asked about this briefly at the SF meetup, but there was a lot going on. 1: Suppose you had Solr 1.4 and all the Carrot^2 DOCUMENT clustering was all in, and you had built the cluster index for all your docs. 2: Then, if you had a

Re: Solr 1.4 Clustering / mlt AS search?

2009-08-11 Thread Mark Bennett
With regard to my second question, re. More Like This, I do see: "The MoreLikeThisHandler can also use a ContentStream to find similar documents. It will extract the interesting terms from the posted text." at http://wiki.apache.org/solr/MoreLikeThisHandler and that it uses the TF/IDF stuff. Still

Re: Building documents using content residing both in database tables and text files

2009-08-11 Thread Sascha Szott
Hi Noble, Noble Paul wrote: Isn't it possible to do this by having two datasources (one Jdbc and another File) and two entities? The outer entity can read from a DB and the inner entity can read from a file. Yes, it is. Here's my db-data-config.xml file: <!-- definition of data sources -->

FW: NativeFSLockFactory, ConcurrentMergeScheduler: why locks?

2009-08-11 Thread Fuad Efendi
Most probably I need to play around UpdateHandler(s); I am using DirectUpdateHandler with allowDuplicates = false: solrj.SolrServer.add(docs, overwrite=true) Use case: I have a timestamp on a document; documents in an index get expired by timestamp; same document could be added to the index

ArrayIndexOutOfBounds on Some Searches

2009-08-11 Thread Stephen Duncan Jr
This is with trunk for Solr 1.4. It happened both with a build from 1 week ago as well as with a build from today, so I'm not sure if it's something recent, or even if it would happen on Solr 1.3 or not. Here's the stack trace indicating that a value looping around from Integer.MAX_VALUE to

Re: Solr 1.4 Clustering / mlt AS search?

2009-08-11 Thread Grant Ingersoll
Inline... On Aug 11, 2009, at 12:44 PM, Mark Bennett wrote: I'm going somewhere with this... be patient. :-) I had asked about this briefly at the SF meetup, but there was a lot going on. 1: Suppose you had Solr 1.4 and all the Carrot^2 DOCUMENT clustering was all in, and you had built

Re: Solr 1.4 Clustering / mlt AS search?

2009-08-11 Thread Mark Bennett
Thanks Grant. *** mlb: comments inline On Tue, Aug 11, 2009 at 12:40 PM, Grant Ingersoll gsing...@apache.org wrote: Inline... On Aug 11, 2009, at 12:44 PM, Mark Bennett wrote: I'm going somewhere with this... be patient. :-) I had asked about this briefly at the SF meetup, but there was

Re: NativeFSLockFactory, ConcurrentMergeScheduler: why locks?

2009-08-11 Thread Jason Rutherglen
Fuad, The lock indicates to external processes that the index is in use; it does not cause ConcurrentMergeScheduler to block. ConcurrentMergeScheduler does merge in its own thread; however, if the merges are large then they can spike IO and CPU, and cause the machine to be somewhat unresponsive.

Performance Tuning: segment_merge:index_update=5:1 (timing)

2009-08-11 Thread Fuad Efendi
In a heavily loaded write-only master SOLR, I have 5 minutes of RAM buffer flush / segment merge per 1 minute of (heavy) batch document updates. I am using mergeFactor=100 etc. (I already posted a message...) So that... I can't see that hardware is the problem: with more CPU and faster RAID-0 I'll get the

Re: Performance Tuning: segment_merge:index_update=5:1 (timing)

2009-08-11 Thread Grant Ingersoll
Have you tried profiling? How often are you committing? Have you looked at Garbage Collection or any of the usual suspects like that? On Aug 11, 2009, at 4:49 PM, Fuad Efendi wrote: In a heavily loaded write-only master SOLR, I have 5 minutes of RAM buffer flush / segment merge per 1

RE: NativeFSLockFactory, ConcurrentMergeScheduler: why locks?

2009-08-11 Thread Fuad Efendi
Hi Jason, I am using Master/Slave (two servers); I monitored for a few hours today - 1 minute of document updates (about 100,000 documents) and then SOLR stops for at least 5 minutes to do background jobs like RAM flush, segment merge... Documents are small; about 10Gb of total index size for

Re: Trouble with Shingle filter and query parsing / expansion

2009-08-11 Thread Mark Bennett
One other idea I tried, which didn't work, was to see if I could get proper parsing via the stream arg: http://localhost:8983/solr/mlt?stream.body=hello+world&mlt.fl=shingle_field&mlt.mintf=0&debugQuery=true On Tue, Aug 11, 2009 at 9:09 AM, Mark Bennett mbenn...@ideaeng.com wrote: I've got an
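A request of the kind quoted above can be composed programmatically so that the parameters are joined with `&` and the posted text is escaped. This is only a sketch: the parameter names and values come from the message, while the helper name `mlt_url` and the default base URL are assumptions for illustration.

```python
from urllib.parse import urlencode


def mlt_url(text, field, base_url="http://localhost:8983/solr"):
    """Build a MoreLikeThisHandler URL that posts `text` via stream.body.

    `mlt_url` and `base_url` are hypothetical; the parameters mirror the
    request shown in the message.
    """
    params = [
        ("stream.body", text),   # text to extract interesting terms from
        ("mlt.fl", field),       # field used for similarity
        ("mlt.mintf", 0),        # minimum term frequency
        ("debugQuery", "true"),  # show how the query was parsed
    ]
    return base_url + "/mlt?" + urlencode(params)


print(mlt_url("hello world", "shingle_field"))
```

Building the query string with `urlencode` avoids hand-escaping spaces and separators, which is easy to get wrong when pasting URLs into mail.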

RE: Performance Tuning: segment_merge:index_update=5:1 (timing)

2009-08-11 Thread Fuad Efendi
Never tried profiling; 3000-5000 docs per second if SOLR is not busy with segment merge; During segment merge 99% CPU, no disk swap; I can't suspect I/O... During document updates (small batches 100-1000 docs) only 5-15% CPU -server 2048Gb option of JVM (which is JRockit) + 256M for RAM Buffer

Using Lucene's payload in Solr

2009-08-11 Thread Bill Au
It looks like things have changed a bit since this subject was last brought up here. I see that there is support in Solr/Lucene for indexing payload data (DelimitedPayloadTokenFilterFactory and DelimitedPayloadTokenFilter). Overriding the Similarity class is straightforward. So the last piece

RE: Performance Tuning: segment_merge:index_update=5:1 (timing)

2009-08-11 Thread Fuad Efendi
Forgot to add: committing only once a day. I tried mergeFactor=1000 and performance of index write was extremely good (more than 50,000,000 updates during part of a day). However, commit was taking 2 days or more and I simply killed the process (suspecting that it could break my hard drive); I had about

DIH problem passing HTTP parameters into data-config

2009-08-11 Thread John Lowe
I've read the documentation as carefully as I can, but I must be missing something. I'm running Solr 1.3. The doc sez that I can pass my own parameters in to DIH via the HTTP request: http://wiki.apache.org/solr/DataImportHandler#head-520f8e527d9da55e8ed1e274e29709c8805c8eae What I'd

Re: NativeFSLockFactory, ConcurrentMergeScheduler: why locks?

2009-08-11 Thread Jason Rutherglen
1 minute of document updates (about 100,000 documents) and then SOLR stops 100,000 docs in a minute is a lot. Lucene is probably automatically flushing to disk and merging which is tying up the IO subsystem. You may want to set the ConcurrentMergeScheduler to 1 thread (which in Solr cannot be

Re: DIH problem passing HTTP parameters into data-config

2009-08-11 Thread John Lowe
Oops, the url attribute of the entity element in the dataConfig snippet should read: url=${dataimporter.request.feed} to match the http parameter... John

Re: Functions in search result

2009-08-11 Thread Chris Hostetter
: As far as I know, functions are executed on a per-document/field basis. : That is, I don't think any of them aggregate numeric field values from a : result set. correct. it sounds like what you are looking for is the StatsComponent... http://wiki.apache.org/solr/StatsComponent :
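As a rough illustration of the StatsComponent pointer above, the sketch below builds a request URL using the component's `stats` and `stats.field` parameters. The field name `price`, the helper name `stats_url`, and the base URL are assumptions for the example and do not come from the thread.

```python
from urllib.parse import urlencode


def stats_url(query, field, base_url="http://localhost:8983/solr"):
    """Build a select URL asking StatsComponent to aggregate one numeric field.

    `stats_url`, `base_url` and the example field are hypothetical.
    """
    params = [
        ("q", query),
        ("rows", 0),             # only the aggregates are wanted, not documents
        ("stats", "true"),       # enable StatsComponent
        ("stats.field", field),  # numeric field to aggregate over the result set
    ]
    return base_url + "/select?" + urlencode(params)


print(stats_url("*:*", "price"))
```

The response would then carry aggregate values for that field over the matching documents, rather than a per-document function result.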

Indexing date into multiple fields

2009-08-11 Thread Bernadette Houghton
Am very new to SOLR, so this question may seem overly basic - In schema.xml, I have a date field type - <fieldType name="date" class="solr.DateField" sortMissingLast="true" omitNorms="true"/> used by - <dynamicField name="*_dt" type="date" indexed="true" stored="true"/> <dynamicField name="*_mdt"

Re: Performance Tuning: segment_merge:index_update=5:1 (timing)

2009-08-11 Thread Grant Ingersoll
Is there a time of day you could schedule merges? See http://www.lucidimagination.com/search/document/bd53b0431f7eada5/concurrentmergescheduler_and_mergepolicy_question Or, you might be able to implement a scheduler that only merges the small segments, and then does the larger ones at slow