Re: Delete / filter / hide query results

2009-01-07 Thread DODMax

Please, I really need some help to solve this.
Does anybody have any idea at all?

Thank you.

-- 
View this message in context: 
http://www.nabble.com/Delete---filter---hide-query-results-tp21287332p21326944.html
Sent from the Solr - User mailing list archive at Nabble.com.



Solr query for date

2009-01-07 Thread prerna07

Hi,

 What would be the syntax of this SQL query
 SELECT * FROM table WHERE date > SYSDATE AND date < SYSDATE+45
 in Solr format?

 I need to fetch records where the date is between the current date and 45 days
from today.

Thanks,
Prerna
-- 
View this message in context: 
http://www.nabble.com/Solr-query-for-date-tp21327696p21327696.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Solr query for date

2009-01-07 Thread prerna07


Is it necessary to define date_field as <fieldType name="date"
class="solr.DateField"/> in schema.xml, or can the Solr query work on a text
field type?


Akshay-8 wrote:
 
 You can use DateMath as:
 
 date_field:[NOW TO NOW+45DAYS]
 
 On Wed, Jan 7, 2009 at 3:00 PM, prerna07 pkhandelw...@sapient.com wrote:
 

 Hi,

  What would be the syntax of this SQL query
  SELECT * FROM table WHERE date > SYSDATE AND date < SYSDATE+45
  in Solr format?

  I need to fetch records where the date is between the current date and 45 days
 from today.

 Thanks,
 Prerna
 --
 View this message in context:
 http://www.nabble.com/Solr-query-for-date-tp21327696p21327696.html
 Sent from the Solr - User mailing list archive at Nabble.com.


 
 
 -- 
 Regards,
 Akshay Ukey.
 
 Enjoy your job, make lots of money, work within the law. Choose any two.
 -Author Unknown.
 
 

-- 
View this message in context: 
http://www.nabble.com/Solr-query-for-date-tp21327696p21328961.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: date range query performance

2009-01-07 Thread Erik Hatcher


On Jan 6, 2009, at 9:17 PM, Jim Adams wrote:

Can someone explain what this means to me?


The field definition below sets the timestamp field with no time
granularity, just the day.  It's the difference between, say, having indexed
a document for every millisecond in a day (what is that, 86.4M?), and a
single term for the single date.


I'm having a similar performance issue - it's an index with only 1 million
records or so, but when trying to search on a date range it takes 30
seconds!  Yes, this date is one with hours, minutes, and seconds in it -- do I
need to create an additional field without the time component and reindex
all my documents so I can get decent search performance?  Or can I tell Solr
"Please ignore the time and do something in a reasonable timeframe" (GRIN)


Do you care about milliseconds, seconds, minutes, or hours in terms of  
searching?  If not, it's a very good idea to reduce the granularity  
and thus the number of unique terms.
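
For example, rounding both endpoints of the range to day granularity keeps
the number of terms involved small (the field name here is just for
illustration):

    timestamp:[NOW/DAY-30DAYS TO NOW/DAY+1DAY]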


Erik





Thanks.

On Fri, Oct 31, 2008 at 10:28 PM, Michael Lackhoff mich...@lackhoff.de 
wrote:



On 01.11.2008 06:10 Erik Hatcher wrote:


Yeah, this should work fine:

   <field name="timestamp" type="date" indexed="true" stored="true"
default="NOW/DAY" multiValued="false"/>


Wow, that was fast, thanks!

-Michael





Re: Solr query for date

2009-01-07 Thread Akshay
You can use DateMath as:

date_field:[NOW TO NOW+45DAYS]
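
If you are querying from Java, a minimal SolrJ sketch of the same query
(the field name and server URL are assumptions) would be:

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class DateRangeQuery {
        public static void main(String[] args) throws Exception {
            // Assumed Solr URL; point this at your own instance.
            CommonsHttpSolrServer server =
                new CommonsHttpSolrServer("http://localhost:8983/solr");
            // DateMath range: from now through 45 days from now.
            SolrQuery q = new SolrQuery("date_field:[NOW TO NOW+45DAYS]");
            QueryResponse rsp = server.query(q);
            System.out.println("matches: " + rsp.getResults().getNumFound());
        }
    }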

On Wed, Jan 7, 2009 at 3:00 PM, prerna07 pkhandelw...@sapient.com wrote:


 Hi,

  What would be the syntax of this SQL query
  SELECT * FROM table WHERE date > SYSDATE AND date < SYSDATE+45
  in Solr format?

  I need to fetch records where the date is between the current date and 45 days
 from today.

 Thanks,
 Prerna
 --
 View this message in context:
 http://www.nabble.com/Solr-query-for-date-tp21327696p21327696.html
 Sent from the Solr - User mailing list archive at Nabble.com.




-- 
Regards,
Akshay Ukey.

Enjoy your job, make lots of money, work within the law. Choose any two.
-Author Unknown.


Clustering Carrot2 + Solr

2009-01-07 Thread Jean-Philip EIMECKE
Hi!

I want to set up clustering but I don't know how to go about it.
I have seen that using the SOLR-769 patch was advised, but I don't know what
I am supposed to do with clustering-libs.tar and SOLR-769.patch.
Can you explain the procedure to run clustering with Solr and Carrot2?

Thank you in advance

-- 
Jean-Philip Eimecke


Re: date range query performance

2009-01-07 Thread Erick Erickson
You'll have to search the archives for a more complete explanation; I'm
going from memory here (or perhaps it's on the Wiki, I don't remember).

The notion is to break apart your timestamp (if you really, really need the
precision) into several fields rather than one, i.e. index the YYYYMMDD
as one field, then perhaps HHMMSS as a second field and, perhaps, milliseconds
as a third field. This *greatly* reduces the number of unique terms and
should improve searching on ranges, not to mention sorting. You'll have to
manipulate the timestamp part of the query.
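
As a rough sketch, the per-field values could be produced at index time
like this (the formats, and the idea of feeding each string to its own
field, are just for illustration):

    import java.text.SimpleDateFormat;
    import java.util.Date;

    public class SplitTimestamp {
        public static void main(String[] args) {
            Date ts = new Date();
            // Coarse part: one unique term per day instead of per millisecond.
            String dayPart = new SimpleDateFormat("yyyyMMdd").format(ts);
            // Finer part, only needed if you want sub-day precision.
            String timePart = new SimpleDateFormat("HHmmss").format(ts);
            System.out.println(dayPart + " / " + timePart); // e.g. 20090107 / 153012
        }
    }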

There are variations on the scheme; you could have six fields, for instance
YYYY, MM, DD, HH, SS, MS. Or...

But the very best solution is to do as Erik (no relation) H. suggests: just
reindex with, say, day granularity if that's fine enough.

Best
Erick

On Wed, Jan 7, 2009 at 6:03 AM, Erik Hatcher e...@ehatchersolutions.com wrote:


 On Jan 6, 2009, at 9:17 PM, Jim Adams wrote:

 Can someone explain what this means to me?


 The field definition below sets the timestamp field with no time
 granularity, just the day.  It's the difference between, say, having indexed a
 document for every millisecond in a day (what is that, 86.4M?), and a single
 term for the single date.

  I'm having a similar performance issue - it's an index with only 1 million
 records or so, but when trying to search on a date range it takes 30
 seconds!  Yes, this date is one with hours, minutes, and seconds in it -- do I
 need to create an additional field without the time component and reindex
 all my documents so I can get decent search performance?  Or can I tell Solr
 "Please ignore the time and do something in a reasonable timeframe" (GRIN)


 Do you care about milliseconds, seconds, minutes, or hours in terms of
 searching?  If not, it's a very good idea to reduce the granularity and thus
 the number of unique terms.

Erik





 Thanks.

 On Fri, Oct 31, 2008 at 10:28 PM, Michael Lackhoff mich...@lackhoff.de
 wrote:

  On 01.11.2008 06:10 Erik Hatcher wrote:

  Yeah, this should work fine:

   <field name="timestamp" type="date" indexed="true" stored="true"
 default="NOW/DAY" multiValued="false"/>


 Wow, that was fast, thanks!

 -Michael





Is there any better way to configure db-data-config.xml

2009-01-07 Thread Manupriya

Hi,

I am using the following schema - 

http://www.nabble.com/file/p21332196/table_stuct.gif 

1. INSTITUTION table is the main table that has information about all the
institutions.
2. INSTITUTION_TYPE table has the 'institute_type' and its 'description' for
each 'institute_type_id' in the INSTITUTION table.
3. INSTITUTION_SOURCE_MAP table is a mapping table. It maps an institution_id
to a source_id from an external system.

NOTE - The INSTITUTION table is a union of institutions created internally AND
institutions corresponding to source_ids from external systems.

Requirement -
1. Search institutions by 'institution_name' in the INSTITUTION table.
2. Display institution_type for each institution_type_id.
3. Users should be able to search for an institution by 'source_id' and
'source_entity_name'.

My db-data-config.xml is the following -

===
<dataConfig>
  <dataSource driver="net.sourceforge.jtds.jdbc.Driver"
              url="jdbc:jtds:sqlserver://localhost:1433/dummy-master"
              user="dummy-master" password="dummy-master" />
  <document name="institution">
    <entity name="INSTITUTION" pk="institution_id"
            query="select * from INSTITUTION"
            deltaQuery="select institution_id from INSTITUTION where
              last_update_date > '${dataimporter.last_index_time}'">
      <field column="institution_id" name="id" />
      <field column="institution_name" name="institutionName" />
      <field column="description" name="description" />
      <field column="institution_type_id" name="institutionTypeId" />

      <entity name="INSTITUTION_TYPE" pk="institution_type_id"
              query="select institution_type from INSTITUTION_TYPE where
                institution_type_id='${INSTITUTION.institution_type_id}'"
              parentDeltaQuery="select institution_type_id from INSTITUTION where
                institution_type_id=${INSTITUTION_TYPE.institution_type_id}">
        <field name="institutionType" column="institution_type" />
      </entity>
    </entity>

    <entity name="INSTITUTION_SOURCE_MAP" pk="institution_id, source_id,
              source_entity_name, source_key, source_key_field"
            query="select * from INSTITUTION_SOURCE_MAP">
      <field column="source_id" name="sourceId" />
      <field column="source_entity_name" name="sourceEntityName" />

      <entity name="INSTITUTION" pk="institution_id"
              query="select * from INSTITUTION where
                institution_id = '${INSTITUTION_SOURCE_MAP.institution_id}'">
        <field column="institution_id" name="id" />
        <field column="institution_name" name="institutionName" />
        <field column="description" name="description" />
        <field column="institution_type_id" name="institutionTypeId" />

        <entity name="INSTITUTION_TYPE" pk="institution_type_id"
                query="select institution_type from INSTITUTION_TYPE where
                  institution_type_id='${INSTITUTION.institution_type_id}'"
                parentDeltaQuery="select institution_type_id from INSTITUTION where
                  institution_type_id=${INSTITUTION_TYPE.institution_type_id}">
          <field name="institutionType" column="institution_type" />
        </entity>
      </entity>
    </entity>
  </document>
</dataConfig>
===

My configuration file is working perfectly fine. I have specified two
<entity> elements inside one <document>, and both entities have further nested
<entity> tags.

Can anyone suggest whether there is any other/better way to configure the
relationship?

I have referred http://wiki.apache.org/solr/DataImportHandler and
http://download.boulder.ibm.com/ibmdl/pub/software/dw/java/j-solr-update-pdf.pdf

Is there any resource that has detailed information about tags used in
db-data-config.xml?

Thanks,
Manu

-- 
View this message in context: 
http://www.nabble.com/Is-there-any-better-way-to-configure-db-data-config.xml-tp21332196p21332196.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Partitioning the index

2009-01-07 Thread Jim Adams
That is what I thought.  Thanks.

On Tue, Jan 6, 2009 at 10:18 PM, Shalin Shekhar Mangar 
shalinman...@gmail.com wrote:

 You'll need to re-index.

 On Wed, Jan 7, 2009 at 9:49 AM, Jim Adams jasolru...@gmail.com wrote:

  It's a range query.  I don't have any faceted data.
 
  Can I limit the precision of the existing field, or must I re-index?
 
  Thanks.
 
  On Tue, Jan 6, 2009 at 8:41 PM, Yonik Seeley ysee...@gmail.com wrote:
 
   On Tue, Jan 6, 2009 at 10:06 PM, Jim Adams jasolru...@gmail.com
 wrote:
Are there any particular suggestions on memory size for a machine?  I have a
box that has only 1 million records on it - yet I'm finding that date
searches are already unacceptably slow (30 seconds).  Other searches seem
okay though.
  
   I assume this is a date range query (or date faceting)?
   Range queries with many unique terms in the range are a known
   limitation, and we should hopefully fix this in 1.4.
   In the meantime, limiting the precision of dates could help a great
   deal.
  
   -Yonik
  
 



 --
 Regards,
 Shalin Shekhar Mangar.



encountered the "Cannot allocate memory" when calling snapshooter program after optimize command

2009-01-07 Thread Justin Yao

Hi,

I configured Solr to listen on the postOptimize event and call the
snapshooter program after an optimize command. It works well when the Java
heap size is set to less than 4G. But if I increase the Java heap size to
5G, the snapshooter program can't be successfully called after the optimize
command; the error message is here:


SEVERE: java.io.IOException: Cannot run program
"/home/solr_1.3/solr/bin/snapshooter" (in directory
"/home/solr_1.3/solr/bin"): java.io.IOException: error=12, Cannot
allocate memory

at java.lang.ProcessBuilder.start(ProcessBuilder.java:459)
at java.lang.Runtime.exec(Runtime.java:593)

Here is my server platform:

OS: CentOS 5.2 x86_64
Memory: 8G
Solr: 1.3


Any suggestion is appreciated.

Thanks,
Justin


Solr expert(s) needed

2009-01-07 Thread Tony Wang
I would like to build a search engine that indexes online videos from
websites such as Metacafe, YouTube, etc. I want to use the open-source Solr
as the indexing tool with Nutch as the web crawler. However, I have
difficulties integrating these two open-source products, so I am seeking
help and will compensate you for the time spent. If you are interested,
please send me your hourly rate and the estimated hours needed to get this
done.

My email is: ivytony [at] gmail dot com

Thanks!

Tony

-- 
Are you RCholic? www.RCholic.com
温 良 恭 俭 让 仁 义 礼 智 信


Re: Plans for 1.3.1?

2009-01-07 Thread Ryan McKinley
there are plans for a regular release (1.4) later this month.  No  
plans for bug fix release.


If there are critical bugs there would be a bug fix release, but not  
for minor ones.



On Jan 7, 2009, at 11:06 AM, Jerome L Quinn wrote:



Hi, all.  Are there any plans for putting together a bugfix  
release?  I'm
not looking for particular bugs, but would like to know if bug fixes  
are

only going to be done mixed in with new features.

Thanks,
Jerry Quinn




Re: Plans for 1.3.1?

2009-01-07 Thread William Pierce
That is fantastic!  Will the Java replication support be included in this 
release?


Thanks,

- Bill

--
From: Ryan McKinley ryan...@gmail.com
Sent: Wednesday, January 07, 2009 11:42 AM
To: solr-user@lucene.apache.org
Subject: Re: Plans for 1.3.1?

there are plans for a regular release (1.4) later this month.  No  plans 
for bug fix release.


If there are critical bugs there would be a bug fix release, but not  for 
minor ones.



On Jan 7, 2009, at 11:06 AM, Jerome L Quinn wrote:



Hi, all.  Are there any plans for putting together a bugfix  release? 
I'm

not looking for particular bugs, but would like to know if bug fixes  are
only going to be done mixed in with new features.

Thanks,
Jerry Quinn





Re: encountered the "Cannot allocate memory" when calling snapshooter program after optimize command

2009-01-07 Thread Otis Gospodnetic
Justin,

Please check the solr-user archive on markmail.org and search for
"overcommit" or even "Whitman" to find a recent thread that I believe has
the answer to this question.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
 From: Justin Yao jus...@snooth.com
 To: solr-user@lucene.apache.org
 Sent: Wednesday, January 7, 2009 11:16:05 AM
 Subject: encountered the "Cannot allocate memory" when calling snapshooter
 program after optimize command
 
 Hi,
 
 I configured solr to listen on postOptimize event and call the snapshooter 
 program after an optimize command. It works well when the Java heap size is 
 set 
 to less than 4G. But if I increased the java heap size to 5G, the snapshooter 
 program can't be successfully called after the optimize command and error 
 message is here:
 
 SEVERE: java.io.IOException: Cannot run program
 "/home/solr_1.3/solr/bin/snapshooter" (in directory
 "/home/solr_1.3/solr/bin"):
 java.io.IOException: error=12, Cannot allocate memory
 at java.lang.ProcessBuilder.start(ProcessBuilder.java:459)
 at java.lang.Runtime.exec(Runtime.java:593)
 
 Here is my server platform:
 
 OS: CentOS 5.2 x86_64
 Memory: 8G
 Solr: 1.3
 
 
 Any suggestion is appreciated.
 
 Thanks,
 Justin



Re: Clustering Carrot2 + Solr

2009-01-07 Thread Otis Gospodnetic
Hi,

Most likely (didn't look at SOLR-769) you need to:
1) apply the patch
2) untar the .tar file and copy the jars from it to solr home's lib/ dir


But the patch may be outdated and may not apply cleanly.

Otis 
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
 From: Jean-Philip EIMECKE jpeime...@gmail.com
 To: solr-user@lucene.apache.org
 Sent: Wednesday, January 7, 2009 6:46:18 AM
 Subject: Clustering Carrot2 + Solr
 
 Hi!
 
 I want to set up clustering but I don't know how to go about it.
 I have seen that using the SOLR-769 patch was advised, but I don't know what
 I am supposed to do with clustering-libs.tar and SOLR-769.patch.
 Can you explain the procedure to run clustering with Solr and Carrot2?

 Thank you in advance
 
 -- 
 Jean-Philip Eimecke



Re: cannot allocate memory for snapshooter

2009-01-07 Thread Mark Miller

Brian Whitman wrote:

On Sun, Jan 4, 2009 at 9:47 PM, Mark Miller markrmil...@gmail.com wrote:

  

Hey Brian, I didn't catch what OS you are using on EC2 by the way. I
thought most UNIX OS's were using memory overcommit - A quick search brings
up Linux, AIX, and HP-UX, and maybe even OSX?

What are you running over there? EC2, so Linux I assume?




This is on debian, a 2.6.21 x86_64 kernel
Interesting. Well, it must not be overcommitting then. I think that's your
only hope without a serious pain in the butt.


Check your settings - you should be able to do it through /proc. Here is
a bit of info cut and pasted from the web. I am *guessing* that you are
in mode 0, and the heuristic is worried you are going to try to use
that RAM. Perhaps try 1:


echo 1 > /proc/sys/vm/overcommit_memory
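
To check the current mode before changing anything, you can read the same
proc file back:

cat /proc/sys/vm/overcommit_memory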

   0 - Heuristic overcommit handling. Obvious overcommits of address
space are refused. Used for a typical system. It ensures a seriously
wild allocation fails while allowing overcommit to reduce swap usage.
root is allowed to allocate slightly more memory in this mode. This is
the default.

   1 - Always overcommit.
   2 - Don't overcommit. The total address space commit for the system 
is not permitted to exceed swap plus a configurable percentage (default 
is 50) of physical RAM. Depending on the percentage you use, in most 
situations this means a process will not be killed while attempting to 
use already-allocated memory but will receive errors on memory 
allocation as appropriate.


Re: Plans for 1.3.1?

2009-01-07 Thread Ryan McKinley

Yes, check:
http://svn.apache.org/repos/asf/lucene/solr/trunk/CHANGES.txt

for stuff that is already in 1.4, and
https://issues.apache.org/jira/secure/IssueNavigator.jspa?reset=true&mode=hide&sorter/order=DESC&sorter/field=priority&resolution=-1&pid=12310230&fixfor=12313351

for stuff that may be in the release.

ryan


On Jan 7, 2009, at 12:14 PM, William Pierce wrote:

That is fantastic!  Will the Java replication support be included in  
this release?


Thanks,

- Bill

--
From: Ryan McKinley ryan...@gmail.com
Sent: Wednesday, January 07, 2009 11:42 AM
To: solr-user@lucene.apache.org
Subject: Re: Plans for 1.3.1?

there are plans for a regular release (1.4) later this month.  No   
plans for bug fix release.


If there are critical bugs there would be a bug fix release, but  
not  for minor ones.



On Jan 7, 2009, at 11:06 AM, Jerome L Quinn wrote:



Hi, all.  Are there any plans for putting together a bugfix   
release? I'm
not looking for particular bugs, but would like to know if bug  
fixes  are

only going to be done mixed in with new features.

Thanks,
Jerry Quinn






Re: debugging long commits

2009-01-07 Thread Mike Klaas


Hi Brian,

You might want to follow up on the Lucene list (java-u...@lucene.apache.org 
).  Something was causing problems with the merging and thus you ended  
up with too many segments (hence the slow commits).  I doubt that you  
lost anything--usually the merge function doesn't modify the index  
until the merge is complete.  But I am not familiar enough with this  
code in lucene to be sure.


-Mike

On 2-Jan-09, at 10:17 AM, Brian Whitman wrote:


I think I'm getting close with this (sorry for the self-replies)

I tried an optimize (which we never do) and it took 30m and said  
this a lot:


Exception in thread "Lucene Merge Thread #4"
org.apache.lucene.index.MergePolicy$MergeException:
java.lang.ArrayIndexOutOfBoundsException: Array index out of range: 34950
    at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:314)
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:291)
Caused by: java.lang.ArrayIndexOutOfBoundsException: Array index out of range: 34950
    at org.apache.lucene.util.BitVector.get(BitVector.java:91)
    at org.apache.lucene.index.SegmentTermDocs.next(SegmentTermDocs.java:125)
    at org.apache.lucene.index.SegmentTermPositions.next(SegmentTermPositions.java:98)
    at org.apache.lucene.index.SegmentMerger.appendPostings(SegmentMerger.java:633)
    at org.apache.lucene.index.SegmentMerger.mergeTermInfo(SegmentMerger.java:585)
    at org.apache.lucene.index.SegmentMerger.mergeTermInfos(SegmentMerger.java:546)
    at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:499)
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:139)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4291)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3932)
    at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:205)
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:260)

Jan 2, 2009 6:05:49 PM org.apache.solr.common.SolrException log
SEVERE: java.io.IOException: background merge hit exception: _ks4:C2504982
_oaw:C514635 _tll:C827949 _tdx:C18372 _te8:C19929 _tej:C22201 _1agw:C1717926
into _1agy [optimize]
    at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2346)
    at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2280)
    at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:355)
    at org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcesso

...


But then it finished. And now commits are OK again.

Anyone know what the merge hit exception means and if I lost anything?




Re: Subscribe Me

2009-01-07 Thread Mike Klaas

Kalidoss,

You can subscribe here:
http://lucene.apache.org/solr/mailing_lists.html

regards,
-Mike

On 5-Jan-09, at 4:19 AM, kalidoss wrote:



Thanks,
kalidoss.m,


** DISCLAIMER **
Information contained and transmitted by this E-MAIL is proprietary  
to Sify Limited and is intended for use only by the individual or  
entity to which it is addressed, and may contain information that is  
privileged, confidential or exempt from disclosure under applicable  
law. If this is a forwarded message, the content of this E-MAIL may  
not have been sent with the authority of the Company. If you are not  
the intended recipient, an agent of the intended recipient or a   
person responsible for delivering the information to the named  
recipient,  you are notified that any use, distribution,  
transmission, printing, copying or dissemination of this information  
in any way or in any manner is strictly prohibited. If you have  
received this communication in error, please delete this mail & notify us
immediately at ad...@sifycorp.com




Re: Plans for 1.3.1?

2009-01-07 Thread William Pierce

Thanks, Ryan!

It is great that Solr replication (SOLR-561) is included in this release.
One thing I want to confirm, if Noble, Shalin, et al. can help:


I had encountered an issue a while back (in late October I believe) with 
using SOLR-561.  I was getting an error (AlreadyClosedException) from the 
slave code which caused the replication to fail.  I was wondering if this 
had been fixed.


Mark Miller had helped diagnose the problem and suggested a source code 
change.


http://www.nabble.com/forum/ViewPost.jtp?post=20505307framed=y

Thanks,

- Bill




Re: Setting up DataImportHandler for Oracle datasource on JBoss

2009-01-07 Thread The Flight Captain

Thanks Paul, that fixed the problem.


the root node is not <dataconf>
it should be
<dataConfig>

-- 
View this message in context: 
http://www.nabble.com/Setting-up-DataImportHandler-for-Oracle-datasource-on-JBoss-tp21305824p21342413.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Plans for 1.3.1?

2009-01-07 Thread Mark Miller

William Pierce wrote:

Thanks, Ryan!

It is great that Solr replication (SOLR-561) is included in this 
release. One thing I want to confirm (if Noble, Shalin et al) can help:


I had encountered an issue a while back (in late October I believe) 
with using SOLR-561.  I was getting an error (AlreadyClosedException) 
from the slave code which caused the replication to fail.  I was 
wondering if this had been fixed.


Mark Miller had helped diagnose the problem and suggested a source 
code change.


http://www.nabble.com/forum/ViewPost.jtp?post=20505307framed=y

Thanks,

- Bill


Hey Bill, I'll update you on this. It was a bug in Lucene that is
sidestepped in Solr 1.4 (as I mentioned in the original thread, a patch
switched to using Lucene methods that don't tickle the bug), so it will
be fixed in 1.4. Also, the original bug was fixed in Lucene, and
Solr 1.4 will contain a version of Lucene with that fix, so we should be
doubly fixed here ;)


- Mark


Re: Plans for 1.3.1?

2009-01-07 Thread William Pierce

Hi, Mark:

Thanks for the update... Looking forward to 1.4!

Cheers,

- Bill

--
From: Mark Miller markrmil...@gmail.com
Sent: Wednesday, January 07, 2009 4:48 PM
To: solr-user@lucene.apache.org
Subject: Re: Plans for 1.3.1?


William Pierce wrote:

Thanks, Ryan!

It is great that Solr replication (SOLR-561) is included in this 
release. One thing I want to confirm (if Noble, Shalin et al) can help:


I had encountered an issue a while back (in late October I believe) 
with using SOLR-561.  I was getting an error (AlreadyClosedException) 
from the slave code which caused the replication to fail.  I was 
wondering if this had been fixed.


Mark Miller had helped diagnose the problem and suggested a source 
code change.


http://www.nabble.com/forum/ViewPost.jtp?post=20505307framed=y

Thanks,

- Bill


Hey Bill, I'll update you on this. It was a bug in Lucene that is
sidestepped in Solr 1.4 (as I mentioned in the original thread, a patch
switched to using Lucene methods that don't tickle the bug), so it will
be fixed in 1.4. Also, the original bug was fixed in Lucene, and
Solr 1.4 will contain a version of Lucene with that fix, so we should be
doubly fixed here ;)


- Mark



Re: Using Solr with an existing Lucene index

2009-01-07 Thread The Flight Captain

My first attempt to do this resulted in my Java program throwing a
CorruptIndexException. It appears as though Solr has somehow modified my
index files in a way that causes the Lucene code to see them as corrupt
(even though I did not, at least intentionally, try to post any documents or
otherwise update the index through Lucene).

Did you manage to retrieve any data from your existing datasource through
Solr using the existing index?

If so, how? Is it just a matter of pointing your data directory at your
existing index data in solrconfig.xml, for example:
  <dataDir>/my/existing/lucene/index/data</dataDir> ?
-- 
View this message in context: 
http://www.nabble.com/Using-Solr-with-an-existing-Lucene-index-tp20002395p21344616.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Clustering Carrot2 + Solr

2009-01-07 Thread Grant Ingersoll

Hi Jean-Philip,

The patch should be standalone in that it creates an area under  
contrib, but it may not be completely up to date, since there have  
been some minor tweaks to the ANT builds for contrib since I wrote the  
clustering stuff.   However, it should still work once you get past  
that.


So, to get it working, apply the patch, and then put the clustering- 
libs.tar into the contrib/clustering/lib directory (I think the dir is  
called clustering).


If I recall, you should be able to do an "ant example" from the top level
and it should add the libs into the WAR, etc. (although this will be changed
when I get a chance to update the patch).
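
In rough shell terms, the whole procedure might look like this (all paths
here are assumptions about where your trunk checkout and downloads live):

    cd solr-trunk                      # your Solr trunk checkout (assumed)
    patch -p0 -i SOLR-769.patch        # apply the patch
    tar -xf clustering-libs.tar        # unpack the jars
    cp *.jar contrib/clustering/lib/   # assumes the jars unpack into the current dir
    ant example                        # rebuild the example WAR with the new libs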


From there, have a look at http://wiki.apache.org/solr/ClusteringComponent

I may have a moment or two tomorrow, in which case I can look at any  
specific issues you might have.


Cheers,
Grant


On Jan 7, 2009, at 6:46 AM, Jean-Philip EIMECKE wrote:


Hi!

I want to set up clustering but I don't know how to go about it.
I have seen that using the SOLR-769 patch was advised, but I don't know what
I am supposed to do with clustering-libs.tar and SOLR-769.patch.
Can you explain the procedure to run clustering with Solr and Carrot2?

Thank you in advance

--
Jean-Philip Eimecke






Amount range and facet fields returns [facet_fields]

2009-01-07 Thread Yevgeniy Belman
Hi,

I am curious if this is expected behavior for a typical facet with a range
query:
params={facet=true,facet.query=[price:[* TO 500], price:[500 TO
*]],q=*:*,facet.field=price}

I am getting back not only the:
facet_queries={price:[* TO 500]=2,price:[500 TO *]=3}

but also:
facet_fields={price={150.99=1,199.99=1,699.99=1,930.0=1,2300.0=1}}

I was searching for a way to create the elusive dynamic amount ranges, and
this would allow me to do it on the client side for sure, but is this right?
Will Solr send all of this data back? It consists of every price variant
with a count. In my case all 5 prices were unique. If real data were used,
could this be too big to send across?

SimpleFacets.getFacetCounts() populates the response object with it. Is this
an appropriate place to calculate my dynamic amount ranges and replace
facet_fields:
res.add("facet_queries", getFacetQueryCounts());
res.add("facet_fields", getFacetFieldCounts()); // <-- replace with
dynamic facets
res.add("facet_dates", getFacetDateCounts());

or should it be done sooner, somewhere in getFieldCacheCounts() where the
iteration through all the 5 docs is happening?

Thanks,
Yev


Re: Is there any better way to configure db-data-config.xml

2009-01-07 Thread Noble Paul നോബിള്‍ नोब्ळ्
why do you have multiple root entities?



On Wed, Jan 7, 2009 at 7:48 PM, Manupriya manupriya.si...@gmail.com wrote:

 Hi,

 I am using the following schema -

 http://www.nabble.com/file/p21332196/table_stuct.gif

 1. INSTITUTION table is the main table that has information about all the
 institutions.
 2. INSTITUTION_TYPE table has 'institute_type' and its 'description' for
 each 'institute_type_id' in the INSTITUTION table.
 3. INSTITUTION_SOURCE_MAP table is a mapping table. This has institution_id
 corresponding to source_id from external system.

 NOTE - INSTITUTION table is union of institutions created internally AND
 institutions corresponding to source_ids from external systems.

 Requirement -
 1. Search Institutions by 'institution_name' in the INSTITUTION table.
 2. Display institution_type for institution_type_id.
 3. user should be able to search for institution by 'source_id' and
 'source_entity_name'.

 My db-data-config.xml is following -

 ===
 <dataConfig>
   <dataSource driver="net.sourceforge.jtds.jdbc.Driver"
               url="jdbc:jtds:sqlserver://localhost:1433/dummy-master"
               user="dummy-master" password="dummy-master" />
   <document name="institution">
     <entity name="INSTITUTION" pk="institution_id"
             query="select * from INSTITUTION"
             deltaQuery="select institution_id from INSTITUTION where
               last_update_date > '${dataimporter.last_index_time}'">
       <field column="institution_id" name="id" />
       <field column="institution_name" name="institutionName" />
       <field column="description" name="description" />
       <field column="institution_type_id" name="institutionTypeId" />

       <entity name="INSTITUTION_TYPE" pk="institution_type_id"
               query="select institution_type from INSTITUTION_TYPE where
                 institution_type_id='${INSTITUTION.institution_type_id}'"
               parentDeltaQuery="select institution_type_id from INSTITUTION where
                 institution_type_id=${INSTITUTION_TYPE.institution_type_id}">
         <field name="institutionType" column="institution_type" />
       </entity>
     </entity>

     <entity name="INSTITUTION_SOURCE_MAP" pk="institution_id, source_id,
               source_entity_name, source_key, source_key_field"
             query="select * from INSTITUTION_SOURCE_MAP">
       <field column="source_id" name="sourceId" />
       <field column="source_entity_name" name="sourceEntityName" />

       <entity name="INSTITUTION" pk="institution_id"
               query="select * from INSTITUTION where
                 institution_id = '${INSTITUTION_SOURCE_MAP.institution_id}'">
         <field column="institution_id" name="id" />
         <field column="institution_name" name="institutionName" />
         <field column="description" name="description" />
         <field column="institution_type_id" name="institutionTypeId" />

         <entity name="INSTITUTION_TYPE" pk="institution_type_id"
                 query="select institution_type from INSTITUTION_TYPE where
                   institution_type_id='${INSTITUTION.institution_type_id}'"
                 parentDeltaQuery="select institution_type_id from INSTITUTION where
                   institution_type_id=${INSTITUTION_TYPE.institution_type_id}">
           <field name="institutionType" column="institution_type" />
         </entity>
       </entity>
     </entity>
   </document>
 </dataConfig>
 ===

 My configuration file is working perfectly fine. I have specified two
 <entity> elements inside one <document>, and both entities have further nested
 <entity> tags.

 Can anyone suggest whether there is any other/better way to configure the
 relationship?

 I have referred http://wiki.apache.org/solr/DataImportHandler and
 http://download.boulder.ibm.com/ibmdl/pub/software/dw/java/j-solr-update-pdf.pdf

 Is there any resource that has detailed information about tags used in
 db-data-config.xml?

 Thanks,
 Manu

 --
 View this message in context: 
 http://www.nabble.com/Is-there-any-better-way-to-configure-db-data-config.xml-tp21332196p21332196.html
 Sent from the Solr - User mailing list archive at Nabble.com.





-- 
--Noble Paul


Re: Amount range and facet fields returns [facet_fields]

2009-01-07 Thread Shalin Shekhar Mangar
On Thu, Jan 8, 2009 at 9:49 AM, Yevgeniy Belman ysbel...@gmail.com wrote:

 Hi,

 I am curious if this is expected behavior in a typical facet with range
 query:
  params={facet=true,facet.query=[price:[* TO 500], price:[500 TO
  *]],q=*:*,facet.field=price}

 i am getting back not only the:
 facet_queries={price:[* TO 500]=2,price:[500 TO *]=3}

 but also:
  facet_fields={price={150.99=1,199.99=1,699.99=1,930.0=1,2300.0=1}}


You are adding both facet.query=price[... and the facet.field=price
parameters so the response contains results for both.
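
If you only want the range buckets, drop the facet.field parameter; a
request along these lines (host and path assumed) returns just the
facet_queries section:

    http://localhost:8983/solr/select?q=*:*&facet=true&facet.query=price:[* TO 500]&facet.query=price:[500 TO *]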

I was searching for a way to create the elusive dynamic amount ranges, and
 this would allow me to do it on the client side for sure, but is this
 right?
 Solr will send all of this data back? It consists of every price variant
 with a count. In my case 5 prices were all unique. If real data were used
 could this be too big to send across?

 SimpleFacet.getFacetCounts() populates the response object with it. Is this
 an appropriate place to calculate my dynamic amount ranges and replace
 facet_fields:
  res.add("facet_queries", getFacetQueryCounts());
  res.add("facet_fields", getFacetFieldCounts()); // <-- replace with
  dynamic facets
  res.add("facet_dates", getFacetDateCounts());

 or should it be done sooner, somewhere in getFieldCacheCounts() where the
  iteration through all the 5 docs is happening?


By dynamic ranges, do you mean that you want min and max values? If yes, you
can look at StatsComponent (it is a 1.4 feature, so you'll need to use the
nightly builds).

http://wiki.apache.org/solr/StatsComponent
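
For example, a request like the following (reusing the price field from
your example) would return the min, max, and other statistics for the field:

    http://localhost:8983/solr/select?q=*:*&stats=true&stats.field=price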

-- 
Regards,
Shalin Shekhar Mangar.


Re: Using Lucene index in Solr

2009-01-07 Thread The Flight Captain

Do I have to set the datasource that my index references?

My data is stored in a database; I want Solr to look up the data in that
database using my existing index. At the moment, I have set the <dataDir>
element in my solrconfig to point at my existing index, and checked the
schema of my existing index using Luke, but I can't get any results when
searching in Solr.

My index was created using hibernate-search. 

How can I retrieve my data in Solr using the existing Lucene index? I think
I need to set the database connection details somewhere, just not sure
where. I have set up a DataImportHandler, but I don't want that to
overwrite my existing index.


yonik wrote:
 
 On 6/21/06, Tricia Williams pgwil...@student.cs.uwaterloo.ca wrote:
  I was wondering if there are any major differences in building an index
 using Lucene and Solr.  If there are no substantial differences, how would
 one go about using an existing index created using Lucene in Solr?
 
 You can definitely do that for the majority of indicies w/o writing
 any code... you just need to make sure the schema matches what is in
 the index (make the analyzers for the field types compatible, etc).
 
 If you have access to the source code that built the index, start
 there.  If you don't then open up the index with Luke to see what you
 can find out.
 
 -Yonik
 
 

-- 
View this message in context: 
http://www.nabble.com/Using-Lucene-index-in-Solr-tp4983079p21346212.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Is there any better way to configure db-data-config.xml

2009-01-07 Thread Manupriya

Hi Noble,

In my case, Institutions can be entered in two different ways.

1. Institution information is directly present in the INSTITUTION table.
(Note - such institutions ARE NOT present in the INSTITUTION_SOURCE_MAP table.)

In this case, I have INSTITUTION as the parent entity, and INSTITUTION_TYPE
as the child entity, in order to retrieve the institution_type.

2. Institutions are mapped to the INSTITUTION table through the
INSTITUTION_SOURCE_MAP table. (In this case, users can search for
institutions based on INSTITUTION_SOURCE_MAP fields.)

So here the INSTITUTION_SOURCE_MAP table is the parent entity, and
INSTITUTION is the child entity.

How else can I specify the relationship?

Thanks,
Manu


Noble Paul നോബിള്‍ नोब्ळ् wrote:
 
 why do you have multiple root entities?
 
 
 
 On Wed, Jan 7, 2009 at 7:48 PM, Manupriya manupriya.si...@gmail.com
 wrote:

 Hi,

 I am using the following schema -

 http://www.nabble.com/file/p21332196/table_stuct.gif

 1. INSTITUTION table is the main table that has information about all the
 institutions.
 2. INSTITUTION_TYPE table has 'institute_type' and its 'description' for
 each 'institute_type_id' in the INSTITUTION table.
 3. INSTITUTION_SOURCE_MAP table is a mapping table. This has
 institution_id
 corresponding to source_id from external system.

 NOTE - INSTITUTION table is union of institutions created internally AND
 institutions corresponding to source_ids from external systems.

 Requirement -
 1. Search Institutions by 'institution_name' in the INSTITUTION table.
 2. Display institution_type for institution_type_id.
 3. user should be able to search for institution by 'source_id' and
 'source_entity_name'.

 My db-data-config.xml is following -

 ===
 <dataConfig>
   <dataSource driver="net.sourceforge.jtds.jdbc.Driver"
               url="jdbc:jtds:sqlserver://localhost:1433/dummy-master"
               user="dummy-master" password="dummy-master" />
   <document name="institution">
     <entity name="INSTITUTION" pk="institution_id"
             query="select * from INSTITUTION"
             deltaQuery="select institution_id from INSTITUTION where
               last_update_date > '${dataimporter.last_index_time}'">
       <field column="institution_id" name="id" />
       <field column="institution_name" name="institutionName" />
       <field column="description" name="description" />
       <field column="institution_type_id" name="institutionTypeId" />

       <entity name="INSTITUTION_TYPE" pk="institution_type_id"
               query="select institution_type from INSTITUTION_TYPE where
                 institution_type_id='${INSTITUTION.institution_type_id}'"
               parentDeltaQuery="select institution_type_id from INSTITUTION where
                 institution_type_id=${INSTITUTION_TYPE.institution_type_id}">
         <field name="institutionType" column="institution_type" />
       </entity>
     </entity>

     <entity name="INSTITUTION_SOURCE_MAP" pk="institution_id, source_id,
               source_entity_name, source_key, source_key_field"
             query="select * from INSTITUTION_SOURCE_MAP">
       <field column="source_id" name="sourceId" />
       <field column="source_entity_name" name="sourceEntityName" />

       <entity name="INSTITUTION" pk="institution_id"
               query="select * from INSTITUTION where
                 institution_id = '${INSTITUTION_SOURCE_MAP.institution_id}'">
         <field column="institution_id" name="id" />
         <field column="institution_name" name="institutionName" />
         <field column="description" name="description" />
         <field column="institution_type_id" name="institutionTypeId" />

         <entity name="INSTITUTION_TYPE" pk="institution_type_id"
                 query="select institution_type from INSTITUTION_TYPE where
                   institution_type_id='${INSTITUTION.institution_type_id}'"
                 parentDeltaQuery="select institution_type_id from INSTITUTION where
                   institution_type_id=${INSTITUTION_TYPE.institution_type_id}">
           <field name="institutionType" column="institution_type" />
         </entity>
       </entity>
     </entity>
   </document>
 </dataConfig>
 ===

 My configuration file is working perfectly fine. I have specified two
 <entity> elements inside one <document>, and both entities have further nested
 <entity> tags.

 Can anyone suggest whether there is any other/better way to configure the
 relationship?

 I have referred http://wiki.apache.org/solr/DataImportHandler and
 http://download.boulder.ibm.com/ibmdl/pub/software/dw/java/j-solr-update-pdf.pdf

 Is there any resource that has detailed information about tags used in
 db-data-config.xml?

 Thanks,
 Manu

 --
 View this message in context:
 http://www.nabble.com/Is-there-any-better-way-to-configure-db-data-config.xml-tp21332196p21332196.html
 Sent from the Solr - User mailing list archive at Nabble.com.


 
 
 
 -- 
 --Noble Paul
 
 

-- 
View this message in context: 
http://www.nabble.com/Is-there-any-better-way-to-configure-db-data-config.xml-tp21332196p21346513.html
Sent from the Solr - User mailing list archive at Nabble.com.