Re: commit / new searcher delay?

2009-03-05 Thread Chris Hostetter
: I suspect this has something to do with waiting for the searcher to
: warm and switch over (?).  Though, I'm confused because when I print
: out /solr/admin/registry.jsp, the hashcode of the Searcher changes
: immediately (as the commit docs say, the commit operation blocks by
: default until a new searcher is in place).  I've tried turning off all
: caching, to no effect.

off the top of my head i don't remember if registry.jsp will start to list 
the new searcher even if it's not swapped in to become the current 
searcher.

what you should check is when the slave logs the end of the commit, and 
how long that is after the start of the commit. (and of course: whether it 
logs any warming stats -- you might have overlooked a cache)

: Anyone have any idea what could be going on here?  Ideally, commit
: would be an operation that blocks until the exact moment when the new
: searcher is in place and is actually serving based on the new index

that's how it works if you use waitSearcher=true ... are you sure you're 
not seeing the effects of some caching (HTTP?) in between solr and your 
client?  did you try setting never304=true in the <httpCaching> section of 
solrconfig.xml ?
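(for reference, the two knobs in question -- a minimal sketch against a
stock 1.3 setup:)

    <!-- an explicit commit posted to /update; both flags default to true -->
    <commit waitFlush="true" waitSearcher="true"/>

    <!-- in solrconfig.xml, inside <requestDispatcher>: tell Solr to never
         answer with a 304, so no stale cached responses reach the client -->
    <httpCaching never304="true" />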



-Hoss



Re: supported document types

2009-03-05 Thread Shalin Shekhar Mangar
On Fri, Mar 6, 2009 at 10:13 AM, Ashish P  wrote:

>
> What are the document types (MIME types) supported for indexing
> and searching in Solr?


Solr tries to be agnostic about how the content being indexed is
created. For Solr, everything is a string (or number). It is up to you to
use/write an indexer which can extract data out of your MIME types and send
it to Solr.

Some good places to start are:
http://wiki.apache.org/solr/DataImportHandler (supports getting data from
DB/XML/HTTP)
http://wiki.apache.org/solr/ExtractingRequestHandler (supports office
documents, pdf and many others)
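(for example, once the extracting handler is enabled, indexing a PDF is
roughly the following -- a sketch; the handler path and literal.* parameter
follow the wiki page above and have changed between nightlies, and the id
value is made up:)

    curl "http://localhost:8983/solr/update/extract?literal.id=doc1&commit=true" \
         -F "myfile=@document.pdf"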

Hope that helps.
-- 
Regards,
Shalin Shekhar Mangar.


Re: index multi valued field into multiple fields

2009-03-05 Thread Ashish P

hmm. I think I will just do that. 
Thanks for clearing my doubt...
-Ashish


Shalin Shekhar Mangar wrote:
> 
> On Fri, Mar 6, 2009 at 10:53 AM, Ashish P 
> wrote:
> 
>>
>> OK, so basically what you are saying is that when you use copyField, it
>> will copy the whole data from one field to many other fields, but it
>> cannot copy part of the data to another field.
> 
> 
> Yes, it will try to copy all the data.
> 
> 
>>
>> Because within the same tokenizing pass ( when I am tokenizing the
>> "condition" field ) I want part of the data to go into the content field
>> and part to go into the tsdatetime field. But it looks like that is not
>> possible. The field "condition" is actually a mix of multiple data values.
>>
> 
> Why not send both values to different fields at index time explicitly?
> 
> -- 
> Regards,
> Shalin Shekhar Mangar.
> 
> 

-- 
View this message in context: 
http://www.nabble.com/index-multi-valued-field-into-multiple-fields-tp22364915p22366918.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Custom Field Type

2009-03-05 Thread Chris Hostetter
: But actually, something like that only works for text field types that
: you can specify an analyzer for.  To sort by the integer value, you
: need an integer field.

we should really fix that so any FieldType can have an analyzer, and treat 
the Tokens produced just like multivalued fields are treated right now.

(I remember briefly looking into it a while back and got frustrated 
because i couldn't figure out where in the code the limitation was.)




-Hoss



Re: Search schema using q Query

2009-03-05 Thread Chris Hostetter

: 1. I am trying to customize the search with q query parameter, so that it
: can support wildcards and field boosting. I customized QueryParser and created
: the wildcard query in the same way as it does for non-wildcard. But even
: with this changed query, the results are not showing up. 
: 
: I figured out how it is doing the field boosting with the scores. But
: what I want to know is how it is fetching the records from the indexes based
: on the query.
: 
: Please suggest how I should go forward.

I don't see how anyone could possibly make any suggestions about 
where you should go from here, since you haven't provided any details 
about what it is you've done --- you said you've customized the 
QueryParser, but without giving us any idea of *how* you've customized the 
query parser, no one on this list is in any sort of position to guess what 
your problem might be.

there are people on this list who clearly want to help you, but you have 
to help them by giving them *all* of the information relevant to your 
problem.  

What *exactly* did you change?  what does your code look like?  what do 
your configs look like? what data have you indexed? what do your query URLs 
look like? what results do you get? what results do you expect to get?



-Hoss



Re: index multi valued field into multiple fields

2009-03-05 Thread Shalin Shekhar Mangar
On Fri, Mar 6, 2009 at 10:53 AM, Ashish P  wrote:

>
> OK. so basically what you are saying is when you use copyField, it will
> copy
> the whole data from one field to other many fields but it can not copy part
> of data to other field.


Yes, it will try to copy all the data.


>
> Because within same tokenizing ( when I am tokenizing "condition" field ) I
> want part of data to go into content field and part of data to go into
> tsdatetime field. But that looks like not possible.
> The field "condition" is actually mix of multiple data values.
>

Why not send both values to different fields at index time explicitly?
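(for example, parse the value yourself at index time and post each part as
its own field -- a sketch; the content/tsdatetime names come from this
thread, and the id field is made up:)

    <add>
      <doc>
        <field name="id">42</field>
        <field name="content">textual part of the original "condition" value</field>
        <field name="tsdatetime">2009-01-24T15:00:00Z</field>
      </doc>
    </add>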

-- 
Regards,
Shalin Shekhar Mangar.


Re: Distributed Search in multicore scenario.

2009-03-05 Thread Shalin Shekhar Mangar
On Fri, Mar 6, 2009 at 11:04 AM, Sagar Khetkade
wrote:

>
> I have a multi-core scenario where the schemas are different and I have to
> search these cores as per the use case. I am using the distributed search
> approach here for getting the search results for the query from these cores.


Distributed search does not work with different schemas. It works when all
shards have the same schema but different data. Strictly, it need not be the
identical schema, but the fields being used for querying/fetching must be
present on all shards.
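(for reference, a distributed request against plain HTTP Solr instances
looks like this -- the host names are made up:)

    http://host1:8983/solr/select?q=field:value&shards=host1:8983/solr,host2:8983/solr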


>
> But there is an obstacle. I have used the EmbeddedSolrServer class of Solrj
> for getting the solr server, so I have no shard URLs to pass to
> CommonsHttpSolrServer (as I am not using the HTTP approach). Instead I have
> tried passing the server instance for the multi-core as shards, which gives
> me the error “Invalid uri 'http://core0/select': Invalid authority”.
> Can anybody suggest the right approach for having distributed search in a
> multi-core scenario as well as with EmbeddedSolrServer?
>

Distributed search uses HTTP to communicate with other shards. Therefore,
you need some way to expose solr as a URL. You'd need to build some sort of
an http layer on top of embedded solr server.

I wouldn't use EmbeddedSolrServer for distributed search. I suggest that you
use a proper solr installation.

-- 
Regards,
Shalin Shekhar Mangar.


Re: what crawler do you use for Solr indexing?

2009-03-05 Thread Tony Wang
Hi Hoss,

But I cannot find documents about the integration of Nutch and Solr
anywhere. Could you give me some clue? thanks

Tony

On Thu, Mar 5, 2009 at 11:14 PM, Chris Hostetter
wrote:

>
> : with Solr. What crawler do you guys use? or you coded one by yourself? I
>
> neither -- i've never indexed "crawled" data with Solr, i only ever index
> structured data in one form or another.
>
> (the closest i've ever come to using a crawler with Solr is some ant tasks
> i whipped up one day to recursively apply an XSLT to some XML data and then
> index the content because i needed a quick hack to build the index 
> before we had DIH)
>
> If i needed to build an index of crawled data, i'd use Nutch.
>
>
> -Hoss
>
>


-- 
Are you RCholic? www.RCholic.com
温 良 恭 俭 让 仁 义 礼 智 信


Re: what crawler do you use for Solr indexing?

2009-03-05 Thread Chris Hostetter

: with Solr. What crawler do you guys use? or you coded one by yourself? I

neither -- i've never indexed "crawled" data with Solr, i only ever index 
structured data in one form or another.

(the closest i've ever come to using a crawler with Solr is some ant tasks 
i whipped up one day to recursively apply an XSLT to some XML data and then 
index the content because i needed a quick hack to build the index  
before we had DIH)

If i needed to build an index of crawled data, i'd use Nutch.


-Hoss



Re: Column Query with q query Parameter

2009-03-05 Thread dabboo

Erik,

Thanks for the information. I understand that it is revolving around
q/q.alt/dismax, but as per my need I have to do some customization, and I
have to use dismax for the same. That's the reason I keep asking different
questions about the same.

Below is the dismax configuration from the solrconfig.xml file (most XML
element names were stripped when this message was archived; the recoverable
parameter values of the handler are):

  echoParams: explicit
  tie: 0.01
  qf: isbn10_product_s^1.0 isbn13_product_s^1.0
      Index_Type_s^1.0 prdMainTitle_s^1.0 productURL_s^10.0
      prdMainTitle_product_s^1.0 categoryIds_product_s^1.0 imprint_product_s^1.0
      strapline_product_s^1.0 subject_product_s^1.0 prdPubDate_product_s^1.0
      readBy_product_s^1.0 aluminator_product_s^1.0 editor_product_s^1.0
      productType_product_s^1.0 authorLastName_product_s^1.0 edition_product_s^1.0
      discipline_product_s^1.0 copyrightYear_product_s^1.0 courseId_course_s^1.0
      indexType_course_s^1.0 courseType_course_s^1.0
      courseJacketImage_course_s^1.0 sourceGroupName_course_s^1.0
      subCompany_course_s^1.0 courseCodeSeq_course_s^1.0 discCode_course_s^1.0
      displayName_course_s^1.0 programId_program_s^1.0 indexType_program_s^1.0
      programType_program_s^1.0 groupNm_program_s^1.0 introText_program_s^1.0
      programJacketImage_program_s^1.0
  bq: english^90 hindi^123 Glorious^2000 highlighting^1000
      maths^100 ab^12 erer^4545 prdMainTitle_s^10.0 productURL_s^1.0
  fl: *,score
  last-components: spellcheck
  (three further defaults whose names were lost: false, true, 10)


Debug Query with field Operations

(debugQuery response; the XML tags were stripped when this message was
archived -- the recoverable content follows)

responseHeader: status=0, QTime=31
params: rows=10, start=0, indent=on,
        q=productURL_s:amit OR prdMainTitle_s:amitg,
        qt=dismaxrequest, debugQuery=true, version=2.2

result: 2 docs, both with score 9.245665E-8 --
        987644333221 ("amit") and 9876533221 ("amitg")

rawquerystring / querystring:
productURL_s:amit OR prdMainTitle_s:amitg

parsedquery:
+(productURL_s:amit prdMainTitle_s:amitg) () all:english^90.0
all:hindi^123.0 all:glorious^2000.0 all:highlight^1.0E7 all:math^100.0
all:ab^12.0 all:erer^4545.0 MultiPhraseQuery(all:"(prd prd main prd main
titl prd main titl s) (main main titl main titl s) (titl titl s) s"^10.0)
MultiPhraseQuery(all:"(product product url product url s) (url url s)
s"^1.0)

parsedquery_toString:
+(productURL_s:amit prdMainTitle_s:amitg) () all:english^90.0
all:hindi^123.0 all:glorious^2000.0 all:highlight^1.0E7 all:math^100.0
all:ab^12.0 all:erer^4545.0 all:"(prd prd main prd main titl prd main titl
s) (main main titl main titl s) (titl titl s) s"^10.0 all:"(product product
url product url s) (url url s) s"^1.0

explain for 987644333221:
9.245665E-8 = (MATCH) sum of:
  9.245665E-8 = (MATCH) product of:
    1.849133E-7 = (MATCH) sum of:
      1.849133E-7 = (MATCH) weight(productURL_s:amit in 6), product of:
        7.748973E-8 = queryWeight(productURL_s:amit), product of:
          2.3862944 = idf(docFreq=1, numDocs=8)
          3.247283E-8 = queryNorm
        2.3862944 = (MATCH) fieldWeight(productURL_s:amit in 6), product of:
          1.0 = tf(termFreq(productURL_s:amit)=1)
          2.3862944 = idf(docFreq=1, numDocs=8)
          1.0 = fieldNorm(field=productURL_s, doc=6)
    0.5 = coord(1/2)

explain for 9876533221:
9.245665E-8 = (MATCH) sum of:
  9.245665E-8 = (MATCH) product of:
    1.849133E-7 = (MATCH) sum of:
      1.849133E-7 = (MATCH) weight(prdMainTitle_s:amitg in 7), product of:
        7.748973E-8 = queryWeight(prdMainTitle_s:amitg), product of:
          2.3862944 = idf(docFreq=1, numDocs=8)
          3.247283E-8 = queryNorm
        2.3862944 = (MATCH) fieldWeight(prdMainTitle_s:amitg in 7), product of:
          1.0 = tf(termFreq(prdMainTitle_s:amitg)=1)
          2.3862944 = idf(docFreq=1, numDocs=8)
          1.0 = fieldNorm(field=prdMainTitle_s, doc=7)
    0.5 = coord(1/2)

QParser: DismaxQParser

boost queries:
english^90 hindi^123 Glorious^2000 highlighting^1000 maths^100 ab^12
erer^4545 prdMainTitle_s^10.0 productURL_s^1.0

parsed boost queries:
all:english^90.0 all:hindi^123.0 all:glorious^2000.0 all:highlight^1.0E7
all:math^100.0 all:ab^12.0 all:erer^4545.0 MultiPhraseQuery(all:"(prd prd
main prd main titl prd main titl s) (main main titl main titl s) (titl titl
s) s"^10.0) MultiPhraseQuery(all:"(product product url product url s) (url
url s) s"^1.0)

timing: total 31.0 ms; prepare 0.0 ms; process 31.0 ms (by position, query
15.0 ms and debug 16.0 ms; the per-component labels were lost)







I am able to get field operations and wildcards working with the q query,
but field boosting is not working together with field operations.

If required, I can write you a separate email.
Please suggest.

Thanks,
Amit Garg




Erik Hatcher wrote:
> 
> 
> On Mar 5, 2009, at 8:31 AM, dabboo wrote:
>> I am implementing column-specific search with the q query parameter. I
>> have achieved the same, but field boosting is not working in that.
>>
>> Below is the query which is getting formed for this URL:
>>
>> /?q=productURL_s:amit%20OR%20prdMainTitle_s:amitg&version=2.2&start=0&rows=10&indent=on&qt=dismaxrequest
>>
>> Query:
>>
>> productURL_s:amit prdMainTitle_s:amitg
>>
>>

Distributed Search in multicore scenario.

2009-03-05 Thread Sagar Khetkade

Hi,
 
I have a multi-core scenario where the schemas are different and I have to search 
these cores as per the use case. I am using the distributed search approach 
here for getting the search results for the query from these cores. 
But there is an obstacle. I have used the EmbeddedSolrServer class of Solrj for 
getting the solr server, so I have no shard URLs to pass to CommonsHttpSolrServer 
(as I am not using the HTTP approach). Instead I have tried passing the server 
instance for the multi-core as shards, which gives me the error “Invalid uri 
'http://core0/select': Invalid authority”.
Can anybody suggest the right approach for having distributed search in a 
multi-core scenario as well as with EmbeddedSolrServer?
 
Thanks and Regards,
Sagar Khetkade

Re: index multi valued field into multiple fields

2009-03-05 Thread Ashish P

OK, so basically what you are saying is that when you use copyField, it will
copy the whole data from one field to many other fields, but it cannot copy
part of the data to another field.
Because within the same tokenizing pass ( when I am tokenizing the "condition"
field ) I want part of the data to go into the content field and part to go
into the tsdatetime field. But it looks like that is not possible.
The field "condition" is actually a mix of multiple data values.


Shalin Shekhar Mangar wrote:
> 
> On Fri, Mar 6, 2009 at 7:40 AM, Ashish P  wrote:
> 
>>
>> I have a multi valued field as follows:
>> <field name="condition" ... multiValued="true" />
>>
>> I want to index the data from this field into following fields
>> <field name="content" ... />
>> <field name="tsdatetime" ... />
>>
>> How can this be done?? Any ideas...
> 
> 
> Use a copyField (look at the schema shipped with solr for an example).
> 
> However, you cannot copy individual values from a multi-valued field into two
> different fields. The copyField target of a multi-valued field should also be
> multi-valued, otherwise it will retain only the last value.
> 
> -- 
> Regards,
> Shalin Shekhar Mangar.
> 
> 

-- 
View this message in context: 
http://www.nabble.com/index-multi-valued-field-into-multiple-fields-tp22364915p22366393.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: search on date field

2009-03-05 Thread Ashish P

It works 
thanks
Ashish

Shalin Shekhar Mangar wrote:
> 
> On Fri, Mar 6, 2009 at 7:03 AM, Ashish P  wrote:
> 
>>
>> I want to search on single date field
>> e.g. q=creationDate:2009-01-24T15:00:00.000Z&rows=10
>>
>> But I think the query gets terminated after T15 as ':' ( COLON ) is taken
>> as
>> termination character.
>>
>> Any ideas on how to search on single date or for that matter if query
>> data
>> contains COLON then how to search.
> 
> 
> You can escape the ':' character by preceding it with a backslash '\'.
> E.g.,
> 
> q=creationDate:2009-01-24T15\:00\:00.000Z
> -- 
> Regards,
> Shalin Shekhar Mangar.
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Re%3A-search-on-date-field-tp22365586p22366358.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: what crawler do you use for Solr indexing?

2009-03-05 Thread Tony Wang
But I think this question should remain on the Solr user mailing list, as I am
interested in finding a crawler that works for Solr, and it doesn't
necessarily have to be Nutch.
Tony

On Thu, Mar 5, 2009 at 9:07 PM, Otis Gospodnetic  wrote:

>
> Tony,
>
> I suggest you pick one place to get help with Nutch+Solr, since it looks
> like you are jumping between lists and not having much luck. :)
> I suggest you stick to the Nutch list, since that's where the integration
> is coming from.
>
>  Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>
>
> - Original Message 
> > From: Tony Wang 
> > To: solr-user@lucene.apache.org
> > Sent: Thursday, March 5, 2009 6:32:57 PM
> > Subject: what crawler do you use for Solr indexing?
> >
> > Hi,
> >
> > I wonder if there's any open source crawler product that could be
> integrated
> > with Solr. What crawler do you guys use? or you coded one by yourself? I
> > have been trying to find out solutions for Nutch/Solr integration, but
> > haven't got any luck yet.
> >
> > Could someone shed me some light?
> >
> > thanks!
> >
> > Tony
> >
> > --
> > Are you RCholic? www.RCholic.com
> > 温 良 恭 俭 让 仁 义 礼 智 信
>
>


-- 
Are you RCholic? www.RCholic.com
温 良 恭 俭 让 仁 义 礼 智 信


Re: what crawler do you use for Solr indexing?

2009-03-05 Thread Tony Wang
Thanks Otis.

On Thu, Mar 5, 2009 at 9:07 PM, Otis Gospodnetic  wrote:

>
> Tony,
>
> I suggest you pick one place to get help with Nutch+Solr, since it looks
> like you are jumping between lists and not having much luck. :)
> I suggest you stick to the Nutch list, since that's where the integration
> is coming from.
>
>  Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>
>
> - Original Message 
> > From: Tony Wang 
> > To: solr-user@lucene.apache.org
> > Sent: Thursday, March 5, 2009 6:32:57 PM
> > Subject: what crawler do you use for Solr indexing?
> >
> > Hi,
> >
> > I wonder if there's any open source crawler product that could be
> integrated
> > with Solr. What crawler do you guys use? or you coded one by yourself? I
> > have been trying to find out solutions for Nutch/Solr integration, but
> > haven't got any luck yet.
> >
> > Could someone shed me some light?
> >
> > thanks!
> >
> > Tony
> >
> > --
> > Are you RCholic? www.RCholic.com
> > 温 良 恭 俭 让 仁 义 礼 智 信
>
>


-- 
Are you RCholic? www.RCholic.com
温 良 恭 俭 让 仁 义 礼 智 信


supported document types

2009-03-05 Thread Ashish P

What are the document types (MIME types) supported for indexing
and searching in Solr?
-- 
View this message in context: 
http://www.nabble.com/supported-document-types-tp22366114p22366114.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: what crawler do you use for Solr indexing?

2009-03-05 Thread Otis Gospodnetic

Tony,

I suggest you pick one place to get help with Nutch+Solr, since it looks like 
you are jumping between lists and not having much luck. :)
I suggest you stick to the Nutch list, since that's where the integration is 
coming from.

 Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
> From: Tony Wang 
> To: solr-user@lucene.apache.org
> Sent: Thursday, March 5, 2009 6:32:57 PM
> Subject: what crawler do you use for Solr indexing?
> 
> Hi,
> 
> I wonder if there's any open source crawler product that could be integrated
> with Solr. What crawler do you guys use? or you coded one by yourself? I
> have been trying to find out solutions for Nutch/Solr integration, but
> haven't got any luck yet.
> 
> Could someone shed me some light?
> 
> thanks!
> 
> Tony
> 
> -- 
> Are you RCholic? www.RCholic.com
> 温 良 恭 俭 让 仁 义 礼 智 信



Re: index multi valued field into multiple fields

2009-03-05 Thread Shalin Shekhar Mangar
On Fri, Mar 6, 2009 at 7:40 AM, Ashish P  wrote:

>
> I have a multi valued field as follows:
> <field name="condition" ... multiValued="true" />
>
> I want to index the data from this field into following fields
> <field name="content" ... />
> <field name="tsdatetime" ... />
>
> How can this be done?? Any ideas...


Use a copyField (look at the schema shipped with solr for an example).

However, you cannot copy individual values from a multi-valued field into two
different fields. The copyField target of a multi-valued field should also be
multi-valued, otherwise it will retain only the last value.
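(for reference, the declaration in schema.xml looks like this -- field names
taken from this thread:)

    <copyField source="condition" dest="content" />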

-- 
Regards,
Shalin Shekhar Mangar.


Re: search on date field

2009-03-05 Thread Shalin Shekhar Mangar
On Fri, Mar 6, 2009 at 7:03 AM, Ashish P  wrote:

>
> I want to search on single date field
> e.g. q=creationDate:2009-01-24T15:00:00.000Z&rows=10
>
> But I think the query gets terminated after T15 as ':' ( COLON ) is taken
> as
> termination character.
>
> Any ideas on how to search on single date or for that matter if query data
> contains COLON then how to search.


You can escape the ':' character by preceding it with a backslash '\'. E.g.,

q=creationDate:2009-01-24T15\:00\:00.000Z
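(note that when the query is sent as part of a URL, the backslash itself
must be URL-encoded as %5C:)

    q=creationDate:2009-01-24T15%5C:00%5C:00.000Z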
-- 
Regards,
Shalin Shekhar Mangar.


Re: commit / new searcher delay?

2009-03-05 Thread Steve Conover
That's exactly what I'm doing, but I'm explicitly replicating, and
committing.  Even under these circumstances, what could explain the
delay after commit before the new index becomes available?

On Thu, Mar 5, 2009 at 10:55 AM, Shalin Shekhar Mangar
 wrote:
> On Thu, Mar 5, 2009 at 10:30 PM, Steve Conover  wrote:
>
>> Yep, I notice the default is true/true, but I explicitly specified
>> both those things too and there's no difference in behavior.
>>
>
> Perhaps you are indexing on the master and then searching on the slaves? It
> may be the delay introduced by replication.
> --
> Regards,
> Shalin Shekhar Mangar.
>


Re: what crawler do you use for Solr indexing?

2009-03-05 Thread Tony Wang
Hi Nick -

Could you please teach me a little bit how to make Nutch work for Solr?
Thanks a lot!

Tony

On Thu, Mar 5, 2009 at 5:01 PM, Nick Tkach  wrote:

> Yes, Nutch works quite well as a crawler for Solr.
>
> - Original Message -
> From: "Tony Wang" 
> To: solr-user@lucene.apache.org
> Sent: Thursday, March 5, 2009 5:32:57 PM GMT -06:00 US/Canada Central
> Subject: what crawler do you use for Solr indexing?
>
> Hi,
>
> I wonder if there's any open source crawler product that could be
> integrated
> with Solr. What crawler do you guys use? or you coded one by yourself? I
> have been trying to find out solutions for Nutch/Solr integration, but
> haven't got any luck yet.
>
> Could someone shed me some light?
>
> thanks!
>
> Tony
>
> --
> Are you RCholic? www.RCholic.com
> 温 良 恭 俭 让 仁 义 礼 智 信
>



-- 
Are you RCholic? www.RCholic.com
温 良 恭 俭 让 仁 义 礼 智 信


index multi valued field into multiple fields

2009-03-05 Thread Ashish P

I have a multi valued field as follows:
<field name="condition" ... multiValued="true" />

I want to index the data from this field into following fields
<field name="content" ... />
<field name="tsdatetime" ... />

How can this be done?? Any ideas...
-- 
View this message in context: 
http://www.nabble.com/index-multi-valued-field-into-multiple-fields-tp22364915p22364915.html
Sent from the Solr - User mailing list archive at Nabble.com.



search on date field

2009-03-05 Thread Ashish P

I want to search on single date field
e.g. q=creationDate:2009-01-24T15:00:00.000Z&rows=10

But I think the query gets terminated after T15 as ':' ( COLON ) is taken as
termination character.

Any ideas on how to search on single date or for that matter if query data
contains COLON then how to search. 
-- 
View this message in context: 
http://www.nabble.com/search-on-date-field-tp22364587p22364587.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: public apology for company spam

2009-03-05 Thread Mike Klaas

On 5-Mar-09, at 6:47 AM, Yonik Seeley wrote:


This morning, an apparently over-zealous marketing firm, on behalf of
the company I work for, sent out a marketing email to a large number
of subscribers of the Lucene email lists.  This was done without my
knowledge or approval, and I can assure you that I'll make all efforts
to prevent it from happening again.


It would be forgivable if only the email didn't contain the  
misspelling "Lucen" :)


-Mike


Re: what crawler do you use for Solr indexing?

2009-03-05 Thread Nick Tkach
Yes, Nutch works quite well as a crawler for Solr.

- Original Message -
From: "Tony Wang" 
To: solr-user@lucene.apache.org
Sent: Thursday, March 5, 2009 5:32:57 PM GMT -06:00 US/Canada Central
Subject: what crawler do you use for Solr indexing?

Hi,

I wonder if there's any open source crawler product that could be integrated
with Solr. What crawler do you guys use? or you coded one by yourself? I
have been trying to find out solutions for Nutch/Solr integration, but
haven't got any luck yet.

Could someone shed me some light?

thanks!

Tony

-- 
Are you RCholic? www.RCholic.com
温 良 恭 俭 让 仁 义 礼 智 信


Re: what crawler do you use for Solr indexing?

2009-03-05 Thread Baalman, Laura A. (ARC-TI)[QSS GROUP INC]
We are using Heritrix, the Internet Archive’s open source crawler, which is 
very easy to extend. We have augmented it with a custom parser to crawl some 
specific data formats and coded our own processors (Heritrix’s terminology for 
extensions) to link together different data sources as well as to output xmls 
in the right format to feed to solr. We have not yet created an automated path 
to feed the xmls into solr but we plan to.

~LB



On 3/5/09 3:32 PM, "Tony Wang"  wrote:

Hi,

I wonder if there's any open source crawler product that could be integrated
with Solr. What crawler do you guys use? or you coded one by yourself? I
have been trying to find out solutions for Nutch/Solr integration, but
haven't got any luck yet.

Could someone shed me some light?

thanks!

Tony

--
Are you RCholic? www.RCholic.com
温 良 恭 俭 让 仁 义 礼 智 信



what crawler do you use for Solr indexing?

2009-03-05 Thread Tony Wang
Hi,

I wonder if there's any open source crawler product that could be integrated
with Solr. What crawler do you guys use? or you coded one by yourself? I
have been trying to find out solutions for Nutch/Solr integration, but
haven't got any luck yet.

Could someone shed me some light?

thanks!

Tony

-- 
Are you RCholic? www.RCholic.com
温 良 恭 俭 让 仁 义 礼 智 信


Re: DataImportHandler and delta-import question

2009-03-05 Thread Garafola Timothy
Thanks.  Can you recommend a build I can try?

On Thu, Mar 5, 2009 at 3:09 PM, Marc Sturlese  wrote:
>
> I am not sure if RollbackUpdateCommand was yet developed in the official solr
> 1.3 release. I think it's just in the nightly builds. Looks like your
> dataimport package is too new. I think you should try to use that dataimport
> release with a solr nightly or try to grab an older dataimport release.
>
>
> Tim Garafola wrote:
>>
>> I tried updating the solr instance I'm testing DIH with, adding the
>> the dataimport and slf4j jar files to solr.
>>
>> When I start solr, I get the following error.  Is there something else
>> which needs to be installed for the nightly build version of DIH to
>> work in solr release 1.3?
>>
>> Thanks,
>> Tim
>>
>>
>> java.lang.NoClassDefFoundError:
>> org/apache/solr/update/RollbackUpdateCommand
>>       at
>> org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:95)
>>       at
>> org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:311)
>>       at org.apache.solr.core.SolrCore.<init>(SolrCore.java:480)
>>       at
>> org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:119)
>>       at
>> org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
>>       at
>> com.caucho.server.dispatch.FilterManager.createFilter(FilterManager.java:134)
>>       at com.caucho.server.dispatch.FilterManager.init(FilterManager.java:87)
>>       at com.caucho.server.webapp.Application.start(Application.java:1655)
>>       at
>> com.caucho.server.deploy.DeployController.startImpl(DeployController.java:621)
>>       at
>> com.caucho.server.deploy.StartAutoRedeployAutoStrategy.startOnInit(StartAutoRedeployAutoStrategy.java:72)
>>       at
>> com.caucho.server.deploy.DeployController.startOnInit(DeployController.java:509)
>>       at
>> com.caucho.server.deploy.DeployContainer.start(DeployContainer.java:153)
>>       at
>> com.caucho.server.webapp.ApplicationContainer.start(ApplicationContainer.java:670)
>>       at com.caucho.server.host.Host.start(Host.java:420)
>>       at
>> com.caucho.server.deploy.DeployController.startImpl(DeployController.java:621)
>>       at
>> com.caucho.server.deploy.StartAutoRedeployAutoStrategy.startOnInit(StartAutoRedeployAutoStrategy.java:72)
>>       at
>> com.caucho.server.deploy.DeployController.startOnInit(DeployController.java:509)
>>       at
>> com.caucho.server.deploy.DeployContainer.start(DeployContainer.java:153)
>>       at com.caucho.server.host.HostContainer.start(HostContainer.java:504)
>>       at com.caucho.server.resin.ServletServer.start(ServletServer.java:971)
>>       at
>> com.caucho.server.deploy.DeployController.startImpl(DeployController.java:621)
>>       at
>> com.caucho.server.deploy.AbstractDeployControllerStrategy.start(AbstractDeployControllerStrategy.java:56)
>>       at
>> com.caucho.server.deploy.DeployController.start(DeployController.java:517)
>>       at com.caucho.server.resin.ResinServer.start(ResinServer.java:551)
>>       at com.caucho.server.resin.Resin.init(Resin.java)
>>       at com.caucho.server.resin.Resin.main(Resin.java:625)
>> Caused by: java.lang.ClassNotFoundException:
>> org.apache.solr.update.RollbackUpdateCommand
>>       at
>> com.caucho.loader.DynamicClassLoader.findClass(DynamicClassLoader.java:1130)
>>       at
>> com.caucho.loader.DynamicClassLoader.loadClass(DynamicClassLoader.java:1072)
>>       at
>> com.caucho.loader.DynamicClassLoader.loadClass(DynamicClassLoader.java:1021)
>>       at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
>>       ... 26 more
>>
>>
>> On Thu, Mar 5, 2009 at 9:10 AM, Garafola Timothy 
>> wrote:
>>> yes, the dataimport.properties file is present in the conf directory
>>> from previous imports.  I'll try the trunk version as you suggested to
>>> see if the problem persists.
>>>
>>> Thanks,
>>> Tim
>>>
>>> On Wed, Mar 4, 2009 at 7:54 PM, Noble Paul നോബിള്‍  नोब्ळ्
>>>  wrote:
 the dataimport.properties is created only after one successful import,
 so it is available only from the second import onwards. probably you can
 create one manually and put it in the conf dir.
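 (for illustration, a hand-created dataimport.properties would look roughly
 like this -- the timestamp is whatever you want the next delta query to
 see; note that the colons are escaped, properties-file style:)

     #Thu Mar 05 09:10:00 PST 2009
     last_index_time=2009-03-04 19\:54\:00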

 On Thu, Mar 5, 2009 at 12:52 AM, Garafola Timothy
  wrote:
> Thanks,
>
> I set up a another test instance of solr and ran a full import within
> the DIH Development Console.  I examined the query and found that
> last_index_time is not getting set in the query.  Yet the value does
> get updated after a full import completes (outside of the development
> console).  Is there some place that I need to set the path to the
> dataimport.properties file?
>
> On Tue, Mar 3, 2009 at 8:03 PM, Noble Paul നോബിള്‍  नोब्ळ्
>  wrote:
>> I do not see anything wrong with this. It should have worked. Can you
>> check that dataimport.properties is created (by DIH) in the conf
>> directory? Check the content.
>>
>>
>> are you sure that the query

Re: DataImportHandler and delta-import question

2009-03-05 Thread Marc Sturlese

I am not sure if RollbackUpdateCommand was yet developed in the official solr
1.3 release. I think it's just in the nightly builds. Looks like your
dataimport package is too new. I think you should try to use that dataimport
release with a solr nightly or try to grab an older dataimport release.


Tim Garafola wrote:
> 
> I tried updating the solr instance I'm testing DIH with, adding the
> the dataimport and slf4j jar files to solr.
> 
> When I start solr, I get the following error.  Is there something else
> which needs to be installed for the nightly build version of DIH to
> work in solr release 1.3?
> 
> Thanks,
> Tim
> 
> 
> java.lang.NoClassDefFoundError:
> org/apache/solr/update/RollbackUpdateCommand
>   at
> org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:95)
>   at
> org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:311)
>   at org.apache.solr.core.SolrCore.<init>(SolrCore.java:480)
>   at
> org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:119)
>   at
> org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
>   at
> com.caucho.server.dispatch.FilterManager.createFilter(FilterManager.java:134)
>   at com.caucho.server.dispatch.FilterManager.init(FilterManager.java:87)
>   at com.caucho.server.webapp.Application.start(Application.java:1655)
>   at
> com.caucho.server.deploy.DeployController.startImpl(DeployController.java:621)
>   at
> com.caucho.server.deploy.StartAutoRedeployAutoStrategy.startOnInit(StartAutoRedeployAutoStrategy.java:72)
>   at
> com.caucho.server.deploy.DeployController.startOnInit(DeployController.java:509)
>   at
> com.caucho.server.deploy.DeployContainer.start(DeployContainer.java:153)
>   at
> com.caucho.server.webapp.ApplicationContainer.start(ApplicationContainer.java:670)
>   at com.caucho.server.host.Host.start(Host.java:420)
>   at
> com.caucho.server.deploy.DeployController.startImpl(DeployController.java:621)
>   at
> com.caucho.server.deploy.StartAutoRedeployAutoStrategy.startOnInit(StartAutoRedeployAutoStrategy.java:72)
>   at
> com.caucho.server.deploy.DeployController.startOnInit(DeployController.java:509)
>   at
> com.caucho.server.deploy.DeployContainer.start(DeployContainer.java:153)
>   at com.caucho.server.host.HostContainer.start(HostContainer.java:504)
>   at com.caucho.server.resin.ServletServer.start(ServletServer.java:971)
>   at
> com.caucho.server.deploy.DeployController.startImpl(DeployController.java:621)
>   at
> com.caucho.server.deploy.AbstractDeployControllerStrategy.start(AbstractDeployControllerStrategy.java:56)
>   at
> com.caucho.server.deploy.DeployController.start(DeployController.java:517)
>   at com.caucho.server.resin.ResinServer.start(ResinServer.java:551)
>   at com.caucho.server.resin.Resin.init(Resin.java)
>   at com.caucho.server.resin.Resin.main(Resin.java:625)
> Caused by: java.lang.ClassNotFoundException:
> org.apache.solr.update.RollbackUpdateCommand
>   at
> com.caucho.loader.DynamicClassLoader.findClass(DynamicClassLoader.java:1130)
>   at
> com.caucho.loader.DynamicClassLoader.loadClass(DynamicClassLoader.java:1072)
>   at
> com.caucho.loader.DynamicClassLoader.loadClass(DynamicClassLoader.java:1021)
>   at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
>   ... 26 more
> 
> 
> On Thu, Mar 5, 2009 at 9:10 AM, Garafola Timothy 
> wrote:
>> yes, the dataimport.properties file is present in the conf directory
>> from previous imports.  I'll try the trunk version as you suggested to
>> see if the problem persists.
>>
>> Thanks,
>> Tim
>>
>> On Wed, Mar 4, 2009 at 7:54 PM, Noble Paul നോബിള്‍  नोब्ळ्
>>  wrote:
>>> the dataimport.properties is created only after one successful import
>>> .so it is available only from second import onwards. probably you can
>>> create one manually and put it in the conf dir.
>>>
>>> On Thu, Mar 5, 2009 at 12:52 AM, Garafola Timothy
>>>  wrote:
 Thanks,

 I set up a another test instance of solr and ran a full import within
 the DIH Development Console.  I examined the query and found that
 last_index_time is not getting set in the query.  Yet the value does
 get updated after a full import completes (outside of the development
 console).  Is there some place that I need to set the path to the
 dataimport.properties file?

 On Tue, Mar 3, 2009 at 8:03 PM, Noble Paul നോബിള്‍  नोब्ळ्
  wrote:
> I do not see anything wrong with this. It should have worked. Can you
> check that dataimport.properties is created (by DIH) in the conf
> directory? Check the content.
>
>
> are you sure that the query
>
> select DId from 2_Doc where ModifiedDate >
> '${dataimporter.last_index_time}'
>
 works with a date format yyyy-MM-dd HH:mm:ss . This is the format
> which DIH sends the date in . 

Re: solr-ruby facet.method

2009-03-05 Thread Ian Connor
If you want this to switch groups, let me know...

...but it would be good to have benchmarking like in ActiveRecord, so that
for a request you know how much time was spent where.

ActiveRecord logs look like:

JournalAccess Columns (0.107016)   SHOW FIELDS FROM `journal_accesses`
Transform Load (0.107634)   SELECT * FROM `transforms` WHERE
(`transforms`.`name` = 'MIT') LIMIT 1


It would be nice if Solr could do the same to pinpoint slow queries so they
could look like this (in an appropriate Solr color):
Solr Delete (0.107016)   id = "123123"
Solr Query (0.107634)   q=*:*&rows=10

I am not sure how hard this would be to plug into the rails logger and if
you could then also make it add up at the end in the summary line for the
request.

However, it certainly would be nice to know and help focus performance
debugging.

On Thu, Mar 5, 2009 at 4:33 PM, Ian Connor  wrote:

> I agree that remapping params is not that much fun. I certainly vote for
> just passing them through and it will be easier to keep up with the latest
> as well.
>
> I created:
>
> https://issues.apache.org/jira/browse/SOLR-1047
>
> Let me know if there is something else I can do to help.
>
>
> On Thu, Mar 5, 2009 at 3:19 PM, Erik Hatcher 
> wrote:
>
>> First, note we have a ruby-...@lucene.apache.org list which focuses
>> primarily on the solr-ruby library, flare, and other Ruby specific things.
>>  But this forum is as good as any, though I'm CC'ing ruby-dev too.
>>
>> On Mar 5, 2009, at 12:59 PM, Ian Connor wrote:
>>
>>> Is there a way to specify the facet.method using solr-ruby. I tried to
>>> add
>>> it like this:
>>>
>>> hash["facet.method"] = @params[:facets][:method] if
>>> @params[:facets][:method]
>>>
>>
>> That's a reasonable addition, however we're about to do a refactoring of
>> solr-ruby to bring in the great contributions Matt Mitchell has been doing
>> with his RSolr project.  We're going to strip away all the parameter
>> mangling/mapping and just simply pass through parameters to Solr (and leave
>> clever mapping of things like :query -> &q= to folks that want to add that
>> construct to their own applications).
>>
>>  to line 78 of standard.rb and it works when you add it to the facets
>>> Hash.
>>> However, if there is another place that I could set that would be great.
>>>
>>
>> One option is to provide your own request/response classes, subclassing
>> Solr::Request(and Response)::Standard if you want to just hack this for now.
>>
>>  I am also happy to submit a patch on a ticket if that works.
>>>
>>
>> The above being said, I'd gladly commit your patch right away.  Submit it
>> via JIRA and consider it done.   We'll do one final release (0.9?) of the
>> current solr-ruby library before we gut it and simplify it (and do our best
>> to provide backwards compatibility to the 0.x versions) for a 1.0 version.
>>
>>Erik
>>
>>
>
>
> --
> Regards,
>
> Ian Connor
>



-- 
Regards,

Ian Connor


Re: DataImportHandler and delta-import question

2009-03-05 Thread Garafola Timothy
I tried updating the solr instance I'm testing DIH with, adding the
the dataimport and slf4j jar files to solr.

When I start solr, I get the following error.  Is there something else
which needs to be installed for the nightly build version of DIH to
work in solr release 1.3?

Thanks,
Tim


java.lang.NoClassDefFoundError: org/apache/solr/update/RollbackUpdateCommand
at 
org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:95)
at 
org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:311)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:480)
at 
org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:119)
at 
org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
at 
com.caucho.server.dispatch.FilterManager.createFilter(FilterManager.java:134)
at com.caucho.server.dispatch.FilterManager.init(FilterManager.java:87)
at com.caucho.server.webapp.Application.start(Application.java:1655)
at 
com.caucho.server.deploy.DeployController.startImpl(DeployController.java:621)
at 
com.caucho.server.deploy.StartAutoRedeployAutoStrategy.startOnInit(StartAutoRedeployAutoStrategy.java:72)
at 
com.caucho.server.deploy.DeployController.startOnInit(DeployController.java:509)
at 
com.caucho.server.deploy.DeployContainer.start(DeployContainer.java:153)
at 
com.caucho.server.webapp.ApplicationContainer.start(ApplicationContainer.java:670)
at com.caucho.server.host.Host.start(Host.java:420)
at 
com.caucho.server.deploy.DeployController.startImpl(DeployController.java:621)
at 
com.caucho.server.deploy.StartAutoRedeployAutoStrategy.startOnInit(StartAutoRedeployAutoStrategy.java:72)
at 
com.caucho.server.deploy.DeployController.startOnInit(DeployController.java:509)
at 
com.caucho.server.deploy.DeployContainer.start(DeployContainer.java:153)
at com.caucho.server.host.HostContainer.start(HostContainer.java:504)
at com.caucho.server.resin.ServletServer.start(ServletServer.java:971)
at 
com.caucho.server.deploy.DeployController.startImpl(DeployController.java:621)
at 
com.caucho.server.deploy.AbstractDeployControllerStrategy.start(AbstractDeployControllerStrategy.java:56)
at 
com.caucho.server.deploy.DeployController.start(DeployController.java:517)
at com.caucho.server.resin.ResinServer.start(ResinServer.java:551)
at com.caucho.server.resin.Resin.init(Resin.java)
at com.caucho.server.resin.Resin.main(Resin.java:625)
Caused by: java.lang.ClassNotFoundException:
org.apache.solr.update.RollbackUpdateCommand
at 
com.caucho.loader.DynamicClassLoader.findClass(DynamicClassLoader.java:1130)
at 
com.caucho.loader.DynamicClassLoader.loadClass(DynamicClassLoader.java:1072)
at 
com.caucho.loader.DynamicClassLoader.loadClass(DynamicClassLoader.java:1021)
at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
... 26 more


On Thu, Mar 5, 2009 at 9:10 AM, Garafola Timothy  wrote:
> yes, the dataimport.properties file is present in the conf directory
> from previous imports.  I'll try the trunk version as you suggested to
> see if the problem persists.
>
> Thanks,
> Tim
>
> On Wed, Mar 4, 2009 at 7:54 PM, Noble Paul നോബിള്‍  नोब्ळ्
>  wrote:
>> the dataimport.properties is created only after one successful import
>> .so it is available only from second import onwards. probably you can
>> create one manually and put it in the conf dir.
>>
>> On Thu, Mar 5, 2009 at 12:52 AM, Garafola Timothy  
>> wrote:
>>> Thanks,
>>>
>>> I set up a another test instance of solr and ran a full import within
>>> the DIH Development Console.  I examined the query and found that
>>> last_index_time is not getting set in the query.  Yet the value does
>>> get updated after a full import completes (outside of the development
>>> console).  Is there some place that I need to set the path to the
>>> dataimport.properties file?
>>>
>>> On Tue, Mar 3, 2009 at 8:03 PM, Noble Paul നോബിള്‍  नोब्ळ्
>>>  wrote:
 I do not see anything wrong with this. It should have worked. Can you
 check that dataimport.properties is created (by DIH) in the conf
 directory? Check the content.


 are you sure that the query

 select DId from 2_Doc where ModifiedDate > 
 '${dataimporter.last_index_time}'

 works with a date format yyyy-MM-dd HH:mm:ss . This is the format
 which DIH sends the date in . If the format is wrong you may need to
 format it using a dateformat function.

 see here

 http://wiki.apache.org/solr/DataImportHandler#head-5675e913396a42eb7c6c5d3c894ada5dadbb62d7


  The trunk DIH can work with Solr1.3 (you may need to put the DIH jar
 and slf4j). Can
 On Wed, Mar 4, 2009 at 3:53 AM, Garafola Timothy  
 wrote:
> I'm us

Re: jetty vs tomcat

2009-03-05 Thread Ian Connor
Hi,

At Pubget we are also happy with jetty (distributed over a number of shards
and just adding more this week).

Just search around for a good init.d script to start it up, and we use monit
to keep it up:

init.d snippet:

START_COMMAND="java -Dsolr.data.dir=/solr8983 -Djetty.port=8983
-DSTOP.PORT=8079 -DSTOP.KEY=solrprod -Xms512M -Xmx1024M -jar start.jar"
STOP_COMMAND="java -Dsolr.data.dir=/solr8983 -Djetty.port=8983
-DSTOP.PORT=8079 -DSTOP.KEY=solrprod -Xms512M -Xmx1024M -jar start.jar
--stop"

start() {
  echo -n "Starting $NAME"
  cd $SOLR_HOME
  rm -f /solr8983/index/lucene-*-write.lock
  $START_COMMAND 2> $LOG_FILE &
  sleep 2
  echo `ps -ef | grep -v grep | grep "$START_COMMAND" | awk '{print $2}'` >
$PIDFILE
  echo "Done"
  return 0
}

stop() {
  echo -n "Stopping $NAME"
  cd $SOLR_HOME
  $STOP_COMMAND &
  pkill -9 -f solr8983
  rm -f $PIDFILE
  echo "Done"
  return 0
}


monit snippet:
check process solr.production with pidfile /solr8983/solr.production.pid
  group search
  if failed host localhost port 8983 then restart
  start program = "/etc/init.d/solr.production start"
  stop  program = "/etc/init.d/solr.production stop"
  if 5 restarts within 5 cycles then timeout


On Thu, Mar 5, 2009 at 3:02 PM, Glen Newton  wrote:

> Performance comparison link:
> - "Jetty vs Tomcat: A Comparative Analysis". prepared by Greg Wilkins
> - May, 2008.
> http://www.webtide.com/choose/jetty.jsp
>
>
> 2009/3/5 Erik Hatcher :
> > That being said... I don't think there is a strong reason to go out of
> your
> > way to install Tomcat and do the additional config.  I'd say just use
> Jetty
> > until you have some other reason not to.
> >
> > http://www.lucidimagination.com/search is currently powered by Jetty,
> and we
> > have no plans to switch.
> >
> >Erik
> >
> > On Mar 5, 2009, at 2:06 PM, Ryan McKinley wrote:
> >
> >> The jetty vs tomcat vs resin vs whatever question pretty much comes down
> >> to what you are comfortable running/managing.
> >>
> >> Solr tries its best to stay container agnostic.
> >>
> >>
> >> On Mar 5, 2009, at 1:55 PM, Jonathan Haddad wrote:
> >>
> >>> Is there any compelling reason to use tomcat instead of jetty if all
> >>> we're doing is using solr?  We don't use tomcat anywhere else.
> >>> --
> >>> Jonathan Haddad
> >>> http://www.rustyrazorblade.com
> >
> >
>
>
>
> --
>
> -
>



-- 
Regards,

Ian Connor


Re: solr-ruby facet.method

2009-03-05 Thread Ian Connor
I agree that remapping params is not that much fun. I certainly vote for
just passing them through and it will be easier to keep up with the latest
as well.

I created:

https://issues.apache.org/jira/browse/SOLR-1047

Let me know if there is something else I can do to help.

On Thu, Mar 5, 2009 at 3:19 PM, Erik Hatcher wrote:

> First, note we have a ruby-...@lucene.apache.org list which focuses
> primarily on the solr-ruby library, flare, and other Ruby specific things.
>  But this forum is as good as any, though I'm CC'ing ruby-dev too.
>
> On Mar 5, 2009, at 12:59 PM, Ian Connor wrote:
>
>> Is there a way to specify the facet.method using solr-ruby. I tried to add
>> it like this:
>>
>> hash["facet.method"] = @params[:facets][:method] if
>> @params[:facets][:method]
>>
>
> That's a reasonable addition, however we're about to do a refactoring of
> solr-ruby to bring in the great contributions Matt Mitchell has been doing
> with his RSolr project.  We're going to strip away all the parameter
> mangling/mapping and just simply pass through parameters to Solr (and leave
> clever mapping of things like :query -> &q= to folks that want to add that
> construct to their own applications).
>
>  to line 78 of standard.rb and it works when you add it to the facets Hash.
>> However, if there is another place that I could set that would be great.
>>
>
> One option is to provide your own request/response classes, subclassing
> Solr::Request(and Response)::Standard if you want to just hack this for now.
>
>  I am also happy to submit a patch on a ticket if that works.
>>
>
> The above being said, I'd gladly commit your patch right away.  Submit it
> via JIRA and consider it done.   We'll do one final release (0.9?) of the
> current solr-ruby library before we gut it and simplify it (and do our best
> to provide backwards compatibility to the 0.x versions) for a 1.0 version.
>
>Erik
>
>


-- 
Regards,

Ian Connor
1 Leighton St #723
Cambridge, MA 02141
Call Center Phone: +1 (714) 239 3875 (24 hrs)
Fax: +1(770) 818 5697
Skype: ian.connor


Re: solr-ruby facet.method

2009-03-05 Thread Erik Hatcher
First, note we have a ruby-...@lucene.apache.org list which focuses  
primarily on the solr-ruby library, flare, and other Ruby specific  
things.  But this forum is as good as any, though I'm CC'ing ruby-dev  
too.


On Mar 5, 2009, at 12:59 PM, Ian Connor wrote:
Is there a way to specify the facet.method using solr-ruby. I tried  
to add

it like this:

 hash["facet.method"] = @params[:facets][:method] if
@params[:facets][:method]


That's a reasonable addition, however we're about to do a refactoring  
of solr-ruby to bring in the great contributions Matt Mitchell has  
been doing with his RSolr project.  We're going to strip away all the  
parameter mangling/mapping and just simply pass through parameters to  
Solr (and leave clever mapping of things like :query -> &q= to folks  
that want to add that construct to their own applications).


to line 78 of standard.rb and it works when you add it to the facets  
Hash.
However, if there is another place that I could set that would be  
great.


One option is to provide your own request/response classes,  
subclassing Solr::Request(and Response)::Standard if you want to just  
hack this for now.
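(roughly, untested -- something like this, assuming the 0.x request classes
build their parameter hash in #to_hash, as Standard does today:)

    class FacetMethodRequest < Solr::Request::Standard
      def to_hash
        hash = super
        # pass facet.method straight through when the caller supplies one
        if @params[:facets] && @params[:facets][:method]
          hash["facet.method"] = @params[:facets][:method]
        end
        hash
      end
    end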



I am also happy to submit a patch on a ticket if that works.


The above being said, I'd gladly commit your patch right away.  Submit  
it via JIRA and consider it done.   We'll do one final release (0.9?)  
of the current solr-ruby library before we gut it and simplify it (and  
do our best to provide backwards compatibility to the 0.x versions)  
for a 1.0 version.


Erik



Re: jetty vs tomcat

2009-03-05 Thread Glen Newton
Performance comparison link:
- "Jetty vs Tomcat: A Comparative Analysis". prepared by Greg Wilkins
- May, 2008.
http://www.webtide.com/choose/jetty.jsp


2009/3/5 Erik Hatcher :
> That being said... I don't think there is a strong reason to go out of your
> way to install Tomcat and do the additional config.  I'd say just use Jetty
> until you have some other reason not to.
>
> http://www.lucidimagination.com/search is currently powered by Jetty, and we
> have no plans to switch.
>
>        Erik
>
> On Mar 5, 2009, at 2:06 PM, Ryan McKinley wrote:
>
>> The jetty vs tomcat vs resin vs whatever question pretty much comes down
>> to what you are comfortable running/managing.
>>
>> Solr tries its best to stay container agnostic.
>>
>>
>> On Mar 5, 2009, at 1:55 PM, Jonathan Haddad wrote:
>>
>>> Is there any compelling reason to use tomcat instead of jetty if all
>>> we're doing is using solr?  We don't use tomcat anywhere else.
>>> --
>>> Jonathan Haddad
>>> http://www.rustyrazorblade.com
>
>



-- 

-


Re: jetty vs tomcat

2009-03-05 Thread Erik Hatcher
That being said... I don't think there is a strong reason to go out of  
your way to install Tomcat and do the additional config.  I'd say just  
use Jetty until you have some other reason not to.


http://www.lucidimagination.com/search is currently powered by Jetty,  
and we have no plans to switch.


Erik

On Mar 5, 2009, at 2:06 PM, Ryan McKinley wrote:

The jetty vs tomcat vs resin vs whatever question pretty much comes  
down to what you are comfortable running/managing.


Solr tries its best to stay container agnostic.


On Mar 5, 2009, at 1:55 PM, Jonathan Haddad wrote:


Is there any compelling reason to use tomcat instead of jetty if all
we're doing is using solr?  We don't use tomcat anywhere else.
--
Jonathan Haddad
http://www.rustyrazorblade.com




Re: jetty vs tomcat

2009-03-05 Thread Ryan McKinley
The jetty vs tomcat vs resin vs whatever question pretty much comes  
down to what you are comfortable running/managing.


Solr tries its best to stay container agnostic.


On Mar 5, 2009, at 1:55 PM, Jonathan Haddad wrote:


Is there any compelling reason to use tomcat instead of jetty if all
we're doing is using solr?  We don't use tomcat anywhere else.
--
Jonathan Haddad
http://www.rustyrazorblade.com




Re: SOLR query.

2009-03-05 Thread Shalin Shekhar Mangar
On Thu, Mar 5, 2009 at 11:45 PM, Erik Hatcher wrote:

>
> On Mar 5, 2009, at 1:07 PM, Suryasnat Das wrote:
>
>> I have some queries on SOLR for which I need immediate resolution. Fast
>> help would be greatly appreciated.
>>
>> a.) We know that fields are also indexed. So can we index some specific
>> fields (like author, id, etc.) first and then do the indexing for the rest
>> of the fields (like creation date etc.) at a later time?
>>
>
> You have to reindex the entire document in order to add fields to it, but
> you certainly can do so at any time.  In other words, you can just add fields 
> to an existing document without sending in all the fields you want on
> that document.
>

 I think Erik meant that you cannot add fields to an existing document
without sending in all the fields again.
-- 
Regards,
Shalin Shekhar Mangar.


jetty vs tomcat

2009-03-05 Thread Jonathan Haddad
Is there any compelling reason to use tomcat instead of jetty if all
we're doing is using solr?  We don't use tomcat anywhere else.
-- 
Jonathan Haddad
http://www.rustyrazorblade.com


Re: commit / new searcher delay?

2009-03-05 Thread Shalin Shekhar Mangar
On Thu, Mar 5, 2009 at 10:30 PM, Steve Conover  wrote:

> Yep, I notice the default is true/true, but I explicitly specified
> both those things too and there's no difference in behavior.
>

Perhaps you are indexing on the master and then searching on the slaves? It
may be the delay introduced by replication.
-- 
Regards,
Shalin Shekhar Mangar.


Re: How to search the database tables using solr.

2009-03-05 Thread Shalin Shekhar Mangar
On Thu, Mar 5, 2009 at 10:09 PM, Radha C.  wrote:

> I want to understand fully what is configured and how to configure it, so I
> tried to index my local MySQL DB directly with one simple table called
> persons in it.
> But I am getting a lot of errors. The following are the steps I did:
> 1.   downloaded the solr-2009-03-03.zip distribution and extracted it to
> d:/solrtemp/
> 2.   copied the example/solr to d:/solr ( this is my solr home ) for my
> application template.
> 3.   set this solr home inside catalina-home/localconfig/solr.xml
> 4.   created the data-config under $solrhome/config
> here is my dataconfig file (most element tags were lost in the archive):
>
> <dataConfig>
>   <dataSource url="jdbc:mysql://localhost:3306/test" user="xxx" password="" />
>   [entity and field definitions lost]
> </dataConfig>
>
>  I am not sure what needs to be done next. I am getting errors when I start
> Tomcat.


Just like you write DDL for databases, you need to define a schema in Solr
as well. The missing step is to change solr's schema.xml according to the
fields you want to index. Look at http://wiki.apache.org/solr/SchemaXml for
details.

The errors are a way of warning you that there are certain required fields
in the schema which are never filled by the configuration in the data-config.
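(for example, to match the fields named in the errors below, schema.xml
would need definitions along these lines -- a sketch; the type names refer
to types declared in the example schema:)

    <field name="personid"  type="string" indexed="true" stored="true"/>
    <field name="firstname" type="text"   indexed="true" stored="true"/>
    <field name="lastName"  type="text"   indexed="true" stored="true"/>
    <field name="age"       type="sint"   indexed="true" stored="true"/>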


>
>
> Mar 5, 2009 9:10:28 PM org.apache.solr.handler.dataimport.DataImportHandler
> processConfiguration
> INFO: Processing configuration from solrconfig.xml:
> {config=data-config.xml}
> Mar 5, 2009 9:10:28 PM org.apache.solr.handler.dataimport.DataImporter
> loadDataConfig
> INFO: Data Configuration loaded successfully
> Mar 5, 2009 9:10:28 PM org.apache.solr.handler.dataimport.DataImporter
> verifyWithSchema
> INFO: id is a required field in SolrSchema . But not found in DataConfig
> Mar 5, 2009 9:10:28 PM org.apache.solr.handler.dataimport.DataImportHandler
> inform
> SEVERE: Exception while loading DataImporter
> org.apache.solr.handler.dataimport.DataImportHandlerException: There are
> errors in the Schema
> The field :age present in DataConfig does not have a counterpart in Solr
> Schema
> The field :firstname present in DataConfig does not have a counterpart in
> Solr Schema
> The field :personid present in DataConfig does not have a counterpart in
> Solr Schema
> The field :lastName present in DataConfig does not have a counterpart in
> Solr Schema
>at
>
> org.apache.solr.handler.dataimport.DataImporter.<init>(DataImporter.java:108
> )
>at
>
> org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandle
> r.java:95)
>at
> org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:388)
>at
>
> org.apache.solr.core.SolrCore.<init>(SolrCore.java:571)
> ...
> Mar 5, 2009 9:10:28 PM org.apache.solr.servlet.SolrDispatchFilter init
> SEVERE: Could not start SOLR. Check solr/home property
> org.apache.solr.common.SolrException: FATAL: Could not create importer.
> DataImporter config invalid
>at
>
> org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandle
> r.java:103)
>at
> org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:388)
>
>at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:433)
> Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException:
> There are errors in the Schema
> The field :age present in DataConfig does not have a counterpart in Solr
> Schema
> The field :firstname present in DataConfig does not have a counterpart in
> Solr Schema
> The field :personid present in DataConfig does not have a counterpart in
> Solr Schema
> The field :lastName present in DataConfig does not have a counterpart in
> Solr Schema
>at
>
> org.apache.solr.handler.dataimport.DataImporter.<init>(DataImporter.java:108)
>at
>
> org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandle
> r.java:95)
>... 31 more
> Mar 5, 2009 9:10:28 PM org.apache.solr.core.QuerySenderListener newSearcher
>
> I want to write a small separate application without relying on examples,
> can you provide me some useful steps on how to do that.
>
> Thanks
>
>  _
>
> From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com]
> Sent: Thursday, March 05, 2009 5:16 PM
> To: solr-user@lucene.apache.org; cra...@ceiindia.com
> Subject: Re: How to search the database tables using solr.
>
>
> On Thu, Mar 5, 2009 at 4:42 PM, Radha C.  wrote:
>
>
>
> Hi,
>
> I am newbie for solr search engin. I don't find any juicy information on
> how
> to configure the ORACLE data base to index the tables using solr search
> engin. There are huge documents spread over wiki pages. I need some core
> information.
> I am using Apache-tomcat 5.5.26, and oracle 9i. Can you please provide me a
> brief details for installing and configuring the database with solr and how
> to indexing the rows. Your help will really save lot of my time.
>
>
>
>
Have you been able to run the example/example-DIH demo?

Re: SOLR query.

2009-03-05 Thread Erik Hatcher


On Mar 5, 2009, at 1:07 PM, Suryasnat Das wrote:
I have some queries on SOLR for which I need immediate resolution. Fast
help would be greatly appreciated.

a.) We know that fields are also indexed. So can we index some specific
fields (like author, id, etc) first and then do the indexing for the rest
of the fields (like creation date etc) at a later time.


You have to reindex the entire document in order to add fields to it,  
but you certainly can do so at any time.  In other words, you can just  
add fields to an existing document without sending in all the fields  
you want on that document.


b.) SOLR returns the whole text content of a file during a search operation.
So how can we extract a portion of the whole content? I mean a snippet of
the content containing that search keyword. Sample code would be of great
help.


Use Solr's highlighting capabilities:
http://wiki.apache.org/solr/HighlightingParameters

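For instance, a request with highlighting switched on might look like this
(field name illustrative):

    /solr/select?q=keyword&hl=true&hl.fl=content&hl.snippets=1

The matching snippets come back in a separate highlighting section of the
response, alongside the normal results.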

c.) What is multi core indexing?


Separate Solr/Lucene indexes, that all are served from a single  
instance of Solr.


d.) What is the number of index files that are normally created in an index
operation?


Depends on the number of fields, and how you have the index  
configuration set.  If file handles ever become a problem you can set  
it to use the compound file format, but in practice I've never seen it  
be a problem.



What will be the expected number of index files when I index 4
terabytes of file data, and what will be the index size for all the index
files? If anybody has worked on such huge volumes of data then some
pointers would be of great help.


The rule of thumb is that a Lucene index is roughly 35% the size of  
the original text, assuming you are not storing the fields in Lucene,  
but only indexing it.
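By that rule of thumb, the 4 terabytes of raw text above would come to very
roughly 1.4 TB of index, before any stored fields.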


Erik



SOLR query.

2009-03-05 Thread Suryasnat Das
Hi,

I have some queries on SOLR for which I need immediate resolution. Fast
help would be greatly appreciated.

a.) We know that fields are also indexed. So can we index some specific
fields(like author, id, etc) first and then do the indexing for rest of the
fields(like creation date etc) at a later time.

b.) SOLR returns the whole text content of a file during a search operation.
So how can we extract a portion of the whole content? I mean a snippet of
the content containing that search keyword. Sample code would be of great
help.

c.) What is multi core indexing?

d.) What is the number of index files that are normally created in an index
operation? What will be the expected number of index files when I index 4
terabytes of file data and what will be the index size for all the index
files? If anybody has worked on such huge volumes of data then some pointers
would be of great help.

Regards
Suryasnat Das


solr-ruby facet.method

2009-03-05 Thread Ian Connor
Hi,

Is there a way to specify the facet.method using solr-ruby. I tried to add
it like this:

  hash["facet.method"] = @params[:facets][:method] if
@params[:facets][:method]

to line 78 of standard.rb and it works when you add it to the facets Hash.
However, if there is another place that I could set that would be great.

I am also happy to submit a patch on a ticket if that works.

-- 
Regards,

Ian Connor


Re: Solr and Zend Lucene

2009-03-05 Thread revas
We will be using SQLite for the DB. This can be used for a CD version where
we need to provide search.


On 3/5/09, Grant Ingersoll  wrote:
>
>
> On Mar 5, 2009, at 3:10 AM, revas wrote:
>
> Hi,
>>
>> I have a requirement where I need to search offline. We are thinking of
>> doing
>> this by storing the index terms in a db.
>>
>
> I'm not sure I follow.  How is it that Solr would be offline, but your DB
> would be online?  Can you explain a bit more the problem you are trying to
> solve?
>
>
>
>>
>> Is there a way of accessing the index tokens in solr 1.3 ?
>>
>
> Not in 1.3, but trunk does.  Have a look at the TermsComponent (
> http://wiki.apache.org/solr/TermsComponent).  I suppose if you got things
> in a JSON or binary format, the performance might not be horrible, but it
> will depend on the # of terms in the index.  Or, you could get things in
> stages, i.e. all terms between a and b, etc.  It might be back compatible
> with 1.3, but I don't know for sure.
>
>
> -Grant
>


Re: DataImportHandler and delta-import question

2009-03-05 Thread Garafola Timothy
yes, the dataimport.properties file is present in the conf directory
from previous imports.  I'll try the trunk version as you suggested to
see if the problem persists.

Thanks,
Tim

On Wed, Mar 4, 2009 at 7:54 PM, Noble Paul നോബിള്‍  नोब्ळ्
 wrote:
> the dataimport.properties is created only after one successful import,
> so it is available only from the second import onwards. Probably you can
> create one manually and put it in the conf dir.
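A minimal hand-made dataimport.properties sketch (timestamp illustrative;
the colons are escaped as in any Java properties file):

    #Thu Mar 05 10:15:00 UTC 2009
    last_index_time=2009-03-05 10\:15\:00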
>
> On Thu, Mar 5, 2009 at 12:52 AM, Garafola Timothy  
> wrote:
>> Thanks,
>>
>> I set up a another test instance of solr and ran a full import within
>> the DIH Development Console.  I examined the query and found that
>> last_index_time is not getting set in the query.  Yet the value does
>> get updated after a full import completes (outside of the development
>> console).  Is there some place that I need to set the path to the
>> dataimport.properties file?
>>
>> On Tue, Mar 3, 2009 at 8:03 PM, Noble Paul നോബിള്‍  नोब्ळ्
>>  wrote:
>>> I do not see anything wrong with this .It should have worked . Can you
>>> check that dataimport.properties is created (by DIH) in the conf
>>> directory? . check the content?
>>>
>>>
>>> are you sure that the query
>>>
>>> select DId from 2_Doc where ModifiedDate > '${dataimporter.last_index_time}'
>>>
>>> works with a date format yyyy-MM-dd HH:mm:ss. This is the format
>>> which DIH sends the date in . If the format is wrong you may need to
>>> format it using a dateformat function.
>>>
>>> see here
>>>
>>> http://wiki.apache.org/solr/DataImportHandler#head-5675e913396a42eb7c6c5d3c894ada5dadbb62d7
>>>
>>>
>>>  The trunk DIH can work with Solr1.3 (you may need to put the DIH jar
>>> and slf4j). Can
>>> On Wed, Mar 4, 2009 at 3:53 AM, Garafola Timothy  
>>> wrote:
 I'm using solr 1.3 and am trying to get a delta-import with the DIH.
 Recently the wiki, http://wiki.apache.org/solr/DataImportHandler, was
 updated explaining that delta import is a 1.4 feature now but it was
 still possible to get a delta using the full import example here,
 http://wiki.apache.org/solr/DataImportHandlerFaq#fullimportdelta.  I
 tried this but each time I run DIH, it reimports all rows and updates.

 Below is my data-config.xml.  I set rootEntity to false and issued
 command=full-import&clean=false&optimize=false through DIH.  Am I
 doing something wrong here or is the DataImportHandlerFaq incorrect?

 <dataConfig>
     <dataSource driver="com.mysql.jdbc.Driver"
 url="jdbc:mysql://pencil-somewhere.com:2/SomeDB" user="someUser"
 password="somePassword"/>
     <document>
         <entity name="item" rootEntity="false"
                 query="select DId from 2_Doc where
 ModifiedDate > '${dataimporter.last_index_time}'
                        and DocType != 'Research Articles'">
             <entity transformer="RegexTransformer"
                     query="SELECT d.DId, d.SiteId, d.DocTitle, d.DocURL,
 d.DocDesc, d.DocType, d.Tags, d.Source, d.Last90DaysRFIsPercent,
 d.ModifiedDate, d.DocGuid, d.Author, i.Industry
                            FROM 2_Doc d LEFT OUTER JOIN tmp_DocIndustry i
                            ON (d.DocId=i.DocId AND d.SiteId=i.SiteId)
                            where d.DocType != 'Research articles'
                            and d.DId = '${item.DId}'
                            and d.ModifiedDate > '${dataimporter.last_index_time}'">
                 <field column="DId" name="DId"/>
                 <field column="SiteId" name="SiteId"/>
                 <field column="DocTitle" name="DocTitle"/>
                 <field column="DocURL" name="DocURL"/>
                 <field regex="^(.{0,800})\b.*$" sourceColName="DocDesc" name="DocDesc"/>
                 <field column="DocType" name="DocType"/>
                 <field splitBy=";" sourceColName="Tags" name="Tags"/>
                 <field column="Source" name="Source"/>
                 <field column="Last90DaysRFIsPercent" name="Last90DaysRFIsPercent"/>
                 <field column="ModifiedDate" name="ModifiedDate"/>
                 <field column="DocGuid" name="DocGuid"/>
                 <field column="Author" name="Author"/>
                 <field name="Industry" sourceColName="Industry"/>
             </entity>
         </entity>
     </document>
 </dataConfig>

 Thanks,
 -Tim

>>>
>>>
>>>
>>> --
>>> --Noble Paul
>>>
>>
>>
>>
>> --
>> -Tim
>>
>
>
>
> --
> --Noble Paul
>



-- 
-Tim


Re: Query just for documents with full matching

2009-03-05 Thread Otis Gospodnetic

Hi,

Index titles as "string" type.  But this will completely prevent you from being 
able to match "stop watch" when you search for "stop".  This is where field 
copying can help.
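
For illustration, a minimal schema.xml sketch of that combination (field and
type names illustrative):

    <field name="title"       type="text"   indexed="true" stored="true"/>
    <field name="title_exact" type="string" indexed="true" stored="false"/>
    <copyField source="title" dest="title_exact"/>

Querying title_exact:"Stop" then returns only the 100% identical titles,
while title:Stop keeps matching "Stop watch" as well.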


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
> From: stockm 
> To: solr-user@lucene.apache.org
> Sent: Thursday, March 5, 2009 11:39:26 AM
> Subject: Query just for documents with full matching
> 
> 
> Hello,
> 
> for example there are some documents:
> 
> Doc1 with title "Stop"
> Doc2 with title "Stop watch"
> Doc3 with title "Stop"
> Doc4 with title "Watch"
> Doc5 with title "Watch Stop"
> Doc6 with title "Watch"
> 
> If I search for:  title:"Stop"
> I will get Doc1, Doc3, Doc2, Doc5 (Doc2 and Doc5 with a lower score than
> Doc1, Doc3).
> 
> My question: Is there a way/query to tell solr that I just want to have the
> Docs with 100% identical titles (Doc1, Doc3)?
> 
> Thanks in advance,
> stockm
> -- 
> View this message in context: 
> http://www.nabble.com/Query-just-for-documents-with-full-matching-tp22355359p22355359.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Re: commit / new searcher delay?

2009-03-05 Thread Steve Conover
Yep, I notice the default is true/true, but I explicitly specified
both those things too and there's no difference in behavior.

On Wed, Mar 4, 2009 at 7:39 PM, Shalin Shekhar Mangar
 wrote:
> On Thu, Mar 5, 2009 at 6:06 AM, Steve Conover  wrote:
>
>> I'm doing some testing of a solr master/slave config and find that,
>> after syncing my slave, I need to sleep for about 400ms after commit
>> to "see" the new index state.  i.e. if I don't sleep, and I execute a
>> query, I get results that reflect the prior state of the index.
>>
>
> How are you sending the commit? You should use commit with waitSearcher=true
> and waitFlush=true so that it blocks until the new searcher becomes
> available for querying.
>
>
> --
> Regards,
> Shalin Shekhar Mangar.
>
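
For reference, the explicit form of such a commit is an XML update message,

    <commit waitFlush="true" waitSearcher="true"/>

posted to /solr/update (these are also the defaults); the request should not
return until the new searcher is in place.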


RE: How to search the database tables using solr.

2009-03-05 Thread Radha C.
Shalin,
 
I did not run the examples because in the demo everything is already
configured and built in, so the demo will run properly.
So I am not clear about how the example works and what configuration was
done for it, or where to start if I write a small new database search.
I want to understand fully what is configured and how to configure it, so I
tried to index my local MySQL DB directly with one simple table called
persons in it.
But I am getting a lot of errors. The following are the steps I did:
1.   downloaded the solr-2009-03-03.zip distribution and extracted it to
d:/solrtemp/
2.   copied the example/solr to d:/solr ( this is my solr home ) for my
application template.
3.   set this solr home inside catalina-home/localconfig/solr.xml
4.   created the data-config under $solrhome/config
here is my dataconfig file:
<dataConfig>
  <dataSource driver="com.mysql.jdbc.Driver"
url="jdbc:mysql://localhost:3306/test" user="xxx" password="" />
  <document>
    <entity name="..." query="...">
      <field column="personid" name="personid" />
      <field column="firstname" name="firstname" />
      <field column="lastName" name="lastName" />
      <field column="age" name="age" />
    </entity>
  </document>
</dataConfig>
 
 I am not sure what needs to be done next. I am getting an error when I start
tomcat.
 
Mar 5, 2009 9:10:28 PM org.apache.solr.handler.dataimport.DataImportHandler
processConfiguration
INFO: Processing configuration from solrconfig.xml: {config=data-config.xml}
Mar 5, 2009 9:10:28 PM org.apache.solr.handler.dataimport.DataImporter
loadDataConfig
INFO: Data Configuration loaded successfully
Mar 5, 2009 9:10:28 PM org.apache.solr.handler.dataimport.DataImporter
verifyWithSchema
INFO: id is a required field in SolrSchema . But not found in DataConfig
Mar 5, 2009 9:10:28 PM org.apache.solr.handler.dataimport.DataImportHandler
inform
SEVERE: Exception while loading DataImporter
org.apache.solr.handler.dataimport.DataImportHandlerException: There are
errors in the Schema
The field :age present in DataConfig does not have a counterpart in Solr
Schema
The field :firstname present in DataConfig does not have a counterpart in
Solr Schema
The field :personid present in DataConfig does not have a counterpart in
Solr Schema
The field :lastName present in DataConfig does not have a counterpart in
Solr Schema
at
org.apache.solr.handler.dataimport.DataImporter.<init>(DataImporter.java:108)
at
org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandle
r.java:95)
at
org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:388)
at
org.apache.solr.core.SolrCore.<init>(SolrCore.java:571)
...
Mar 5, 2009 9:10:28 PM org.apache.solr.servlet.SolrDispatchFilter init
SEVERE: Could not start SOLR. Check solr/home property
org.apache.solr.common.SolrException: FATAL: Could not create importer.
DataImporter config invalid
at
org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandle
r.java:103)
at
org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:388)

at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:433)
Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException:
There are errors in the Schema
The field :age present in DataConfig does not have a counterpart in Solr
Schema
The field :firstname present in DataConfig does not have a counterpart in
Solr Schema
The field :personid present in DataConfig does not have a counterpart in
Solr Schema
The field :lastName present in DataConfig does not have a counterpart in
Solr Schema
at
org.apache.solr.handler.dataimport.DataImporter.<init>(DataImporter.java:108)
at
org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandle
r.java:95)
... 31 more
Mar 5, 2009 9:10:28 PM org.apache.solr.core.QuerySenderListener newSearcher
 
I want to write a small separate application without relying on examples,
can you provide me some useful steps on how to do that.
 
Thanks

  _  

From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] 
Sent: Thursday, March 05, 2009 5:16 PM
To: solr-user@lucene.apache.org; cra...@ceiindia.com
Subject: Re: How to search the database tables using solr.


On Thu, Mar 5, 2009 at 4:42 PM, Radha C.  wrote:



Hi,

I am newbie for solr search engin. I don't find any juicy information on how
to configure the ORACLE data base to index the tables using solr search
engin. There are huge documents spread over wiki pages. I need some core
information.
I am using Apache-tomcat 5.5.26, and oracle 9i. Can you please provide me a
brief details for installing and configuring the database with solr and how
to indexing the rows. Your help will really save lot of my time.




Have you been able to run the example/example-DIH demo? If yes, using Oracle
database instead of the HSQLDB used in the example is easy. Substitute the
driver name, username, password with the appropriate values for your
database. Make sure you add the oracle driver's jar file to $solr_home/lib.

-- 
Regards,
Shalin Shekhar Mangar.



Re: Solr and Zend Lucene

2009-03-05 Thread Grant Ingersoll


On Mar 5, 2009, at 3:10 AM, revas wrote:


Hi,

I have a requirement where I need to search offline. We are thinking
of doing this by storing the index terms in a db.


I'm not sure I follow.  How is it that Solr would be offline, but your  
DB would be online?  Can you explain a bit more the problem you are  
trying to solve?






Is there a way of accessing the index tokens in solr 1.3 ?


Not in 1.3, but trunk does.  Have a look at the TermsComponent
(http://wiki.apache.org/solr/TermsComponent).  I suppose if you got things
in a JSON or binary format, the
performance might not be horrible, but it will depend on the # of  
terms in the index.  Or, you could get things in stages, i.e. all  
terms between a and b, etc.  It might be back compatible with 1.3, but  
I don't know for sure.
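
For illustration, pulling one slice of terms with the trunk TermsComponent
might look like this (field name illustrative; parameters as on the wiki
page above):

    /solr/terms?terms.fl=myfield&terms.lower=a&terms.upper=b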



-Grant


Re: Custom Field Type

2009-03-05 Thread Yonik Seeley
On Thu, Mar 5, 2009 at 4:50 AM, Fouad Mardini  wrote:
> Thanks for your help, but I am not really sure I follow.
> It is possible to use the PatternTokenizerFactory with pattern = (\d+)  and
> group = 0 to tokenize the input correctly
> But I don't see how to use the copyField to achieve sorting
>
>  />
> I read the documentation and this does not seem to be possible

copyField myfield -> myfield2
the field type for myfield would keep the first number
the field type for myfield2 would keep the second number

But actually, something like that only works for text field types that
you can specify an analyzer for.  To sort by the integer value, you
need an integer field.

So is there a way for your indexing code to split the numbers before
they are sent to Solr?
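
For example, a minimal client-side sketch of that split (SolrJ; field names
made up for illustration):

    import org.apache.solr.common.SolrInputDocument;

    // value holds the "queryInt sortInt" pair, e.g. "12 34"
    String value = "12 34";
    String[] parts = value.trim().split("\\s+");
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("num_query", Integer.valueOf(parts[0])); // field you query on
    doc.addField("num_sort", Integer.valueOf(parts[1]));  // integer field you sort on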

> Are there any performance implications on using dynamic fields?
> Could you please elaborate on your idea

Very little to none in most situations.

-Yonik
http://www.lucidimagination.com


Re: Column Query with q query Parameter

2009-03-05 Thread Erik Hatcher


On Mar 5, 2009, at 8:31 AM, dabboo wrote:
I am implementing column specific search with q query parameter. I have
achieved the same but field boosting is not working in that.

Below is the query which is getting formed for this URL:

/?q=productURL_s:amit%20OR%20prdMainTitle_s:amitg&version=2.2&start=0&rows=10&indent=on&qt=dismaxrequest


Query:

productURL_s:amit prdMainTitle_s:amitg

It is fetching the records, which matches this criteria but it doesn't
honour the field boosting.
Can someone please tell me what query should be formed in order to get
field boosting running.



Amit - it seems we keep revisiting the same question about q/q.alt/dismax.

Please provide complete details - what does your dismaxrequest handler
config look like?  What does debugQuery=true parsed query output say?


If you're using the dismax parser, your field selections aren't going  
to work, nor is the OR syntax.   For details, look at the dismax and  
solr query parser pages on the Solr wiki and that'll hopefully clarify  
some things.
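
For illustration: with the dismax parser, field boosts normally go in the qf
parameter rather than into q itself, along the lines of (boosts
illustrative):

    /select?q=amit%20amitg&qf=productURL_s^2.0%20prdMainTitle_s&qt=dismaxrequest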


Erik



Re: change the lucene version

2009-03-05 Thread Marc Sturlese

You have to download the source, replace the lucene .jars and recompile it.
But be careful: I tried to run a recent solr nightly build with the official
lucene release 2.4 and got errors in compilation. It was due to some new
features of the IndexDeletionPolicies that are only available in lucene
2.9-dev. Anyway, I haven't experienced bugs so far using 2.9-dev...
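Assuming the stock ant build, the swap amounts to dropping the new
lucene-*.jar files into lib/ in the Solr source tree and rebuilding with
ant clean dist.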

revas wrote:
> 
> Hi,
> 
> If I need to change the lucene version of solr, then how can we do this?
> 
> Regards
> Revas
> 
> 

-- 
View this message in context: 
http://www.nabble.com/change-the-lucene-version-tp22347213p22354056.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: reporting requirement

2009-03-05 Thread Walter Underwood
This sounds like something that should be done with SQL on
a relational database. --wunder

On 3/5/09 1:41 AM, "Ron Chan"  wrote:

> Hi 
> 
> I'm looking to build summary reports, something like
> 
>jan feb mar total
> branch A 
> branch B 
> branch C 
> 
> should I search for the raw data and build the table at the client end?
> 
> or is this better done inside a custom search component?
> 
> thanks
> Ron



Re: public apology for company spam

2009-03-05 Thread Glen Newton
Yonik,

Thank-you for your email. I appreciated and accept your apology.

Indeed the spam was annoying, but I think that you and your colleagues
have significant social capital in the Lucene and Solr communities, so
this minor but unfortunate incident should have minimal impact.

That said, you and your colleagues do not have infinite social
capital, and hopefully you will have no  reason to be forced to spend
this capital in such an unfortunate manner in the future.  :-)

sincerely,

Glen Newton

2009/3/5 Yonik Seeley :
> This morning, an apparently over-zealous marketing firm, on behalf of
> the company I work for, sent out a marketing email to a large number
> of subscribers of the Lucene email lists.  This was done without my
> knowledge or approval, and I can assure you that I'll make all efforts
> to prevent it from happening again.
>
> Sincerest apologies,
> -Yonik
>
> -
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>



-- 



public apology for company spam

2009-03-05 Thread Yonik Seeley
This morning, an apparently over-zealous marketing firm, on behalf of
the company I work for, sent out a marketing email to a large number
of subscribers of the Lucene email lists.  This was done without my
knowledge or approval, and I can assure you that I'll make all efforts
to prevent it from happening again.

Sincerest apologies,
-Yonik


Column Query with q query Parameter

2009-03-05 Thread dabboo

Hi,

I am implementing column specific search with q query parameter. I have
achieved the same but field boosting is not working in that.

Below is the query which is getting formed for this URL:

/?q=productURL_s:amit%20OR%20prdMainTitle_s:amitg&version=2.2&start=0&rows=10&indent=on&qt=dismaxrequest

Query:

productURL_s:amit prdMainTitle_s:amitg

It is fetching the records, which matches this criteria but it doesn't
honour the field boosting.
Can someone please tell me what query should be formed in order to get field
boosting running.

Thanks,
Amit Garg
-- 
View this message in context: 
http://www.nabble.com/Column-Query-with-q-query-Parameter-tp22351678p22351678.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: reporting requirement

2009-03-05 Thread Erick Erickson
I'm really puzzled about *what* you're reporting. Could you add
some detail?

Best
Erick

On Thu, Mar 5, 2009 at 4:41 AM, Ron Chan  wrote:

> Hi
>
> I'm looking to build summary reports, something like
>
>   jan feb mar total
> branch A
> branch B
> branch C
>
> should I search for the raw data and build the table at the client end?
>
> or is this better done inside a custom search component?
>
> thanks
> Ron
>


Re: How to search the database tables using solr.

2009-03-05 Thread Shalin Shekhar Mangar
On Thu, Mar 5, 2009 at 4:42 PM, Radha C.  wrote:

>
> Hi,
>
> I am newbie for solr search engin. I don't find any juicy information on
> how
> to configure the ORACLE data base to index the tables using solr search
> engin. There are huge documents spread over wiki pages. I need some core
> information.
> I am using Apache-tomcat 5.5.26, and oracle 9i. Can you please provide me a
> brief details for installing and configuring the database with solr and how
> to indexing the rows. Your help will really save lot of my time.
>
>
Have you been able to run the example/example-DIH demo? If yes, using Oracle
database instead of the HSQLDB used in the example is easy. Substitute the
driver name, username, password with the appropriate values for your
database. Make sure you add the oracle driver's jar file to $solr_home/lib.

-- 
Regards,
Shalin Shekhar Mangar.


Re: Number of webapps

2009-03-05 Thread Noble Paul നോബിള്‍ नोब्ळ्
running multiple webapps looks like a bad idea. This is the very reason
solr has the multicore feature.

permgen size is a JVM option. I guess it would be something like
 -XX:MaxPermSize

On Wed, Feb 25, 2009 at 1:38 PM, revas  wrote:
> Hi
>
> I am sure this question has been repeated many times over and there has been
> several generic answers ,but i am looking for specific answers.
>
> I have a single server whose configuration i give below,this being the only
> server we have at present ,the requirement is everytime we create a new
> website ,we create two instances for the same one for content search and one
> for product search ,both have faceting requirements.
>
> there are about 25 fields for product schema and about 20 for content schema
> ,we do not store the content in the server ,the content is only indexed.
>
> Assuming that we currently have 10 websites ,which means we have 20 webapps
> running on this server each having about 1000 documents  and size of the
> index is approximately 50mb currently. The index size of each is expected to
> grow continuously as more products are added.
>
>
> We recently got the following error on creation of a new webapp:
>    SEVERE: Caught exception (java.lang.OutOfMemoryError: PermGen space)
> executing org.apache.tomcat.util.net.leaderfollowerworkerthr...@1c2534f,
> terminating thread
> Feb 24, 2009 6:22:16 AM
> org.apache.tomcat.util.threads.ThreadPool$ControlRunnable run
> SEVERE: Caught exception (java.lang.OutOfMemoryError: PermGen space)
> executing org.apache.tomcat.util.net.leaderfollowerworkerthr...@1c2534f,
> terminating thread
>  Sent at 12:32 PM on Wednesday
>
>    What would this mean?Given the above,How many such webapps can we have
> on this server?
>
> *Server config*
>
> OS: Red Hat Enterprise Linux ES 4 - 64 Bit
> # Processor: Dual AMD Opteron Dual Core 270 2.0 GHz
> # 4GB DDR RAM
> # Hard Drive: 73GB SCSI
> # Hard Drive: 73GB SCSI
>
> thanks
>



-- 
--Noble Paul


RE: Solr java client using solrj API

2009-03-05 Thread Radha C.

Thanks, my client will send the query to solr and retrieve the result from solr,
so solrj has these facilities and I can use this API.
 

-Original Message-
From: Noble Paul നോബിള്‍ नोब्ळ् [mailto:noble.p...@gmail.com] 
Sent: Thursday, March 05, 2009 4:55 PM
To: solr-user@lucene.apache.org; cra...@ceiindia.com
Subject: Re: Solr java client using solrj API

if you wish to search the content in your DB, you will need to index it first
to Solr.
the search can be done on Solr and if you are using java , you can use SolrJ 
for that.
--Noble

On Thu, Mar 5, 2009 at 4:29 PM, Radha C.  wrote:
>
> Hi,
>
> We are planning to use solr search server for our database content 
> search, so we have a plan to create our own java client.
> Does solrj API provide the facilities to create java client for our 
> database search? Where I can get the information about the integration 
> of oracle content +solr search server+java client using solrj API. Any 
> help will be appriciated.
>
> Thanks,
>



--
--Noble Paul



Re: Number of webapps

2009-03-05 Thread revas
Hi,

How do I get the info on the current setting of MaxPermSize?

Regards
Sujahta

On 2/27/09, Alexander Ramos Jardim  wrote:
>
> Another simple solution for your requirement is to use multicore. This way
> you will have only one Solr webapp loaded with as many indexes as you need.
>
> See more at http://wiki.apache.org/solr/MultiCore
>
> 2009/2/25 Michael Della Bitta 
>
> > Unfortunately, I think the way this works is the container creates a
> > Classloader for each context and loads the contents of the .war into
> > that, regardless of whether each context references the same .war
> > file. All those classes are stored in permanent generation space, and
> > I'm fairly sure if you restart a context individually with the manager
> > application, a new ClassLoader for the context is created and the
> > permanent generation space the old one was consuming is simply leaked.
> >
> > Something that is crazy enough to work might be to unpack the Solr
> > .war and move all the .jar files and class files that don't contain
> > servlet API classes to .jars in $TOMCAT_HOME/lib, and then repack the
> > .war without these files. These would then be loaded by the common
> > classloader once per container, instead of once per context. You can
> > read more about this classloader business here:
> > http://tomcat.apache.org/tomcat-6.0-doc/class-loader-howto.html (might
> > need a different URL depending on the version of Tomcat you're
> > running).
> >
> > Michael
> >
> > On Wed, Feb 25, 2009 at 11:42 AM, revas  wrote:
> > > thanks will try that .I also have the war file for each solr instance
> in
> > the
> > > home directory of the instance ,would that be the problem ?
> > >
> > > if i were to have common war file for n instances ,would there be any
> > issue?
> > >
> > > regards
> > > revas
> > >
> > > On 2/25/09, Michael Della Bitta  wrote:
> > >>
> > >> It's possible I don't know enough about Solr's internals and there's a
> > >> better solution than this, and it's surprising me that you're running
> > >> out of PermGen space before you're running out of heap, but maybe
> > >> you've already increased the general heap size without tweaking
> > >> PermGen, and loading all the classes involved in loading 20 contexts
> > >> is putting you over. In any case, you might try adding the following
> > >> option to CATALINA_OPTS: -XX:MaxPermSize=256m. If you don't know where
> > >> to put something like that, you might try adding the following line to
> > >> $TOMCAT_HOME/bin/startup.sh:
> > >>
> > >> export CATALINA_OPTS="-XX:MaxPermSize=256m ${CATALINA_OPTS}"
> > >>
> > >> If that value (256) doesn't alleviate the problem, you might try
> > increasing
> > >> it.
> > >>
> > >> Hope that helps,
> > >>
> > >> Michael Della Bitta
> > >>
> > >>
> > >> On Wed, Feb 25, 2009 at 3:08 AM, revas  wrote:
> > >> > Hi
> > >> >
> > >> > I am sure this question has been repeated many times over and there
> > has
> > >> been
> > >> > several generic answers ,but i am looking for specific answers.
> > >> >
> > >> > I have a single server whose configuration i give below,this being
> the
> > >> only
> > >> > server we have at present ,the requirement is everytime we create a
> > new
> > >> > website ,we create two instances for the same one for content search
> > and
> > >> one
> > >> > for product search ,both have faceting requirements.
> > >> >
> > >> > there are about 25 fields for product schema and about 20 for content
> > >> schema
> > >> > ,we do not store the content in the server ,the content is only
> > indexed.
> > >> >
> > >> > Assuming that we currently have 10 websites ,which means we have 20
> > >> webapps
> > >> > running on this server each having about 1000 documents  and size of
> > the
> > >> > index is approximately 50mb currently .The index size of each is
> > expected
> > >> to
> > >> > grow continuously as more products are added.
> > >> >
> > >> >
> > >> > We recently got the following error on creation of a new webapp:
> > >> >SEVERE: Caught exception (java.lang.OutOfMemoryError: PermGen
> > space)
> > >> > executing
> > org.apache.tomcat.util.net.leaderfollowerworkerthr...@1c2534f,
> > >> > terminating thread
> > >> > Feb 24, 2009 6:22:16 AM
> > >> > org.apache.tomcat.util.threads.ThreadPool$ControlRunnable run
> > >> > SEVERE: Caught exception (java.lang.OutOfMemoryError: PermGen space)
> > >> > executing
> > org.apache.tomcat.util.net.leaderfollowerworkerthr...@1c2534f,
> > >> > terminating thread
> > >> >  Sent at 12:32 PM on Wednesday
> > >> >
> > >> >What would this mean?Given the above,How many such webapps can we
> > have
> > >> > on this server?
> > >> >
> > >> > *Server config*
> > >> >
> > >> > OS: Red Hat Enterprise Linux ES 4 - 64 Bit
> > >> > # Processor: Dual AMD Opteron Dual Core 270 2.0 GHz
> > >> > # 4GB DDR RAM
> > >> > # Hard Drive: 73GB SCSI
> > >> > # Hard Drive: 73GB SCSI
> > >> >
> > >> > thanks
> > >> >
> > >>
> > >
> >
>
>
>
> --
> Alexander Ramos Jardim
>


Re: Solr java client using solrj API

2009-03-05 Thread Noble Paul നോബിള്‍ नोब्ळ्
if you wish to search the content in your DB, you will need to index
it first to Solr.
the search can be done on Solr and if you are using java , you can use
SolrJ for that.
--Noble
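
A minimal SolrJ sketch of that index-then-search flow (Solr 1.3 API; server
URL and field names illustrative):

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.common.SolrInputDocument;

    public class DbSearchClient {
        public static void main(String[] args) throws Exception {
            SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");

            // Index one row fetched from the database (values illustrative).
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "row-1");
            doc.addField("name", "some database content");
            server.add(doc);
            server.commit();

            // Search it back.
            QueryResponse rsp = server.query(new SolrQuery("name:content"));
            System.out.println(rsp.getResults().getNumFound() + " hits");
        }
    }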

On Thu, Mar 5, 2009 at 4:29 PM, Radha C.  wrote:
>
> Hi,
>
> We are planning to use solr search server for our database content search,
> so we have a plan to create our own java client.
> Does solrj API provide the facilities to create java client for our database
> search? Where I can get the information about the integration of oracle
> content +solr search server+java client using solrj API. Any help will be
> appriciated.
>
> Thanks,
>



-- 
--Noble Paul


RE: How to search the database tables using solr.

2009-03-05 Thread Radha C.

Hi,

I am a newbie to the solr search engine. I don't find any juicy information
on how to configure the ORACLE database to index the tables using the solr
search engine. There are huge documents spread over wiki pages. I need some
core information.
I am using Apache-tomcat 5.5.26 and oracle 9i. Can you please provide me
brief details for installing and configuring the database with solr and how
to index the rows. Your help will really save a lot of my time.

-Original Message-
From: Venu Mittal [mailto:metale...@yahoo.com] 
Sent: Thursday, March 05, 2009 1:14 PM
To: solr-user@lucene.apache.org
Subject: Re: How to search the database tables using solr.

Does anybody have any stats to share on how much time DataImportHandler
takes to index a given set of data?

I am currently indexing 18 million rows in 1.5 - 2 hours by sending xmls to
solr. 




From: Shalin Shekhar Mangar 
To: solr-user@lucene.apache.org; cra...@ceiindia.com
Sent: Wednesday, March 4, 2009 8:15:07 AM
Subject: Re: How to search the database tables using solr.

On Wed, Mar 4, 2009 at 7:51 PM, Radha C.  wrote:

> Thanks Shalin,
>
> We just stepped on solr. This information is very much useful for me. 
> But before that I want some clear details about where to start..
> I want to test this in my local environment, so I need some basic 
> information about how to start using this ( database and solr ). Do 
> you have some information on this?
>

I think the easiest way is to start using Solr is with the embedded jetty
container. Modify the example/conf/schema.xml file and add your own fields
etc. Read through the DataImportHandler wiki page and at the
example/example-DIH directory in the solr zip/tarball.

If you have a specific doubt/question, ask on the list.

--
Regards,
Shalin Shekhar Mangar.



  



Solr java client using solrj API

2009-03-05 Thread Radha C.
 
Hi,
 
We are planning to use solr search server for our database content search,
so we have a plan to create our own java client.
Does solrj API provide the facilities to create java client for our database
search? Where I can get the information about the integration of oracle
content +solr search server+java client using solrj API. Any help will be
appriciated.
 
Thanks,


Re: How to search the database tables using solr.

2009-03-05 Thread Noble Paul നോബിള്‍ नोब्ळ्
it depends on how fast your DB can give out data through JDBC. The
best thing is to just run and see.
--Noble

On Thu, Mar 5, 2009 at 1:13 PM, Venu Mittal  wrote:
> Does anybody have any stats to share on how much time DataImportHandler
> takes to index a given set of data?
>
> I am currently indexing 18 million rows in 1.5 - 2 hours by sending xmls to
> solr.
>
>
>
> 
> From: Shalin Shekhar Mangar 
> To: solr-user@lucene.apache.org; cra...@ceiindia.com
> Sent: Wednesday, March 4, 2009 8:15:07 AM
> Subject: Re: How to search the database tables using solr.
>
> On Wed, Mar 4, 2009 at 7:51 PM, Radha C.  wrote:
>
>> Thanks Shalin,
>>
>> We just stepped on solr. This information is very much useful for me. But
>> before that I want some clear details about where to start..
>> I want to test this in my local environment, so I need some basic
>> information about how to start using this ( database and solr ). Do you
>> have
>> some information on this?
>>
>
> I think the easiest way is to start using Solr is with the embedded jetty
> container. Modify the example/conf/schema.xml file and add your own fields
> etc. Read through the DataImportHandler wiki page and at the
> example/example-DIH directory in the solr zip/tarball.
>
> If you have a specific doubt/question, ask on the list.
>
> --
> Regards,
> Shalin Shekhar Mangar.
>
>
>
>



-- 
--Noble Paul


Admin stats using SolrJ

2009-03-05 Thread nikhil500

Hi,

I use Solr 1.3 through SolrJ.  I want to access the statistics which are
displayed at /admin/ in the default Solr install.  Is there some way of
getting those statistics through SolrJ.  I tried
query.setQueryType("admin"); in code and renamed the "/admin/"
requesthandler in solrconfig.xml to just "admin" but it does not seem to
work.  (Says " org.apache.solr.common.SolrException: The AdminHandler needs
to be registered to a path.  Typically this is '/admin' ")

Any tips?

Thanks,
Nikhil
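
One workaround sketch: Solr 1.3 also exposes these statistics as XML at
/admin/stats.jsp, so they can be fetched over plain HTTP and parsed,
bypassing the request-handler registration issue (URL illustrative):

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URL;

    public class StatsFetch {
        public static void main(String[] args) throws Exception {
            URL url = new URL("http://localhost:8983/solr/admin/stats.jsp");
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(url.openStream(), "UTF-8"));
            for (String line; (line = in.readLine()) != null; ) {
                System.out.println(line); // raw stats XML; parse as needed
            }
            in.close();
        }
    }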

-- 
View this message in context: 
http://www.nabble.com/Admin-stats-using-SolrJ-tp22348609p22348609.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Custom Field Type

2009-03-05 Thread Fouad Mardini
Hello Yonik,

Thanks for your help, but I am not really sure I follow.
It is possible to use the PatternTokenizerFactory with pattern = (\d+)  and
group = 0 to tokenize the input correctly
But I don't see how to use the copyField to achieve sorting


I read the documentation and this does not seem to be possible

Are there any performance implications on using dynamic fields?
Could you please elaborate on your idea

Thanks again
/Fouad


On Wed, Mar 4, 2009 at 8:12 PM, Yonik Seeley wrote:

> On Wed, Mar 4, 2009 at 12:24 PM, Fouad Mardini 
> wrote:
> > I have a multivalued field in my schema of type text_ws, values are of
> the
> > form #int #int
> > I need to be able to query on the first and sort on the second, this does
> > not seem to be enabled out of the box
>
> Can you put the two numbers in separate fields for this purpose?
> If you can't do it from the indexer, a schema with copyField in
> conjunction with PatternTokenizerFactory could do it.
>
> -Yonik
> http://www.lucidimagination.com
>


reporting requirement

2009-03-05 Thread Ron Chan
Hi 

I'm looking to build summary reports, something like 

   jan feb mar total
branch A 
branch B 
branch C 

should I search for the raw data and build the table at the client end?

or is this better done inside a custom search component?

thanks
Ron


Re: Column Specific Query with q parameter

2009-03-05 Thread dabboo

Hi,

Can somebody please give me an example as how to achieve it.

Thanks,
Amit Garg

dabboo wrote:
> 
> Hi,
> 
> I am implementing column specific query with q query parameter. for e.g.
> 
> ?q=prdMainTitle_product_s:math & qt=dismaxrequest
> 
> The above query doesnt work while if I use the same query with q.alt
> parameter, it works.
> 
> ?q=&q.alt= prdMainTitle_product_s:math & qt=dismaxrequest
> 
> Please suggest, how to achieve this with q query.
> 
> Thanks,
> Amit Garg
> 

-- 
View this message in context: 
http://www.nabble.com/Column-Specific-Query-with-q-parameter-tp22345960p22348016.html
Sent from the Solr - User mailing list archive at Nabble.com.



change the lucene version

2009-03-05 Thread revas
Hi,

If I need to change the lucene version of solr, then how can we do this?

Regards
Revas


Solr and Zend Lucene

2009-03-05 Thread revas
Hi,

I have a requirement where I need to search offline. We are thinking of doing
this by storing the index terms in a db.

Is there a way of accessing the index tokens in solr 1.3?

The other way is to use Zend_lucene to read the index file of solr, as zend
lucene has a method for doing this. But Zend lucene is not able to
open the solr index files, the error being unsupported format.

The final option is to reindex using zend lucene and read the index tokens,
but then facets are not supported by zend-lucene.

Anybody done something similar? Please give your thoughts or pointers.

Regards
Revas