Re: Who is running 1.4 nightly in production?

2009-05-12 Thread Jaco
Running 1.4 nightly in production as well, also for the Java replication and
for the improved facet count algorithms. No problems, all running smoothly.

Bye,

Jaco.

2009/5/13 Erik Hatcher 

> We run a not too distant trunk (1.4, probably a month or so ago) version of
> Solr on LucidFind at http://www.lucidimagination.com/search
>
>Erik
>
> On May 12, 2009, at 5:02 PM, Walter Underwood wrote:
>
>  We're planning our move to 1.4, and want to run one of our production
>> servers with the new code. Just to feel better about it, is anyone else
>> running 1.4 in production?
>>
>> I'm building 2009-05-11 right now.
>>
>> wuner
>>
>
>


master/slave failure scenario

2009-05-12 Thread nk 11
Hello

I'm kind of new to Solr and I've read about replication, and the fact that a
node can act as both master and slave.
If a replica fails and then comes back online, I suppose it will resync
with the master.

But what happens if the master fails? Will a slave that is configured as a
master kick in? What if that slave is not yet fully synced with the failed
master and has old data?

What happens when the original master comes back online? Will it remain a
slave because there is another node with the master role?

Thank you!


RE: Solr Logging issue

2009-05-12 Thread Sagar Khetkade

 

I have only one log4j.properties file in the classpath, and even if I configure
it for the particular package where the Solr exception comes from, the issue is
the same. I removed the logger for my application and am using log4j only for
Solr logging.

 

~Sagar

 


 
> Date: Tue, 12 May 2009 09:59:01 -0700
> Subject: Re: Solr Logging issue
> From: jayallenh...@gmail.com
> To: solr-user@lucene.apache.org
> 
> Usually that means there is another log4j.properties or log4j.xml file in
> your classpath that is being found before the one you are intending to use.
> Check your classpath for other versions of these files.
> 
> -Jay
> 
> 
> On Tue, May 12, 2009 at 3:38 AM, Sagar Khetkade
> wrote:
> 
> >
> > Hi,
> > I have solr implemented in multi-core scenario and also implemented
> > solr-560-slf4j.patch for implementing the logging. But the problem I am
> > facing is that the logs are going to the stdout.log file not the log file
> > that I have mentioned in the log4j.properties file. Can anybody give me work
> > round to make logs go into the logger mentioned in log4j.properties file.
> > Thanks in advance.
> >
> > Regards,
> > Sagar Khetkade
> >

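A quick way to run the check Jay suggests above -- a minimal sketch, assuming
a Unix shell and that the webapp's jars live under WEB-INF/lib (adjust the
path to your deployment):

    # list every jar that bundles its own log4j configuration
    for jar in /usr/local/tomcat/webapps/solr/WEB-INF/lib/*.jar; do
      unzip -l "$jar" | grep -Eq 'log4j\.(properties|xml)' && echo "$jar"
    done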

Re: Replication master+slave

2009-05-12 Thread Jian Han Guo
I was looking at the same problem, and had a discussion with Noble. You can
use a hack to achieve what you want, see

https://issues.apache.org/jira/browse/SOLR-1154

Thanks,

Jianhan


On Tue, May 12, 2009 at 5:13 PM, Bryan Talbot wrote:

> So how are people managing solrconfig.xml files which are largely the same
> other than differences for replication?
>
> I don't think it's a "good thing" to maintain two copies of the same file
> and I'd like to avoid that.  Maybe enabling the XInclude feature in
> DocumentBuilders would make it possible to modularize configuration files to
> make this possible?
>
>
> http://java.sun.com/j2se/1.5.0/docs/api/javax/xml/parsers/DocumentBuilderFactory.html#setXIncludeAware(boolean)
>
>
> -Bryan
>
>
>
>
>
> On May 12, 2009, at May 12, 11:43 AM, Shalin Shekhar Mangar wrote:
>
>  On Tue, May 12, 2009 at 10:42 PM, Bryan Talbot > >wrote:
>>
>>  For replication in 1.4, the wiki at
>>> http://wiki.apache.org/solr/SolrReplication says that a node can be both
>>> the master and a slave:
>>>
>>> A node can act as both master and slave. In that case both the master and
>>> slave configuration lists need to be present inside the
>>> ReplicationHandler
>>> requestHandler in the solrconfig.xml.
>>>
>>> What does this mean?  Does the core then poll itself for updates?
>>>
>>
>>
>> No. This type of configuration is meant for "repeaters". Suppose there are
>> slaves in multiple data-centers (say data center A and B). There is always
>> a
>> single master (say in A). One of the slaves in B is used as a master for
>> the
>> other slaves in B. Therefore, this one slave in B is both a master as well
>> as the slave.
>>
>>
>>
>>> I'd like to have a single set of configuration files that are shared by
>>> masters and slaves and avoid duplicating configuration details in
>>> multiple
>>> files (one for master and one for slave) to ease management and failover.
>>> Is this possible?
>>>
>>>
>> You wouldn't want the master to be a slave. So I guess you'd need to have
>> a
>> separate file. Also, it needs to be a separate file so that the slave does
>> not become a master when the solrconfig.xml is replicated.
>>
>>
>>
>>> When I attempt to set up a multi-server master-slave configuration and
>>> include both master and slave replication configuration options, I run
>>> into some problems.  I'm running a nightly build from May 7.
>>>
>>>
>> Not sure what happened. Is that the url for this solr (meaning same solr
>> url
>> is master and slave of itself)? If yes, that is not a valid configuration.
>>
>> --
>> Regards,
>> Shalin Shekhar Mangar.
>>
>
>
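
For reference, a minimal sketch of the XInclude idea discussed above, assuming
XInclude support is enabled on the DocumentBuilderFactory (the file name is
hypothetical; the included file's root element replaces the xi:include):

    <!-- solrconfig.xml -->
    <config xmlns:xi="http://www.w3.org/2001/XInclude">
      <!-- ... settings shared by master and slave ... -->
      <!-- per-node file containing only the replication requestHandler -->
      <xi:include href="replication-role.xml"/>
    </config>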


RE: Selective Searches Based on User Identity

2009-05-12 Thread Terence Gannon
In reply to both Matt and Jay's comments, the particular situation I'm
dealing with is one where rights will change relatively little once
they are established.  Typically a document will be loaded and
indexed, and a decision will be made on sharing that more-or-less
immediately.  It might change a couple of times after that, but that
will be it.  So early-binding seems like the better option.  Thanks to
both of you for your suggestions and help.

Terence

PS. I wish I had known about that conference...looks like it would
have been very helpful to me right now!

-Original Message-
From: Matt Weber [mailto:m...@mattweber.org]
Sent: May 12, 2009 14:41
To: solr-user@lucene.apache.org
Subject: Re: Selective Searches Based on User Identity



Here is a good presentation on search security from the Infonortics

Search Conference that was held a few weeks ago.



http://www.infonortics.com/searchengines/sh09/slides/kehoe.pdf



The approach you are using is called early-binding.  As Jay mentioned,

one of the downsides is updating the documents each time you have an

ACL change.  You could use the late-binding approach that checks each

result after the query but before you display to the user.  I don't

recommend this approach because it will strain your security

infrastructure because you will need to check if the user can access

each result.



Good luck.



Thanks,



Matt Weber

eSr Technologies

http://www.esr-technologies.com


Re: how to manually add data to indexes generated by nutch-1.0 using solr

2009-05-12 Thread Erik Hatcher
send a <commit/> request afterwards, or you can add ?commit=true to
the /update request with the adds.


Erik
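
A sketch of both options, using the stock example URL (the <add> body is
elided here):

    # option 1: send a separate commit request after the adds
    curl http://localhost:8983/solr/update -H "Content-Type: text/xml" \
         --data-binary '<commit/>'

    # option 2: commit as part of the update request itself
    curl 'http://localhost:8983/solr/update?commit=true' \
         -H "Content-Type: text/xml" --data-binary '<add><doc>...</doc></add>'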

On May 12, 2009, at 8:57 PM, alx...@aim.com wrote:



Tried to add a new record using



curl http://localhost:8983/solr/update -H "Content-Type: text/xml" --data-binary '


20090512170318
86937aaee8e748ac3007ed8b66477624
0.21189615
test.com
test test
 20090513003210909
 '

I get



<int name="status">0</int><int name="QTime">71</int>




and added records are not found in the search.

Any ideas what went wrong?


Thanks.
Alex.




-Original Message-
From: alx...@aim.com
To: solr-user@lucene.apache.org
Sent: Mon, 11 May 2009 12:14 pm
Subject: how to manually add data to indexes generated by nutch-1.0  
using solr











Hello,

I had Nutch-1.0 crawl, fetch, and index a lot of files. Then I needed to
index a few files also. But I know keywords for those files and their
locations. I need to add them manually. I took a look at two tutorials on
the wiki, but did not find any info about this issue.
Is there a tutorial on a step-by-step procedure for adding data to a nutch
index using solr manually?

Thanks in advance.
Alex.









Re: Who is running 1.4 nightly in production?

2009-05-12 Thread Erik Hatcher
We run a not too distant trunk (1.4, probably a month or so ago)  
version of Solr on LucidFind at http://www.lucidimagination.com/search


Erik

On May 12, 2009, at 5:02 PM, Walter Underwood wrote:


We're planning our move to 1.4, and want to run one of our production
servers with the new code. Just to feel better about it, is anyone  
else

running 1.4 in production?

I'm building 2009-05-11 right now.

wuner




Re: how to manually add data to indexes generated by nutch-1.0 using solr

2009-05-12 Thread alxsss

 Tried to add a new record using



 curl http://localhost:8983/solr/update -H "Content-Type: text/xml" 
--data-binary '

20090512170318
86937aaee8e748ac3007ed8b66477624
0.21189615
test.com
test test
 20090513003210909
 '

I get



<int name="status">0</int><int name="QTime">71</int>



and added records are not found in the search.

Any ideas what went wrong?


Thanks.
Alex.


 

-Original Message-
From: alx...@aim.com
To: solr-user@lucene.apache.org
Sent: Mon, 11 May 2009 12:14 pm
Subject: how to manually add data to indexes generated by nutch-1.0 using solr










Hello,

I had Nutch-1.0 crawl, fetch, and index a lot of files. Then I needed to
index a few files also. But I know keywords for those files and their
locations. I need to add them manually. I took a look at two tutorials on the
wiki, but did not find any info about this issue.
Is there a tutorial on a step-by-step procedure for adding data to a nutch index
using solr manually?

Thanks in advance.
Alex.



 



Re: Replication master+slave

2009-05-12 Thread Bryan Talbot
So how are people managing solrconfig.xml files which are largely the  
same other than differences for replication?


I don't think it's a "good thing" to maintain two copies of the same  
file and I'd like to avoid that.  Maybe enabling the XInclude feature  
in DocumentBuilders would make it possible to modularize configuration  
files to make this possible?


http://java.sun.com/j2se/1.5.0/docs/api/javax/xml/parsers/DocumentBuilderFactory.html#setXIncludeAware(boolean)


-Bryan




On May 12, 2009, at May 12, 11:43 AM, Shalin Shekhar Mangar wrote:

On Tue, May 12, 2009 at 10:42 PM, Bryan Talbot  
wrote:



For replication in 1.4, the wiki at
http://wiki.apache.org/solr/SolrReplication says that a node can be  
both

the master and a slave:

A node can act as both master and slave. In that case both the  
master and
slave configuration lists need to be present inside the  
ReplicationHandler

requestHandler in the solrconfig.xml.

What does this mean?  Does the core then poll itself for updates?



No. This type of configuration is meant for "repeaters". Suppose  
there are
slaves in multiple data-centers (say data center A and B). There is  
always a
single master (say in A). One of the slaves in B is used as a master  
for the
other slaves in B. Therefore, this one slave in B is both a master  
as well

as the slave.




I'd like to have a single set of configuration files that are  
shared by
masters and slaves and avoid duplicating configuration details in  
multiple
files (one for master and one for slave) to ease management and  
failover.

Is this possible?



You wouldn't want the master to be a slave. So I guess you'd need to  
have a
separate file. Also, it needs to be a separate file so that the  
slave does

not become a master when the solrconfig.xml is replicated.




When I attempt to set up a multi-server master-slave configuration and
include both master and slave replication configuration options, I run
into some problems.  I'm running a nightly build from May 7.



Not sure what happened. Is that the url for this solr (meaning same  
solr url
is master and slave of itself)? If yes, that is not a valid  
configuration.


--
Regards,
Shalin Shekhar Mangar.




camel-casing and dismax troubles

2009-05-12 Thread Geoffrey Young
hi all :)

I'm having trouble with camel-cased query strings and the dismax handler.

a user query

 LeAnn Rimes

isn't matching the indexed term

 Leann Rimes

even though both are lower-cased in the end.  furthermore, the
analysis tool shows a match.

the debug query looks like

 "parsedquery":"+((DisjunctionMaxQuery((search-en:\"(leann le)
ann\")) DisjunctionMaxQuery((search-en:rimes)))~2) ()",

I have a feeling it's due to how the broken up tokens are added back
into the token stream with PreserveOriginal, and some strange
interaction between that order and dismax, but I'm not entirely sure.

configs follow.  thoughts appreciated.
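
(The configs below were stripped by the list archive; as a sketch of the kind
of index-time chain the parsed query above suggests -- assumed, not the
poster's actual configuration:

    <fieldType name="text_split" class="solr.TextField">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="1" catenateWords="1" preserveOriginal="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>

For "LeAnn" this yields the stacked tokens (leann, le, ann) seen in the parsed
query.)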

--Geoff

[fieldType and analyzer definitions stripped by the list archive]


Re: Who is running 1.4 nightly in production?

2009-05-12 Thread Matthew Runo
We're using 1.4-dev 749558:749756M that we built on 2009-03-03  
13:10:05 for our master/slave production environment using the Java  
Replication code.


Thanks for your time!

Matthew Runo
Software Engineer, Zappos.com
mr...@zappos.com - 702-943-7833

On May 12, 2009, at 2:02 PM, Walter Underwood wrote:


We're planning our move to 1.4, and want to run one of our production
servers with the new code. Just to feel better about it, is anyone  
else

running 1.4 in production?

I'm building 2009-05-11 right now.

wuner





Who is running 1.4 nightly in production?

2009-05-12 Thread Walter Underwood
We're planning our move to 1.4, and want to run one of our production
servers with the new code. Just to feel better about it, is anyone else
running 1.4 in production?

I'm building 2009-05-11 right now.

wuner



Re: Selective Searches Based on User Identity

2009-05-12 Thread Matt Weber
Here is a good presentation on search security from the Infonortics  
Search Conference that was held a few weeks ago.


http://www.infonortics.com/searchengines/sh09/slides/kehoe.pdf

The approach you are using is called early-binding.  As Jay mentioned,  
one of the downsides is updating the documents each time you have an  
ACL change.  You could use the late-binding approach that checks each  
result after the query but before you display to the user.  I don't  
recommend this approach because it will strain your security  
infrastructure because you will need to check if the user can access  
each result.


Good luck.

Thanks,

Matt Weber
eSr Technologies
http://www.esr-technologies.com




On May 12, 2009, at 1:21 PM, Jay Hill wrote:

The only downside would be that you would have to update a document  
anytime
a user was granted or denied access. You would have to query before  
the
update to get the current values for grantedUID and deniedUID,  
remove/add
values, and update the index. If you don't have a lot of changes in  
the
system that wouldn't be a big deal, but if a lot of changes are  
happening

throughout the day you might have to queue requests and batch them.

-Jay

On Tue, May 12, 2009 at 1:05 PM, Matt Weber   
wrote:


I also work with the FAST Enterprise Search engine and this is  
exactly how
their Security Access Module works.  They actually use a modified  
base-32
encoded value for indexing, but that is because they don't have the  
luxury

of untokenized/un-processed String fields like Solr.

Thanks,

Matt Weber
eSr Technologies
http://www.esr-technologies.com





On May 12, 2009, at 12:26 PM, Terence Gannon wrote:

Paul -- thanks for the reply, I appreciate it.  That's a very  
practical
approach, and is worth taking a closer look at.  Actually, taking  
your

idea
one step further, perhaps three fields; 1) ownerUid (uid of the  
document's
owner) 2) grantedUid (uid of users who have been granted access),  
and 3)

deniedUid (uid of users specifically denied access to the document).
These
fields, coupled with some business rules around how they were  
populated

should cover off all possibilities I think.

Access to the Solr instance would have to be tightly controlled, but
that's
something that should be done anyway.  You sure wouldn't want end  
users
preparing their own XML and throwing it at Solr -- it would be  
pretty easy
to figure out how to get around the access/denied fields and get  
at stuff

the owner didn't intend.

This approach mimics to some degree what is being done in the  
operating
system, but it's still elegant and provides the level of control  
required.
Anybody else have any thoughts in this regard?  Has anybody  
implemented
anything similar, and if so, how did it work?  Thanks, and best  
regards...


Terence








RE: Selective Searches Based on User Identity

2009-05-12 Thread Terence Gannon
Thanks for the tip.  I went to their website (www.fastsearch.com), and got
as far as the second line, top left 'A Microsoft Subsidiary'...at which
point, hopes of it being another open source solution quickly faded. ;-)
Seriously, though, it looks like an interesting product, but open source is
a mandatory requirement for my particular application.  But the fact they
implemented this functionality would seem to support that it's a valid
requirement, and I'll keep plugging away on it.  Thank you very much for
bringing FAST to my attention...I appreciate it!  Best regards...

Terence



-Original Message-
From: Matt Weber [mailto:m...@mattweber.org]
Sent: May 12, 2009 14:06
To: solr-user@lucene.apache.org
Subject: Re: Selective Searches Based on User Identity



I also work with the FAST Enterprise Search engine and this is exactly

how their Security Access Module works.  They actually use a modified

base-32 encoded value for indexing, but that is because they don't

have the luxury of untokenized/un-processed String fields like Solr.



Thanks,



Matt Weber

eSr Technologies

http://www.esr-technologies.com


Re: Selective Searches Based on User Identity

2009-05-12 Thread Jay Hill
The only downside would be that you would have to update a document anytime
a user was granted or denied access. You would have to query before the
update to get the current values for grantedUID and deniedUID, remove/add
values, and update the index. If you don't have a lot of changes in the
system that wouldn't be a big deal, but if a lot of changes are happening
throughout the day you might have to queue requests and batch them.

-Jay

On Tue, May 12, 2009 at 1:05 PM, Matt Weber  wrote:

> I also work with the FAST Enterprise Search engine and this is exactly how
> their Security Access Module works.  They actually use a modified base-32
> encoded value for indexing, but that is because they don't have the luxury
> of untokenized/un-processed String fields like Solr.
>
> Thanks,
>
> Matt Weber
> eSr Technologies
> http://www.esr-technologies.com
>
>
>
>
>
> On May 12, 2009, at 12:26 PM, Terence Gannon wrote:
>
>  Paul -- thanks for the reply, I appreciate it.  That's a very practical
>> approach, and is worth taking a closer look at.  Actually, taking your
>> idea
>> one step further, perhaps three fields; 1) ownerUid (uid of the document's
>> owner) 2) grantedUid (uid of users who have been granted access), and 3)
>> deniedUid (uid of users specifically denied access to the document).
>>  These
>> fields, coupled with some business rules around how they were populated
>> should cover off all possibilities I think.
>>
>> Access to the Solr instance would have to be tightly controlled, but
>> that's
>> something that should be done anyway.  You sure wouldn't want end users
>> preparing their own XML and throwing it at Solr -- it would be pretty easy
>> to figure out how to get around the access/denied fields and get at stuff
>> the owner didn't intend.
>>
>> This approach mimics to some degree what is being done in the operating
>> system, but it's still elegant and provides the level of control required.
>> Anybody else have any thoughts in this regard?  Has anybody implemented
>> anything similar, and if so, how did it work?  Thanks, and best regards...
>>
>> Terence
>>
>
>


Re: Selective Searches Based on User Identity

2009-05-12 Thread Matt Weber
I also work with the FAST Enterprise Search engine and this is exactly  
how their Security Access Module works.  They actually use a modified  
base-32 encoded value for indexing, but that is because they don't  
have the luxury of untokenized/un-processed String fields like Solr.


Thanks,

Matt Weber
eSr Technologies
http://www.esr-technologies.com




On May 12, 2009, at 12:26 PM, Terence Gannon wrote:

Paul -- thanks for the reply, I appreciate it.  That's a very  
practical
approach, and is worth taking a closer look at.  Actually, taking  
your idea
one step further, perhaps three fields; 1) ownerUid (uid of the  
document's
owner) 2) grantedUid (uid of users who have been granted access),  
and 3)
deniedUid (uid of users specifically denied access to the  
document).  These
fields, coupled with some business rules around how they were  
populated

should cover off all possibilities I think.

Access to the Solr instance would have to be tightly controlled, but  
that's
something that should be done anyway.  You sure wouldn't want end  
users
preparing their own XML and throwing it at Solr -- it would be  
pretty easy
to figure out how to get around the access/denied fields and get at  
stuff

the owner didn't intend.

This approach mimics to some degree what is being done in the  
operating
system, but it's still elegant and provides the level of control  
required.
Anybody else have any thoughts in this regard?  Has anybody  
implemented
anything similar, and if so, how did it work?  Thanks, and best  
regards...


Terence




Re: error when seting queryResultWindowSize to zero

2009-05-12 Thread Yonik Seeley
On Tue, May 12, 2009 at 3:03 PM, Marc Sturlese  wrote:
> I have seen that if I set the value of queryResultWindowSize to 0 in
> solrconfig.xml, Solr will return a divide-by-zero error.

Seems like a configuration error since requesting that results be
retrieved in 0 size chunks doesn't make a lot of sense.

> Checking the source I have seen it can be fixed in SolrIndexSearcher. At the
> end of the function getDocListC it's coded:
>
>        if (maxDocRequested < queryResultWindowSize) {
>          supersetMaxDoc=queryResultWindowSize;
>        } else {
>          supersetMaxDoc = ((maxDocRequested -1)/queryResultWindowSize +
> 1)*queryResultWindowSize;
>          if (supersetMaxDoc < 0) supersetMaxDoc=maxDocRequested;
>        }
>
> I have sorted it out by doing (just adding parentheses):
>
>        if (maxDocRequested < queryResultWindowSize) {
>          supersetMaxDoc=queryResultWindowSize;
>        } else {
>          supersetMaxDoc = ((maxDocRequested -1)/(queryResultWindowSize +
> 1))*queryResultWindowSize;
>          if (supersetMaxDoc < 0) supersetMaxDoc=maxDocRequested;
>        }
>
> I have seen this is happening in a recent trunk. Is my fix correct?

The +1 really needs to be after the divide (we're rounding up).

If a fix is needed, I imagine it would be at the time that config
parameter is read... if it's less than or equal to 0, then set it to
1.

-Yonik
http://www.lucidimagination.com
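
A sketch of the guard Yonik describes, applied where the config value is read
(variable and accessor names here are illustrative, not the actual Solr code):

    // clamp a non-positive configured value up to 1 at config-read time
    int queryResultWindowSize = Math.max(1,
        solrConfig.getInt("query/queryResultWindowSize", 1));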


RE: Selective Searches Based on User Identity

2009-05-12 Thread Terence Gannon
Paul -- thanks for the reply, I appreciate it.  That's a very practical
approach, and is worth taking a closer look at.  Actually, taking your idea
one step further, perhaps three fields; 1) ownerUid (uid of the document's
owner) 2) grantedUid (uid of users who have been granted access), and 3)
deniedUid (uid of users specifically denied access to the document).  These
fields, coupled with some business rules around how they were populated
should cover off all possibilities I think.

Access to the Solr instance would have to be tightly controlled, but that's
something that should be done anyway.  You sure wouldn't want end users
preparing their own XML and throwing it at Solr -- it would be pretty easy
to figure out how to get around the access/denied fields and get at stuff
the owner didn't intend.

This approach mimics to some degree what is being done in the operating
system, but it's still elegant and provides the level of control required.
 Anybody else have any thoughts in this regard?  Has anybody implemented
anything similar, and if so, how did it work?  Thanks, and best regards...

Terence
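
A minimal sketch of how the three-field scheme could look, using the field
names proposed above (untokenized string fields; the filter below is a
standard Lucene-syntax filter query with the requesting user's uid
substituted in):

    <!-- schema.xml -->
    <field name="ownerUid"   type="string" indexed="true" stored="true"/>
    <field name="grantedUid" type="string" indexed="true" stored="true"
           multiValued="true"/>
    <field name="deniedUid"  type="string" indexed="true" stored="true"
           multiValued="true"/>

    <!-- at query time, append something like: -->
    <!-- fq=+(ownerUid:u42 grantedUid:u42) -deniedUid:u42 -->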


error when setting queryResultWindowSize to zero

2009-05-12 Thread Marc Sturlese

I have seen that if I set the value of queryResultWindowSize to 0 in
solrconfig.xml, Solr will return a divide-by-zero error.
Checking the source I have seen it can be fixed in SolrIndexSearcher. At the
end of the function getDocListC it's coded:

if (maxDocRequested < queryResultWindowSize) {
  supersetMaxDoc=queryResultWindowSize;
} else {
  supersetMaxDoc = ((maxDocRequested -1)/queryResultWindowSize +
1)*queryResultWindowSize;
  if (supersetMaxDoc < 0) supersetMaxDoc=maxDocRequested;
}

I have sorted it out by doing (just adding parentheses):

if (maxDocRequested < queryResultWindowSize) {
  supersetMaxDoc=queryResultWindowSize;
} else {
  supersetMaxDoc = ((maxDocRequested -1)/(queryResultWindowSize +
1))*queryResultWindowSize;
  if (supersetMaxDoc < 0) supersetMaxDoc=maxDocRequested;
}

I have seen this is happening in a recent trunk. Is my fix correct?




Re: Replication master+slave

2009-05-12 Thread Shalin Shekhar Mangar
On Tue, May 12, 2009 at 10:42 PM, Bryan Talbot wrote:

> For replication in 1.4, the wiki at
> http://wiki.apache.org/solr/SolrReplication says that a node can be both
> the master and a slave:
>
> A node can act as both master and slave. In that case both the master and
> slave configuration lists need to be present inside the ReplicationHandler
> requestHandler in the solrconfig.xml.
>
> What does this mean?  Does the core then poll itself for updates?


No. This type of configuration is meant for "repeaters". Suppose there are
slaves in multiple data-centers (say data center A and B). There is always a
single master (say in A). One of the slaves in B is used as a master for the
other slaves in B. Therefore, this one slave in B is both a master as well
as the slave.


>
> I'd like to have a single set of configuration files that are shared by
> masters and slaves and avoid duplicating configuration details in multiple
> files (one for master and one for slave) to ease management and failover.
>  Is this possible?
>

You wouldn't want the master to be a slave. So I guess you'd need to have a
separate file. Also, it needs to be a separate file so that the slave does
not become a master when the solrconfig.xml is replicated.


>
> When I attempt to set up a multi-server master-slave configuration and
> include both master and slave replication configuration options, I run into
> some problems.  I'm running a nightly build from May 7.
>

Not sure what happened. Is that the url for this solr (meaning same solr url
is master and slave of itself)? If yes, that is not a valid configuration.

-- 
Regards,
Shalin Shekhar Mangar.
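
A sketch of what such a "repeater" node's solrconfig.xml might contain -- both
lists side by side in one handler (host names hypothetical):

    <requestHandler name="/replication" class="solr.ReplicationHandler">
      <lst name="master">
        <str name="replicateAfter">commit</str>
      </lst>
      <lst name="slave">
        <str name="masterUrl">http://real-master:8983/solr/replication</str>
        <str name="pollInterval">00:00:60</str>
      </lst>
    </requestHandler>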


Re: Newbie question

2009-05-12 Thread Shalin Shekhar Mangar
On Tue, May 12, 2009 at 9:48 PM, Wayne Pope wrote:

>
> I have this request:
>
>
> http://localhost:8983/solr/select?start=0&rows=20&qt=dismax&q=copy&hl=true&hl.snippets=4&hl.fragsize=50&facet=true&facet.mincount=1&facet.limit=8&facet.field=type&fq=company-id%3A1&wt=javabin&version=2.2
>
> (I've been using this to see it rendered in the browser:
>
> http://localhost:8983/solr/select?indent=on&version=2.2&q=copy&start=0&rows=10&fl=*%2Cscore&qt=standard&wt=standard&explainOther=&hl=on&hl.fl=features&hl=true&hl.fragsize=50
> )
>
>
> that I've been trying out. I get a good response - however, the hl.fragsize
> is ignored and the hl.fragsize in the solrconfig.xml is ignored. Instead I
> get back the whole document (10,000 chars!) in the doc txt field. And
> bizarrely, the response header is this:
>

hl.fragsize is relevant only for the snippets created by the highlighter.
The returned fields will always have the complete data for a document. Does
that answer your question?

-- 
Regards,
Shalin Shekhar Mangar.
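
In other words, the stored field comes back whole inside the doc, while the
hl.fragsize-limited fragments appear in the separate highlighting section.
A sketch of the response shape (document id and field name assumed):

    <result name="response" numFound="1" start="0">
      <doc><str name="txt">...the entire stored value, however long...</str></doc>
    </result>
    <lst name="highlighting">
      <lst name="doc1">
        <arr name="txt"><str>...50-char snippet with <em>copy</em>...</str></arr>
      </lst>
    </lst>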


Re: Restarting tomcat deletes all Solr indexes

2009-05-12 Thread Shalin Shekhar Mangar
You can fix the path of the index in your solrconfig.xml

On Tue, May 12, 2009 at 4:48 PM, KK  wrote:

> One more information I would like to add.
>  The entry in solr stats page says this:
>
> readerDir : org.apache.lucene.store.FSDirectory@/home/kk/solr/data/index
>
> when I ran from /home/kk
> and this:
>
> readerDir : org.apache.lucene.store.FSDirectory@/home/kk/junk/solr/data/index
>
> after running from /home/kk/junk
>
> That assures me of the problem, but what is the solution?
>
> Thanks,
> KK.
>
> On Tue, May 12, 2009 at 4:41 PM, KK  wrote:
>
> > Thanks for your response @aklochkov.
> >  But I again noticed that something is wrong in my solr/tomcat config[I
> > spent a lot of time making solr run], b'coz in the solr admin page [
> > http://localhost:8080/solr/admin/] what I see is that the $CWD is the
> > location where from I restarted tomcat and seems this $cwd gets picked
> and
> > used for index data[Is it the default behavior? or something wrong from
> my
> > side?, or may be I'm asking some stupid question ].
> >  Once I was in /etc and from there I restarted the tomcat and when I
> tried
> > to open the solr admin page I found an error saying that can not create
> > index directory some permission issue I think [it gave a directory str
> like
> > /etc/solr/index ... ]. I'm pretty sure something is wrong in
> configuration.
> > One more thing assures me about this is the fact that I found many solr
> > index directories here and there[ these are I think the locations where I
> > was when I restarted tomcat at that time ]. Earlier I was using the
> > java_opts to set the solr home like this
> >
> >  export JAVA_OPTS="$JAVA_OPTS -D/usr/local/solr"#in .bashrc
> >
> > but I commented that and instead added the jndi entry in
> > /usr/local/tomcat/webapps/solr/WEB-INF/web.xml as this
> >
> > <env-entry>
> >    <env-entry-name>solr/home</env-entry-name>
> >    <env-entry-value>/usr/local/solr</env-entry-value>
> >    <env-entry-type>java.lang.String</env-entry-type>
> > </env-entry>
> >
> > Even the entry SolrHome in solr admin page says that SolrHome is
> > "/usr/local/solr" but the index gets created in $CWD. Is it the case that I
> > created entries for SolrHome in multiple places? which is obviously wrong.
> > Can someone point me what is the issue. Thank you very much.
> >
> > --KK
> >
> >
> >
> > On Tue, May 12, 2009 at 2:39 PM, Andrey Klochkov <
> > akloch...@griddynamics.com> wrote:
> >
> >> Hi,
> >>
> >> I know that when starting Solr checks index directory existence, and
> >> creates
> >> new fresh index if it doesn't exist. Does it help? If no, the next step
> >> I'd
> >> do in your case is patching SolrCore.initIndex method - insert some
> >> logging,
> >> or run EmbeddedSolrServer with debugger etc.
> >>
> >> On Mon, May 11, 2009 at 1:25 PM, KK  wrote:
> >>
> >> > Hi,
> >> > I'm facing a silly problem. Every time I restart tomcat all the
> indexes
> >> are
> >> > lost. I used all the default configurations. I'm pretty sure there
> must
> >> be
> >> > some basic changes to fix this. I'd highly appreciate if someone could
> >> > direct me fixing this.
> >> >
> >> > Thanks,
> >> > KK.
> >> >
> >>
> >>
> >> --
> >> Andrew Klochkov
> >>
> >
> >
>



-- 
Regards,
Shalin Shekhar Mangar.
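
A sketch of the setting Shalin refers to: pinning the index to an absolute
path in solrconfig.xml so it no longer depends on the directory Tomcat was
started from (the path shown assumes the poster's solr home):

    <dataDir>/usr/local/solr/data</dataDir>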


Re: AW: Geographical search based on latitude and longitude

2009-05-12 Thread Grant Ingersoll
Yes, that is part of it, but there is more to it.  See Yonik's comment
further down in the issue about what is needed.



On May 12, 2009, at 7:36 AM, Norman Leutner wrote:


So are you using a bounding box to find results within a given range (km),
like mentioned here:
http://www.nsshutdown.com/projects/lucene/whitepaper/locallucene_v2.html ?



Best regards

Norman Leutner
all2e GmbH

-Ursprüngliche Nachricht-
Von: Grant Ingersoll [mailto:gsing...@apache.org]
Gesendet: Dienstag, 12. Mai 2009 13:18
An: solr-user@lucene.apache.org
Betreff: Re: Geographical search based on latitude and longitude

See https://issues.apache.org/jira/browse/SOLR-773.  In other words,
we're working on it and would love some help!

-Grant

On May 12, 2009, at 7:12 AM, Norman Leutner wrote:


Hi together,

I'm new to Solr and want to port a geographical range search from
MySQL to Solr.

Currently I'm using some mathematical functions (based on GRS80
modell) directly within MySQL to calculate
the actual distance from the locations within the database to a
current location (lat and long are known):

$query="SELECT street, zip, city, state, country, ".
$radius."*ACOS(cos(RADIANS(latitude))*cos(".
$theta.")*(sin(RADIANS(longitude))*sin(".$phi.")
+cos(RADIANS(longitude))*cos(".$phi."))+sin(RADIANS(latitude))*sin(".
$theta.")) AS Distance FROM ezgis_position WHERE ".
$radius."*ACOS(cos(RADIANS(latitude))*cos(".
$theta.")*(sin(RADIANS(longitude))*sin(".$phi.")
+cos(RADIANS(longitude))*cos(".$phi."))+sin(RADIANS(latitude))*sin(".
$theta.")) <= ".$range." ORDER BY Distance";

This works pretty fine and fast. Since we want to include this
within our Solr search result, I would like to have an attribute like
"actual_distance" within the result. Is there a way to use those
functions (radians, sin, acos, ...) directly within Solr?

Thanks in advance for any feedback
Norman Leutner


--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)
using Solr/Lucene:
http://www.lucidimagination.com/search



--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:

http://www.lucidimagination.com/search
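
For reference, the SQL above computes the spherical law-of-cosines distance;
a minimal Java sketch of the same math (not a Solr API, and the mean Earth
radius is an assumption):

    // distance in km between two lat/lon points, spherical law of cosines
    static double distanceKm(double lat1, double lon1, double lat2, double lon2) {
        final double R = 6371.0; // mean Earth radius in km (assumed)
        return R * Math.acos(
            Math.sin(Math.toRadians(lat1)) * Math.sin(Math.toRadians(lat2))
          + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
            * Math.cos(Math.toRadians(lon2 - lon1)));
    }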



Replication master+slave

2009-05-12 Thread Bryan Talbot
For replication in 1.4, the wiki at http://wiki.apache.org/solr/SolrReplication 
 says that a node can be both the master and a slave:


A node can act as both master and slave. In that case both the master  
and slave configuration lists need to be present inside the  
ReplicationHandler requestHandler in the solrconfig.xml.


What does this mean?  Does the core then poll itself for updates?

I'd like to have a single set of configuration files that are shared  
by masters and slaves and avoid duplicating configuration details in  
multiple files (one for master and one for slave) to ease management  
and failover.  Is this possible?


When I attempt to set up a multi-server master-slave configuration and
include both master and slave replication configuration options, I run
into some problems.  I'm running a nightly build from May 7.


<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
  </lst>
  <lst name="slave">
    <str name="masterUrl">http://master_core01:8983/solr/core01/replication</str>
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>


When the replication admin page
(http://master_core01:8983/solr/core01/admin/replication/index.jsp) is
visited, the severe error shown below appears in the solr log.  The server
is otherwise idle, so there is no reason all threads should be busy unless
the replication code is getting itself into a loop.


What's the right way to do this?



May 11, 2009 8:01:22 PM org.apache.tomcat.util.threads.ThreadPool logFull
SEVERE: All threads (150) are currently busy, waiting. Increase maxThreads
(150) or check the servlet status
May 11, 2009 8:01:41 PM org.apache.solr.handler.ReplicationHandler getReplicationDetails
WARNING: Exception while invoking a 'details' method on master
java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:129)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
        at org.apache.commons.httpclient.HttpParser.readRawLine(HttpParser.java:78)
        at org.apache.commons.httpclient.HttpParser.readLine(HttpParser.java:106)
        at org.apache.commons.httpclient.HttpConnection.readLine(HttpConnection.java:1116)
        at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionAdapter.readLine(MultiThreadedHttpConnectionManager.java:1413)
        at org.apache.commons.httpclient.HttpMethodBase.readStatusLine(HttpMethodBase.java:1973)
        at org.apache.commons.httpclient.HttpMethodBase.readResponse(HttpMethodBase.java:1735)
        at org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:1098)
        at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:398)
        at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
        at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
        at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
        at org.apache.solr.handler.SnapPuller.getNamedListResponse(SnapPuller.java:183)
        at org.apache.solr.handler.SnapPuller.getCommandResponse(SnapPuller.java:178)
        at org.apache.solr.handler.ReplicationHandler.getReplicationDetails(ReplicationHandler.java:555)
        at org.apache.solr.handler.ReplicationHandler.handleRequestBody(ReplicationHandler.java:147)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1330)
        at org.apache.jsp.admin.replication.index_jsp.executeCommand(index_jsp.java:34)
        at org.apache.jsp.admin.replication.index_jsp._jspService(index_jsp.java:208)
        at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:98)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:729)
        at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:331)
        at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:329)
        at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:265)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:729)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:269)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
        at org.apache.catalina.core.ApplicationDispatcher.invoke(ApplicationDispatcher.java:679)
        at org.apache.catalina.core.ApplicationDispatcher.processRequest(ApplicationDispatcher.java:461)
        at org.apache.catalina.core.ApplicationDispatcher.doForward(ApplicationDispatcher.java:399)
        at org.apache.catalina.core.Appl

Re: Solr Logging issue

2009-05-12 Thread Jay Hill
Usually that means there is another log4j.properties or log4j.xml file in
your classpath that is being found before the one you are intending to use.
Check your classpath for other versions of these files.

-Jay


On Tue, May 12, 2009 at 3:38 AM, Sagar Khetkade
wrote:

>
> Hi,
> I have solr implemented in multi-core scenario and also  implemented
> solr-560-slf4j.patch for implementing the logging. But the problem I am
> facing is that the logs are going to the stdout.log file not the log file
> that I have mentioned in the log4j.properties file. Can anybody give me work
> round  to make logs go into the logger mentioned in log4j.properties file.
> Thanks in advance.
>
> Regards,
> Sagar Khetkade
>


Newbie question

2009-05-12 Thread Wayne Pope

Hi,

We've implemented search in our product here at our very small company,
and the developer who integrated Solr has left. I'm picking up the code base
and have run into a problem, which I imagine is simple to solve.

I have this request:

http://localhost:8983/solr/select?start=0&rows=20&qt=dismax&q=copy&hl=true&hl.snippets=4&hl.fragsize=50&facet=true&facet.mincount=1&facet.limit=8&facet.field=type&fq=company-id%3A1&wt=javabin&version=2.2

(I've been using this to see it rendered in the browser:
http://localhost:8983/solr/select?indent=on&version=2.2&q=copy&start=0&rows=10&fl=*%2Cscore&qt=standard&wt=standard&explainOther=&hl=on&hl.fl=features&hl=true&hl.fragsize=50
)


that I've been trying out. I get a good response - however, the hl.fragsize
is ignored and the hl.fragsize in the solrconfig.xml is ignored. Instead I
get back the whole document (10,000 chars!) in the doc txt field. And
bizarrely, the response header is this:


[response-header XML mangled by the archive; it echoed back the request
params: status=0, QTime=0, hl.fragsize=50, hl.fl=features, qt=standard,
hl=on and true, indent=on, version=2.2, rows=10, fl=*,score, start=0,
q=copy, wt=standard]

So it seems that the hl.fragsize was taken into account.

I'm sure I'm being dumb but I don't know how to solve this. Any ideas?
many thanks



Re: Facet counts for common terms of the searched field

2009-05-12 Thread Matt Weber
I mean you can sort the facet results by frequency, which happens to  
be the default behavior.


Here is an example field for your schema:

<field name="textfieldfacet" type="string" indexed="true" stored="true" multiValued="true" />


Here is an example query:

http://localhost:8983/solr/select?q=textfield:copper&facet=true&facet.field=textfieldfacet&facet.limit=5

This will give you the top 5 words in the textfieldfacet.

Thanks,

Matt Weber
eSr Technologies
http://www.esr-technologies.com




On May 12, 2009, at 7:57 AM, sachin78 wrote:



Thanks Matt for your reply.

What do you mean by frequency (the default)?

Can you please provide an example of what the schema and query would look like?

--Sachin


Matt Weber-2 wrote:


You may have to take care of this at index time.  You can create a  
new

multivalued field that has minimal processing.  Then at index time,
index the full contents of textfield as normal, but then also split  
it

on whitespace and index each word in the new field you just created.
Now you will be able to facet on this new field and sort the facet by
frequency (the default) to get the most popular words.

Thanks,

Matt Weber
eSr Technologies
http://www.esr-technologies.com




On May 12, 2009, at 7:33 AM, sachin78 wrote:



Does anybody have answer to this post.I have a similar requirement.

Suppose I have free text field say
I index the field.If I search for textfield:copper.I have to get  
facet

counts for the most common words found in a textfield.
ie.

example:search for textfield:glass
should return facet counts for common words found textfield.
semiconductor(10),iron(20), silicon (25) material (8) thin(25) and
so on.
Can this be done using tagging or MLT.

Thanks,
Sachin


Raju444us wrote:


I have a requirement. If I search for text field let's say
"metal:glass"
what i want is to get the facet counts for all the terms related to
"glass" in my search results.

window(100)  since a window can be glass.
plastic(10)  plastic is a material just like glass
Iron(10)
Paper(15)

Can I use MLT to get this functionality.Please let me know how  
can I

achieve this.If possible an example query.

Thanks,
Raju















Re: How to deal with "Mark invalid"?

2009-05-12 Thread Yonik Seeley
I just committed a minor patch suggested by Jim Murphy in SOLR-42 to
slightly lower the safe read-ahead limit to avoid reading beyond a
mark.  Could you try out trunk (or wait until the next nightly build)?

-Yonik
http://www.lucidimagination.com

On Tue, May 12, 2009 at 10:57 AM, Nikolai Derzhak  wrote:
> OK. I've applied a dirty hack as a temporary solution:
>
> in src/java/org/apache/solr/analysis/HTMLStripReader.java of 1.4-dev, I
> enclosed in.reset() in a try block.
>
> ( * @version $Id: HTMLStripReader.java 646799 2008-04-10 13:36:23Z yonik $)
> "
>  private void restoreState() throws IOException {
>    try {
>      in.reset();
>    } catch (Exception e) {
>    }
>    pushed.setLength(0);
>  }
>
> "
>
> But how can this problem be resolved in a more civilized way?
>
> On Tue, May 12, 2009 at 12:20 PM, Nikolai Derzhak wrote:
>
>> Good day, people.
>>
>> We use solr to search in mailboxes (dovecot).
>> But with some "bad" messages solr 1.4-dev generate error:
>> "
>> SEVERE: java.io.IOException: Mark invalid
>> at java.io.BufferedReader.reset(BufferedReader.java:485)
>> at
>> org.apache.solr.analysis.HTMLStripReader.restoreState(HTMLStripReader.java:171
>>
>> .
>> "
>> It's issue known as SOLR-42.
>>
>> How can I log the field stored in the index (I need the message uid)?
>>
>> How can I ignore such an error and/or message?
>>
>> Thanks
>


Re: How to deal with "Mark invalid"?

2009-05-12 Thread Nikolai Derzhak
OK. I've applied a dirty hack as a temporary solution:

in src/java/org/apache/solr/analysis/HTMLStripReader.java of 1.4-dev, I
enclosed in.reset() in a try block.

( * @version $Id: HTMLStripReader.java 646799 2008-04-10 13:36:23Z yonik $)
"
  private void restoreState() throws IOException {
try {
  in.reset();
} catch (Exception e) {
}
pushed.setLength(0);
  }

"

But how can this problem be resolved in a more civilized way?

On Tue, May 12, 2009 at 12:20 PM, Nikolai Derzhak wrote:

> Good day, people.
>
> We use solr to search in mailboxes (dovecot).
> But with some "bad" messages solr 1.4-dev generate error:
> "
> SEVERE: java.io.IOException: Mark invalid
> at java.io.BufferedReader.reset(BufferedReader.java:485)
> at
> org.apache.solr.analysis.HTMLStripReader.restoreState(HTMLStripReader.java:171
>
> .
> "
> It's issue known as SOLR-42.
>
> How can I log the field stored in the index (I need the message uid)?
>
> How can I ignore such an error and/or message?
>
> Thanks


Re: Facet counts for common terms of the searched field

2009-05-12 Thread sachin78

Thanks Matt for your reply.

What do you mean by frequency (the default)?

Can you please provide an example of what the schema and query would look like?

--Sachin


Matt Weber-2 wrote:
> 
> You may have to take care of this at index time.  You can create a new  
> multivalued field that has minimal processing.  Then at index time,  
> index the full contents of textfield as normal, but then also split it  
> on whitespace and index each word in the new field you just created.   
> Now you will be able to facet on this new field and sort the facet by  
> frequency (the default) to get the most popular words.
> 
> Thanks,
> 
> Matt Weber
> eSr Technologies
> http://www.esr-technologies.com
> 
> 
> 
> 
> On May 12, 2009, at 7:33 AM, sachin78 wrote:
> 
>>
>> Does anybody have an answer to this post? I have a similar requirement.
>>
>> Suppose I have a free-text field, say "textfield", and I index the field.
>> If I search for textfield:copper, I want to get facet counts for the most
>> common words found in textfield, i.e.:
>>
>> example: a search for textfield:glass
>> should return facet counts for common words found in textfield:
>> semiconductor (10), iron (20), silicon (25), material (8), thin (25), and
>> so on.
>> Can this be done using tagging or MLT?
>>
>> Thanks,
>> Sachin
>>
>>
>> Raju444us wrote:
>>>
>>> I have a requirement. If I search for text field let's say  
>>> "metal:glass"
>>> what i want is to get the facet counts for all the terms related to
>>> "glass" in my search results.
>>>
>>> window(100)  since a window can be glass.
>>> plastic(10)  plastic is a material just like glass
>>> Iron(10)
>>> Paper(15)
>>>
>>> Can I use MLT to get this functionality.Please let me know how can I
>>> achieve this.If possible an example query.
>>>
>>> Thanks,
>>> Raju
>>>
>>
>>
> 
> 
> 




Re: Facet counts for common terms of the searched field

2009-05-12 Thread Matt Weber
You may have to take care of this at index time.  You can create a new  
multivalued field that has minimal processing.  Then at index time,  
index the full contents of textfield as normal, but then also split it  
on whitespace and index each word in the new field you just created.   
Now you will be able to facet on this new field and sort the facet by  
frequency (the default) to get the most popular words.


Thanks,

Matt Weber
eSr Technologies
http://www.esr-technologies.com




On May 12, 2009, at 7:33 AM, sachin78 wrote:



Does anybody have an answer to this post? I have a similar requirement.

Suppose I have a free-text field, say "textfield", and I index the field.
If I search for textfield:copper, I want to get facet counts for the most
common words found in textfield, i.e.:

example: a search for textfield:glass
should return facet counts for common words found in textfield:
semiconductor (10), iron (20), silicon (25), material (8), thin (25), and so on.

Can this be done using tagging or MLT?

Thanks,
Sachin


Raju444us wrote:


I have a requirement. If I search for text field let's say  
"metal:glass"

what i want is to get the facet counts for all the terms related to
"glass" in my search results.

window(100)  since a window can be glass.
plastic(10)  plastic is a material just like glass
Iron(10)
Paper(15)

Can I use MLT to get this functionality.Please let me know how can I
achieve this.If possible an example query.

Thanks,
Raju








Re: Facet counts for common terms of the searched field

2009-05-12 Thread sachin78

Does anybody have an answer to this post? I have a similar requirement.

Suppose I have a free-text field, say "textfield", and I index the field.
If I search for textfield:copper, I want to get facet counts for the most
common words found in textfield, i.e.:

example: a search for textfield:glass
should return facet counts for common words found in textfield:
semiconductor (10), iron (20), silicon (25), material (8), thin (25), and so on.
Can this be done using tagging or MLT?

Thanks,
Sachin


Raju444us wrote:
> 
> I have a requirement. If I search for text field let's say "metal:glass"
> what i want is to get the facet counts for all the terms related to
> "glass" in my search results.
> 
> window(100)  since a window can be glass.
> plastic(10)  plastic is a material just like glass
> Iron(10)
> Paper(15)
> 
> Can I use MLT to get this functionality.Please let me know how can I
> achieve this.If possible an example query.
> 
> Thanks,
> Raju
> 




Re: fieldType without tokenizer

2009-05-12 Thread Koji Sekiguchi

It must be KeywordTokenizer*Factory* :)

Koji

sunnyfr wrote:

hi

I tried, but I've got an error:
May 12 15:48:51 solr-test jsvc.exec[2583]: May 12, 2009 3:48:51 PM
org.apache.solr.common.SolrException log SEVERE:
org.apache.solr.common.SolrException: Error loading class
'solr.KeywordTokenizer' ^Iat
org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:310)
^Iat
org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:325)
^Iat
org.apache.solr.util.plugin.AbstractPluginLoader.create(AbstractPluginLoader.java:84)
^Iat
org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:141)
^Iat org.apache.solr.schema.IndexSchema.readAnalyzer(IndexSchema.java:804)
^Iat org.apache.solr.schema.IndexSchema.access$100(IndexSchema.java:58) ^Iat
org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:425) ^Iat
org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:443) ^Iat
org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:141)
^Iat org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:452)
^Iat org.apache.solr.schema.In

with:

[fieldType definition stripped by the archive; it referenced
class="solr.KeywordTokenizer"]

Shalin Shekhar Mangar wrote:
  

On Mon, May 4, 2009 at 9:28 PM, sunnyfr  wrote:



Hi,

I would like to create a field without tokenizer but I've an error,

  

You can use KeywordTokenizer which does not do any tokenization.

--
Regards,
Shalin Shekhar Mangar.





  




Re: fieldType without tokenizer

2009-05-12 Thread Erik Hatcher

Use KeywordTokenizerFactory.  Pasted from Solr's example schema.xml:

[fieldType definition stripped by the archive]

   Erik
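
Since the paste did not survive the archive, a minimal sketch of such a field
type (the name is illustrative); KeywordTokenizerFactory emits the entire
field value as a single token:

    <fieldType name="keyword_text" class="solr.TextField">
      <analyzer>
        <tokenizer class="solr.KeywordTokenizerFactory"/>
      </analyzer>
    </fieldType>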


On May 12, 2009, at 9:49 AM, sunnyfr wrote:



hi

I tried, but I've got an error:
May 12 15:48:51 solr-test jsvc.exec[2583]: May 12, 2009 3:48:51 PM
org.apache.solr.common.SolrException log SEVERE:
org.apache.solr.common.SolrException: Error loading class
'solr.KeywordTokenizer' ^Iat
org
.apache
.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:310)
^Iat
org
.apache
.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:325)
^Iat
org
.apache
.solr
.util.plugin.AbstractPluginLoader.create(AbstractPluginLoader.java:84)
^Iat
org
.apache
.solr
.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:141)
^Iat  
org.apache.solr.schema.IndexSchema.readAnalyzer(IndexSchema.java:804)
^Iat org.apache.solr.schema.IndexSchema.access$100(IndexSchema.java: 
58) ^Iat

org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:425) ^Iat
org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:443) ^Iat
org 
.apache 
.solr 
.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:141)
^Iat org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java: 
452)

^Iat org.apache.solr.schema.In

with:

[fieldType definition stripped by the archive; it referenced
class="solr.KeywordTokenizer"]




Shalin Shekhar Mangar wrote:


On Mon, May 4, 2009 at 9:28 PM, sunnyfr  wrote:



Hi,

I would like to create a field without tokenizer but I've an error,



You can use KeywordTokenizer which does not do any tokenization.

--
Regards,
Shalin Shekhar Mangar.








Re: fieldType without tokenizer

2009-05-12 Thread sunnyfr

hi

I tried, but I've got an error:
May 12 15:48:51 solr-test jsvc.exec[2583]: May 12, 2009 3:48:51 PM
org.apache.solr.common.SolrException log SEVERE:
org.apache.solr.common.SolrException: Error loading class
'solr.KeywordTokenizer' ^Iat
org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:310)
^Iat
org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:325)
^Iat
org.apache.solr.util.plugin.AbstractPluginLoader.create(AbstractPluginLoader.java:84)
^Iat
org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:141)
^Iat org.apache.solr.schema.IndexSchema.readAnalyzer(IndexSchema.java:804)
^Iat org.apache.solr.schema.IndexSchema.access$100(IndexSchema.java:58) ^Iat
org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:425) ^Iat
org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:443) ^Iat
org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:141)
^Iat org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:452)
^Iat org.apache.solr.schema.In

with:

[fieldType definition stripped by the archive; it referenced
class="solr.KeywordTokenizer"]


Shalin Shekhar Mangar wrote:
> 
> On Mon, May 4, 2009 at 9:28 PM, sunnyfr  wrote:
> 
>>
>> Hi,
>>
>> I would like to create a field without tokenizer but I've an error,
>>
> 
> You can use KeywordTokenizer which does not do any tokenization.
> 
> -- 
> Regards,
> Shalin Shekhar Mangar.
> 
> 




AW: Geographical search based on latitude and longitude

2009-05-12 Thread Norman Leutner
So are you using a bounding box to find results within a given range (km),
like mentioned here:
http://www.nsshutdown.com/projects/lucene/whitepaper/locallucene_v2.html ?


Best regards

Norman Leutner
all2e GmbH

-Ursprüngliche Nachricht-
Von: Grant Ingersoll [mailto:gsing...@apache.org] 
Gesendet: Dienstag, 12. Mai 2009 13:18
An: solr-user@lucene.apache.org
Betreff: Re: Geographical search based on latitude and longitude

See https://issues.apache.org/jira/browse/SOLR-773.  In other words,  
we're working on it and would love some help!

-Grant

On May 12, 2009, at 7:12 AM, Norman Leutner wrote:

> Hi together,
>
> I'm new to Solr and want to port a geographical range search from  
> MySQL to Solr.
>
> Currently I'm using some mathematical functions (based on GRS80  
> modell) directly within MySQL to calculate
> the actual distance from the locations within the database to a  
> current location (lat and long are known):
>
> $query = "SELECT street, zip, city, state, country,
>   ".$radius."*ACOS(cos(RADIANS(latitude))*cos(".$theta.")
>     *(sin(RADIANS(longitude))*sin(".$phi.")
>     +cos(RADIANS(longitude))*cos(".$phi."))
>     +sin(RADIANS(latitude))*sin(".$theta.")) AS Distance
>   FROM ezgis_position
>   WHERE ".$radius."*ACOS(cos(RADIANS(latitude))*cos(".$theta.")
>     *(sin(RADIANS(longitude))*sin(".$phi.")
>     +cos(RADIANS(longitude))*cos(".$phi."))
>     +sin(RADIANS(latitude))*sin(".$theta."))
>   <= ".$range." ORDER BY Distance";
>
> This works pretty fine and fast. Due to we want to include this  
> within our Solr search result I would like to have a attribute like  
> "actual_distance" within the result. Is there a way to use those  
> functions like (radians, sin, acos,...) directly within Solr?
>
> Thanks in advance for any feedback
> Norman Leutner

--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:
http://www.lucidimagination.com/search



Re: Restarting tomcat deletes all Solr indexes

2009-05-12 Thread KK
One more piece of information I would like to add.
 The entry on the Solr stats page says this:

readerDir : org.apache.lucene.store.FSDirectory@/home/kk/solr/data/index

when I ran it from /home/kk, and this:

readerDir : org.apache.lucene.store.FSDirectory@/home/kk/junk/solr/data/index

after running it from /home/kk/junk.

That confirms the problem for me, but what is the solution?

Thanks,
KK.

On Tue, May 12, 2009 at 4:41 PM, KK  wrote:

> Thanks for your response @aklochkov.
>  But I again noticed that something is wrong in my solr/tomcat config[I
> spent a lot of time making solr run], b'coz in the solr admin page [
> http://localhost:8080/solr/admin/] what I see is that the $CWD is the
> location where from I restarted tomcat and seems this $cwd gets picked and
> used for index data[Is it the default behavior? or something wrong from my
> side?, or may be I'm asking some stupid question ].
>  Once I was in /etc and from there I restarted the tomcat and when I tried
> to open the solr admin page I found an error saying that can not create
> index directory some permission issue I think [it gave a directory str like
> /etc/solr/index ... ]. I'm pretty sure something is wrong in configuration.
> One more thing assures me about this is the fact that I found many solr
> index directories here and there[ these are I think the locations where I
> was when I restarted tomcat at that time ]. Earlier I was using the
> java_opts to set the solr home like this
>
>  export JAVA_OPTS="$JAVA_OPTS -D/usr/local/solr"#in .bashrc
>
> but I commented that and instead added the jndi entry in
> /usr/local/tomcat/webapps/solr/WEB-INF/web.xml as this
>
> <env-entry>
>    <env-entry-name>solr/home</env-entry-name>
>    <env-entry-value>/usr/local/solr</env-entry-value>
>    <env-entry-type>java.lang.String</env-entry-type>
> </env-entry>
>
> Even the SolrHome entry on the Solr admin page says that SolrHome is
> "/usr/local/solr", but the index gets created in $CWD. Is it the case that I
> created entries for SolrHome in multiple places? Which is obviously wrong.
> Can someone point me to what the issue is. Thank you very much.
>
> --KK
>
>
>
> On Tue, May 12, 2009 at 2:39 PM, Andrey Klochkov <
> akloch...@griddynamics.com> wrote:
>
>> Hi,
>>
>> I know that when starting Solr checks index directory existence, and
>> creates
>> new fresh index if it doesn't exist. Does it help? If no, the next step
>> I'd
>> do in your case is patching SolrCore.initIndex method - insert some
>> logging,
>> or run EmbeddedSolrServer with debugger etc.
>>
>> On Mon, May 11, 2009 at 1:25 PM, KK  wrote:
>>
>> > Hi,
>> > I'm facing a silly problem. Every time I restart tomcat all the indexes
>> are
>> > lost. I used all the default configurations. I'm pretty sure there must
>> be
>> > some basic changes to fix this. I'd highly appreciate if someone could
>> > direct me fixing this.
>> >
>> > Thanks,
>> > KK.
>> >
>>
>>
>> --
>> Andrew Klochkov
>>
>
>


Re: Geographical search based on latitude and longitude

2009-05-12 Thread Grant Ingersoll
See https://issues.apache.org/jira/browse/SOLR-773.  In other words,  
we're working on it and would love some help!


-Grant

On May 12, 2009, at 7:12 AM, Norman Leutner wrote:


Hi together,

I'm new to Solr and want to port a geographical range search from  
MySQL to Solr.


Currently I'm using some mathematical functions (based on GRS80  
modell) directly within MySQL to calculate
the actual distance from the locations within the database to a  
current location (lat and long are known):


$query = "SELECT street, zip, city, state, country,
  ".$radius."*ACOS(cos(RADIANS(latitude))*cos(".$theta.")
    *(sin(RADIANS(longitude))*sin(".$phi.")
    +cos(RADIANS(longitude))*cos(".$phi."))
    +sin(RADIANS(latitude))*sin(".$theta.")) AS Distance
  FROM ezgis_position
  WHERE ".$radius."*ACOS(cos(RADIANS(latitude))*cos(".$theta.")
    *(sin(RADIANS(longitude))*sin(".$phi.")
    +cos(RADIANS(longitude))*cos(".$phi."))
    +sin(RADIANS(latitude))*sin(".$theta."))
  <= ".$range." ORDER BY Distance";


This works pretty fine and fast. Due to we want to include this  
within our Solr search result I would like to have a attribute like  
"actual_distance" within the result. Is there a way to use those  
functions like (radians, sin, acos,...) directly within Solr?


Thanks in advance for any feedback
Norman Leutner


--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:

http://www.lucidimagination.com/search



Geographical search based on latitude and longitude

2009-05-12 Thread Norman Leutner
Hi together,

I'm new to Solr and want to port a geographical range search from MySQL to Solr.

Currently I'm using some mathematical functions (based on the GRS80 model)
directly within MySQL to calculate the actual distance from the locations
within the database to a current location (lat and long are known):

$query = "SELECT street, zip, city, state, country,
  ".$radius."*ACOS(cos(RADIANS(latitude))*cos(".$theta.")
    *(sin(RADIANS(longitude))*sin(".$phi.")
    +cos(RADIANS(longitude))*cos(".$phi."))
    +sin(RADIANS(latitude))*sin(".$theta.")) AS Distance
  FROM ezgis_position
  WHERE ".$radius."*ACOS(cos(RADIANS(latitude))*cos(".$theta.")
    *(sin(RADIANS(longitude))*sin(".$phi.")
    +cos(RADIANS(longitude))*cos(".$phi."))
    +sin(RADIANS(latitude))*sin(".$theta."))
  <= ".$range." ORDER BY Distance";

This works pretty fine and fast. Since we want to include this within our
Solr search results, I would like to have an attribute like "actual_distance"
in the result. Is there a way to use functions like these (radians, sin,
acos, ...) directly within Solr?

Thanks in advance for any feedback
Norman Leutner
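
Until SOLR-773 lands there is no built-in distance function, but the math
itself is small enough to host in a custom function query / ValueSource. A
minimal Java sketch of the same spherical-law-of-cosines distance the MySQL
query computes (class and method names are illustrative, not part of any
Solr API):

   public final class GreatCircle {
       private GreatCircle() {}

       /**
        * Great-circle distance between two points given in degrees, on a
        * sphere of the supplied radius (e.g. 6371.0 km as a GRS80-like
        * spherical approximation).
        */
       public static double distance(double lat1, double lon1,
                                     double lat2, double lon2,
                                     double radius) {
           double a = Math.sin(Math.toRadians(lat1)) * Math.sin(Math.toRadians(lat2))
                    + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                      * Math.cos(Math.toRadians(lon1 - lon2));
           // clamp against floating-point drift before taking acos
           return radius * Math.acos(Math.max(-1.0, Math.min(1.0, a)));
       }
   }

Exposing this through a custom ValueSource would then let it be used for
sorting and, with some plumbing, returned as an "actual_distance" style value.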


Re: Restarting tomcat deletes all Solr indexes

2009-05-12 Thread KK
Thanks for your response, @aklochkov.
 But I again noticed that something is wrong in my Solr/Tomcat config [I
spent a lot of time getting Solr to run], because on the Solr admin page [
http://localhost:8080/solr/admin/] what I see is that $CWD is the location
from which I restarted Tomcat, and it seems this $CWD gets picked up and
used for the index data. [Is that the default behavior, or is something
wrong on my side? Or maybe I'm asking a stupid question.]
 Once I was in /etc and restarted Tomcat from there, and when I tried to
open the Solr admin page I found an error saying that it could not create
the index directory, some permission issue I think [it gave a directory
string like /etc/solr/index ...]. I'm pretty sure something is wrong in the
configuration. One more thing that assures me of this is the fact that I
found many Solr index directories here and there [these are, I think, the
locations I was in when I restarted Tomcat at those times]. Earlier I was
using JAVA_OPTS to set the Solr home like this

 export JAVA_OPTS="$JAVA_OPTS -D/usr/local/solr"    # in .bashrc

but I commented that out and instead added the JNDI entry in
/usr/local/tomcat/webapps/solr/WEB-INF/web.xml as this

<env-entry>
   <env-entry-name>solr/home</env-entry-name>
   <env-entry-value>/usr/local/solr</env-entry-value>
   <env-entry-type>java.lang.String</env-entry-type>
</env-entry>

Even the SolrHome entry on the Solr admin page says that SolrHome is
"/usr/local/solr", but the index gets created in $CWD. Is it the case that I
created entries for SolrHome in multiple places, which is obviously wrong?
Can someone point me to what the issue is? Thank you very much.

--KK
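
A likely culprit, for anyone hitting the same thing: the JAVA_OPTS line
above passes -D without a property name, so it sets nothing. The system
property Solr reads for its home directory is solr.solr.home:

 export JAVA_OPTS="$JAVA_OPTS -Dsolr.solr.home=/usr/local/solr"    # in .bashrc

Also note that the stock example solrconfig.xml declares the data directory
with a relative path (./solr/data), which is resolved against the current
working directory at startup; if that is what your solrconfig.xml contains,
pointing <dataDir> at an absolute path such as /usr/local/solr/data should
stop the index from following whichever directory Tomcat was started from.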


On Tue, May 12, 2009 at 2:39 PM, Andrey Klochkov  wrote:

> Hi,
>
> I know that when starting Solr checks index directory existence, and
> creates
> new fresh index if it doesn't exist. Does it help? If no, the next step I'd
> do in your case is patching SolrCore.initIndex method - insert some
> logging,
> or run EmbeddedSolrServer with debugger etc.
>
> On Mon, May 11, 2009 at 1:25 PM, KK  wrote:
>
> > Hi,
> > I'm facing a silly problem. Every time I restart tomcat all the indexes
> are
> > lost. I used all the default configurations. I'm pretty sure there must
> be
> > some basic changes to fix this. I'd highly appreciate if someone could
> > direct me fixing this.
> >
> > Thanks,
> > KK.
> >
>
>
> --
> Andrew Klochkov
>


Solr Loggin issue

2009-05-12 Thread Sagar Khetkade

Hi,
I have Solr implemented in a multi-core scenario and have also applied the
solr-560-slf4j.patch to set up logging. But the problem I am facing is that
the logs are going to the stdout.log file, not the log file that I have
specified in the log4j.properties file. Can anybody give me a workaround to
make the logs go to the appender configured in log4j.properties?
Thanks in advance.
 
Regards,
Sagar Khetkade
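
For reference, a minimal log4j.properties along these lines, found first on
the classpath, should route Solr's output to a file once the slf4j-log4j12
binding is in use (the file path, levels and pattern are illustrative):

log4j.rootLogger=INFO, file
log4j.appender.file=org.apache.log4j.RollingFileAppender
log4j.appender.file.File=/var/log/solr/solr.log
log4j.appender.file.MaxFileSize=10MB
log4j.appender.file.MaxBackupIndex=5
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d %p [%c] %m%n
log4j.logger.org.apache.solr=INFO

If output still ends up in stdout.log, the usual causes are another
log4j.properties/log4j.xml earlier on the classpath, or a competing SLF4J
binding (e.g. slf4j-jdk14) being picked up instead of slf4j-log4j12.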
_
Live Search extreme As India feels the heat of poll season, get all the info 
you need on the MSN News Aggregator
http://news.in.msn.com/National/indiaelections2009/aggregator/default.aspx

Re: QueryElevationComponent : hot update of elevate.xml

2009-05-12 Thread Nicolas Pastorino

Hi,

On May 7, 2009, at 6:03 , Noble Paul നോബിള്‍  
नोब्ळ् wrote:



Going forward, the Java-based replication is going to be the preferred
means of replicating the index. It does not support replicating files in
the dataDir; it only supports replicating index files and conf files
(files in the conf dir). I was unaware of the fact that it was possible to
put the elevate.xml in the dataDir.

Reloading on commit is trivial for a search component. It can register
itself as an event listener for commit and do a reload of elevate.xml.
This can be a configuration parameter, e.g. a flag set to true.


Thanks for these nice tips and recommendations.
I attached a new version of this requestHandler here:
https://issues.apache.org/jira/browse/SOLR-1147

Would this requestHandler be of any general use, and could it be part of
Solr's trunk?


Thanks in advance,
--
Nicolas Pastorino - eZ Labs
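
For readers following along, the file being synchronized here has a small,
fixed format; a representative elevate.xml looks like this (query text and
document ids are illustrative):

<elevate>
  <query text="ipod">
    <doc id="doc1"/>
    <doc id="doc2"/>
  </query>
</elevate>

Each <query> element pins the listed document ids to the top of the results
whenever that exact query text is searched.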






On Wed, May 6, 2009 at 7:08 PM, Nicolas Pastorino  wrote:


On May 6, 2009, at 15:17 , Noble Paul നോബിള്‍  
नोब्ळ् wrote:


Why would you want to write it to the data dir? Why can't it be in the
same place (conf)?

Well, the fact is that the QueryElevationComponent loads the configuration
file (elevate.xml) either from the data dir or from the conf dir, which
means that existing setups using this component may be using either
location. That is the only reason why I judged it necessary to keep
supporting this "flexibility".

But this could be simplified by forcing the elevate.xml file to be in the
conf dir, and having a system (the one you proposed, or the request handler
attached to the issue) to reload the configuration from the conf dir (which
is currently not possible, whereas when elevate.xml is stored in the
dataDir, triggering a commit reloads it). I was just unsure about all the
ins and outs of the Elevation system, and so did not remove this
flexibility.

Thanks for your expert eye on this!


On Wed, May 6, 2009 at 6:43 PM, Nicolas Pastorino   
wrote:


Hello,

On May 6, 2009, at 15:02 , Noble Paul നോബിള്‍  
नोब्ळ् wrote:


The elevate.xml is loaded from the conf dir when the core is reloaded. If
you post the new xml you will have to reload the core.

A simple solution would be to write a RequestHandler which extends
QueryElevationComponent, which can be a listener for commit and call
super.inform() on that event.


You may want to have a look at this issue:
https://issues.apache.org/jira/browse/SOLR-1147
The proposed solution (new request handler, attached to the ticket) solves
the issue in both cases:
* when elevate.xml is in the dataDir
* when elevate.xml is in the conf dir

Basically this new request handler receives, as XML, the new configuration,
writes it to the right place (some logic was copied from the
QueryElevationComponent.inform() code), and then calls the inform() method
on the QueryElevationComponent for the current core, as you suggested
above, to reload the Elevate configuration.
--
Nicolas


On Fri, Apr 10, 2009 at 5:18 PM, Nicolas Pastorino   
wrote:


Hello!

Browsing the mailing-list's archives did not help me find the answer,
hence the question asked directly here.

Some context first:
Integrating Solr with a CMS (eZ Publish), we chose to support Elevation.
The idea is to be able to 'elevate' any object from the CMS. This can be
achieved through eZ Publish's back office, with a dedicated Elevate
administration GUI; the configuration is stored in the CMS temporarily,
and then synchronized frequently and/or on demand onto Solr. This
synchronisation is currently done as follows:
1. Generate the elevate.xml based on the stored configuration
2. Replace elevate.xml in Solr's dataDir
3. Commit. It appears that when having elevate.xml in Solr's dataDir, and
solely in this case, committing triggers a reload of elevate.xml. This
does not happen when elevate.xml is stored in Solr's conf dir.

This method has one main issue though: eZ Publish needs to have access to
the same filesystem as the one on which Solr's dataDir is stored. This is
not always the case when the CMS is clustered, for instance --> show
stopper :(

Hence the following idea / RFC:
How about extending the Query Elevation system with the possibility to
push an updated elevate.xml file/XML through HTTP?
This would update the file where it is actually located, and trigger a
reload of the configuration.
Not being very knowledgeable about Solr's API (yet!), I cannot figure out
whether this would be possible, how this would be achievable (which type
of plugin, for instance), or even be valid?

Thanks a lot in advance for your thoughts,
--
Nicolas








--
-
Noble Paul | Principal Engineer| AOL | http://aol.com


--
Nicolas Pastorino
Consultant - Trainer - System Developer
Phone :  +33 (0)4.78.37.01.34
eZ Systems ( Western Europe )  |  http://ez.no









--
-
Noble Pa

Custom Servlet Filter, Where to put filter-mappings

2009-05-12 Thread Jacob Singh
Hi folks,

I just wrote a Servlet Filter to handle authentication for our
service.  Here's what I did:

1. Created a dir in contrib
2. Put my project in there, I took the dataimporthandler build.xml as
an example and modified it to suit my needs.  Worked great!
3. ant dist now builds my jar and includes it

I now need to modify web.xml to add my filter-mapping, init params,
etc.  How can I do this cleanly?  Or do I need to manually open up the
archive and edit it and then re-war it?

In common-build I don't see a target for dist-war, so I don't see how it
is possible...

Thanks!
Jacob

-- 

+1 510 277-0891 (o)
+91  33 7458 (m)

web: http://pajamadesign.com

Skype: pajamadesign
Yahoo: jacobsingh
AIM: jacobsingh
gTalk: jacobsi...@gmail.com
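
For reference, the entries Jacob needs to land in the packaged web.xml are
the standard servlet filter declarations; a sketch, with the filter name,
class and init param as placeholders for your own:

<filter>
  <filter-name>authFilter</filter-name>
  <filter-class>com.example.solr.AuthFilter</filter-class>
  <init-param>
    <param-name>sharedSecret</param-name>
    <param-value>changeme</param-value>
  </init-param>
</filter>

<filter-mapping>
  <filter-name>authFilter</filter-name>
  <url-pattern>/*</url-pattern>
</filter-mapping>

Failing a dist-war hook in the build, the packaged archive can be updated
in place with: jar uf <path-to-solr.war> WEB-INF/web.xml, run from a
directory containing the edited WEB-INF/web.xml.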


How to deal with "Mark invalid"?

2009-05-12 Thread Nikolai Derzhak
Good day, people.

We use Solr to search in mailboxes (Dovecot), but with some "bad" messages
Solr 1.4-dev generates this error:

SEVERE: java.io.IOException: Mark invalid
    at java.io.BufferedReader.reset(BufferedReader.java:485)
    at org.apache.solr.analysis.HTMLStripReader.restoreState(HTMLStripReader.java:171)
    ...

This is the issue known as SOLR-42.

How can I log a stored field of the document being indexed (I need the
message uid)?

How can I ignore such an error and/or skip the message?

Thanks


Re: Restarting tomcat deletes all Solr indexes

2009-05-12 Thread Andrey Klochkov
Hi,

I know that when starting, Solr checks for the index directory's existence
and creates a fresh new index if it doesn't exist. Does that help? If not,
the next step I'd take in your case is patching the SolrCore.initIndex
method: insert some logging, or run EmbeddedSolrServer with a debugger, etc.

On Mon, May 11, 2009 at 1:25 PM, KK  wrote:

> Hi,
> I'm facing a silly problem. Every time I restart tomcat all the indexes are
> lost. I used all the default configurations. I'm pretty sure there must be
> some basic changes to fix this. I'd highly appreciate if someone could
> direct me fixing this.
>
> Thanks,
> KK.
>



-- 
Andrew Klochkov