New IndexSearcher and autowarming

2011-08-26 Thread Mike Austin
I would like to keep requests from being slowed by new document adds and
commits, by having a separate index that gets updated. Basically a read-only
index and an updatable index. After the update index has finished taking new
adds and commits, I'd like to switch it to become the "live" read-only index.
At the same time, it would be nice to have the old read-only index brought up
to date with the now-live index before I start this update process again.

1. Index1 is live and read-only and doesn't get slowed by updates
2. Index2 is updated with Index1 and gets new adds and commits
3. Index2 gets cache warming
4. Index2 becomes the live read-only index
5. Index1 gets synced with Index2 so that when these steps start again, the
updating is happening on an updated index.

I know that this is possible but can't find a simple tutorial on how to do
this.  By the way, I'm using SolrNet in a Windows environment.

Thanks,
Mike
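
For illustration, the swap in steps 4-5 maps naturally onto Solr's CoreAdmin
API (a sketch, assuming two cores named "live" and "ondeck" on the default
port; the names are placeholders, not from this thread):

  index into "ondeck", then exchange it with "live" in one call:
  http://localhost:8983/solr/admin/cores?action=SWAP&core=ondeck&other=live

After the swap, "ondeck" holds the previously live index, which can be
brought up to date before the next cycle (step 5 above).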


Re: New IndexSearcher and autowarming

2011-08-26 Thread Mike Austin
Hi Erick,

It might work.  I've only worked with solr having one index on one server,
over a year ago, so I might need to research the replication more.  I am
using Windows, and I remember that replication on Windows had some issues
with scripts and hard links; however, it looks like we have some good new
replication features in solr 1.4.

For now, I want to do this on just one Windows server, since this is my
requirement.  After your suggestion, I took a little more time to review
http://wiki.apache.org/solr/SolrReplication.  So based on what I want to do,
would the "Replication with MultiCore" section be what I need?  But this
wouldn't be a master/slave setup, would it, since basically I want to swap
between two indexes.  I guess I could set up 3 indexes on the same server, if
it's possible to use master/slave that way, but that might take more space
than I anticipated.

Thanks,
Mike
On Fri, Aug 26, 2011 at 12:08 PM, Erick Erickson wrote:

> Why doesn't standard replication with auto-warming work for you?
> You can control how often replication gets triggered by controlling
> your commit points and/or your replication interval. This seems easier
> than maintaining cores like your problem statement indicates.
>
> Best
> Erick
>
> On Fri, Aug 26, 2011 at 12:56 PM, simon  wrote:
> > The multicore API (see http://wiki.apache.org/solr/CoreAdmin ) allows
> you to
> > swap, unload, reload cores. That should allow you to do what you want,
> >
> > -Simon
> >
> > On Fri, Aug 26, 2011 at 11:13 AM, Mike Austin wrote:
> >
> >> I would like to have the ability to keep requests from being slowed from
> >> new
> >> document adds and commits by having a separate index that gets updated.
> >> Basically a read-only and an updatable index. After the update index has
> >> finished updating with new adds and commits, I'd like to switch the
> update
> >> to the "live" read-only.  At the same time, it would be nice to have the
> >> old
> >> read-only index become "updated" with the now live read-only index
> before I
> >> start this update process again.
> >>
> >> 1. Index1 is live and read-only and doesn't get slowed by updates
> >> 2. Index2 is updated with Index1 and gets new adds and commits
> >> 3. Index2 gets cache warming
> >> 4. Index2 becomes the live read-only index
> >> 5. Index1 gets synced with Index2 so that when these steps start again,
> the
> >> updating is happening on an updated index.
> >>
> >> I know that this is possible but can't find a simple tutorial on how to
> do
> >> this.  By the way, I'm using SolrNet in a windows environment.
> >>
> >> Thanks,
> >> Mike
> >>
> >
>
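
For reference, a minimal two-core setup on a single server, along the lines
simon suggested, is declared in solr.xml (a sketch in the legacy multicore
format; the core names and directories are placeholders):

  <solr persistent="true">
    <cores adminPath="/admin/cores">
      <core name="live"   instanceDir="core0" />
      <core name="ondeck" instanceDir="core1" />
    </cores>
  </solr>

The CoreAdmin SWAP action can then exchange the two cores without a
master/slave setup or a third index.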


Geo spatial search with multi-valued locations (SOLR-2155 / lucene-spatial-playground)

2011-08-29 Thread Mike Austin
I've been trying to follow the progress of this and I'm not sure what the
current status is.  Can someone update me on what is currently in Solr4 and
does it support multi-valued location in a single document?  I saw that
SOLR-2155 was not included and is now lucene-spatial-playground.

Thanks,
Mike


Re: Geo spatial search with multi-valued locations (SOLR-2155 / lucene-spatial-playground)

2011-08-29 Thread Mike Austin
Besides the full integration into solr for this, would you recommend any
third-party solr plugins, such as
http://www.jteam.nl/products/spatialsolrplugin.html, or others?

I can understand that spatial features can get complex and there could be
many use cases, but this seems like a "basic" feature that you would use
with a standard set of spatial features like what is in solr4 now.

Thanks,
Mike

On Mon, Aug 29, 2011 at 12:38 PM, Darren Govoni  wrote:

> It doesn't.
>
>
> On 08/29/2011 01:37 PM, Mike Austin wrote:
>
>> I've been trying to follow the progress of this and I'm not sure what the
>> current status is.  Can someone update me on what is currently in Solr4
>> and
>> does it support multi-valued location in a single document?  I saw that
>> SOLR-2155 was not included and is now lucene-spatial-playground.
>>
>> Thanks,
>> Mike
>>
>>
>


Solr warming when using master/slave replication

2011-08-29 Thread Mike Austin
How does warming work when a collection is being distributed to a slave?  I
understand that a temp directory is created and it is eventually copied to
the live folder, but what happens to the cache that was built with the
old index?  Does the cache get rebuilt, can we warm it before it becomes
live, or can we keep the old cache?

Thanks,
Mike
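
For reference, the warming behavior asked about here is driven by cache and
listener settings in solrconfig.xml (an illustrative sketch; the sizes and
the query are placeholders, not tuned values):

  <filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="128"/>
  <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="64"/>
  <listener event="newSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <lst><str name="q">some popular query</str></lst>
    </arr>
  </listener>

autowarmCount seeds the new searcher's caches from the old searcher before
the new one starts serving requests.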


Re: Solr warming when using master/slave replication

2011-08-29 Thread Mike Austin
"Distribution/Replication gives you a 'new' index on the slave. When Solr is
told to use the new index, the old caches have to be discarded along with
the old Index Searcher. That's when autowarming occurs.  If the current
Index Searcher is serving requests and when a new searcher is opened, the
new one is 'warmed' while the current one is serving external requests. When
the new one is ready, it is registered so it can serve any new requests
while the original one first finishes the requests it is handling. "

So if warming is configured, the new index will warm before going live?  How
does that work with the copying to the new directory? Does it get warmed
while in the temp directory before copied over?  My question is basically,
will traffic be served with a non indexed searcher at any point?

Thanks,
Mike

On Mon, Aug 29, 2011 at 4:45 PM, Rob Casson  wrote:

> it's always been my understanding that the caches are discarded, then
> rebuilt/warmed:
>
>
> http://wiki.apache.org/solr/SolrCaching#Caching_and_Distribution.2BAC8-Replication
>
> hth,
> rob
>
> On Mon, Aug 29, 2011 at 5:30 PM, Mike Austin 
> wrote:
> > How does warming work when a collection is being distributed to a slave? I
> > understand that a temp directory is created and it is eventually copied
> to
> > the live folder, but what happens to the cache that was built in with the
> > old index?  Does the cache get rebuilt, can we warm it before it becomes
> > live, or can we keep the old cache?
> >
> > Thanks,
> > Mike
> >
>


Re: Solr warming when using master/slave replication

2011-08-29 Thread Mike Austin
Correction: Will traffic be served with a non "warmed" index searcher at any
point?

Thanks,
Mike

On Mon, Aug 29, 2011 at 4:52 PM, Mike Austin  wrote:

> "Distribution/Replication gives you a 'new' index on the slave. When Solr
> is told to use the new index, the old caches have to be discarded along with
> the old Index Searcher. That's when autowarming occurs.  If the current
> Index Searcher is serving requests and when a new searcher is opened, the
> new one is 'warmed' while the current one is serving external requests. When
> the new one is ready, it is registered so it can serve any new requests
> while the original one first finishes the requests it is handling. "
>
> So if warming is configured, the new index will warm before going live?
> How does that work with the copying to the new directory? Does it get warmed
> while in the temp directory before copied over?  My question is basically,
> will traffic be served with a non indexed searcher at any point?
>
> Thanks,
> Mike
>
>
> On Mon, Aug 29, 2011 at 4:45 PM, Rob Casson  wrote:
>
>> it's always been my understanding that the caches are discarded, then
>> rebuilt/warmed:
>>
>>
>> http://wiki.apache.org/solr/SolrCaching#Caching_and_Distribution.2BAC8-Replication
>>
>> hth,
>> rob
>>
>> On Mon, Aug 29, 2011 at 5:30 PM, Mike Austin 
>> wrote:
>> > How does warming work when a collection is being distributed to a slave? I
>> > understand that a temp directory is created and it is eventually copied
>> to
>> > the live folder, but what happens to the cache that was built in with
>> the
>> > old index?  Does the cache get rebuilt, can we warm it before it becomes
>> > live, or can we keep the old cache?
>> >
>> > Thanks,
>> > Mike
>> >
>>
>
>


Solr commit process and read downtime

2011-08-31 Thread Mike Austin
I've set up a master/slave configuration and it's working great!  I know
this is the better setup, but if I had just one index due to requirements,
I'd like to know more about the performance hit of the commit. Let's just
assume I have a decent-sized index of a few gigs of normal-sized documents
with high traffic.  A few questions:

- (main question) When you do a commit on a single index, is there any time
when reads will not have an index to search on?
- With the rebuilding of caches and whatever else happens, is the only
downside that server performance will be degraded due to file copying,
cache warming, etc., or will the index actually be locked at some point?
- On a commit, do the files get copied so that you need double the space, or
is that just the optimize?

I know a master/slave setup is used to reduce these issues, but if I had
only one server I need to know the potential risks.

Thanks,
Mike


Re: Solr commit process and read downtime

2011-09-01 Thread Mike Austin
Wow.. thanks for the great answers Erick!  This answered my concerns
perfectly.

Mike

On Thu, Sep 1, 2011 at 7:54 AM, Erick Erickson wrote:

> See below:
>
> On Wed, Aug 31, 2011 at 2:16 PM, Mike Austin 
> wrote:
> > I've set up a master slave configuration and it's working great!  I know
> > this is the better setup but if I had just one index due to requirements,
> > I'd like to know more about the performance hit of the commit. let's just
> > assume I have a decent size index of a few gig normal sized documents
> with
> > high traffic.  A few questions:
> >
> > - (main question) When you do a commit on a single index, is there
> anytime
> > when the reads will not have an index to search on?
> No. While the new searcher is warming up, all incoming searches are
> handled by the old searcher. When the new searcher is warmed up,
> new requests are routed to it, and when the last search is completed
> in the old searcher, it's shut down
>
> > - With the rebuilding of caches and whatever else happens, is the only
> > downside the fact that the server performance will be degraded due to
> file
> > copy, cache warming, etc.. or will the index be actually locked at some
> > point?
> The index will not be locked, if by locked you mean the searches will
> not happen. See above. The server will certainly have more work to
> do, and if you're running close to the limits you might notice some
> slowdown. But often there is no noticeable pause. Note that while
> all this goes on, you will have *two* copies of the caches etc. in
> memory...
>
> > - On a commit, do the files get copied so you need double the space or is
> > that just the optimize?
> You have to allow for the relatively rare instance when the merge
> process combines all your segments into one, which will require
> at least double the disk space. Optimize guarantees this
> will happen, but it can (and will) happen on commit occasionally.
>
> >
> > I know a master/slave setup is used to reduce these issues, but if I had
> > only one server I need to know the potential risks.
> Well, you're just putting lots of stuff on a server. Solr will quite
> happily deal
> with this situation and, depending upon how much traffic you have and
> your machine's size, this may be fine. Do be aware of the "warmup hell"
> problem and don't commit too frequently or your warming searchers
> may tie their knickers in a knot.
>
> And one risk in this setup is that you have no way to quickly bring up
> a server if your one machine crashes, you have to re-index *all* your data.
>
> Best
> Erick
>
> >
> > Thanks,
> > Mike
> >
>
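
The "warmup hell" Erick warns about is bounded by two settings in the
<query> section of solrconfig.xml (shown here with their usual example
values; a sketch, not a recommendation for any particular load):

  <useColdSearcher>false</useColdSearcher>
  <maxWarmingSearchers>2</maxWarmingSearchers>

With useColdSearcher set to false, a searcher is never exposed before its
warming finishes; commits that would exceed maxWarmingSearchers concurrent
warming searchers fail instead of piling up.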


Running solr on small amounts of RAM

2011-09-09 Thread Mike Austin
I'm trying to push to get solr used in our environment. I know I could get
responses asking WHY I can't get more RAM etc., but let's just skip those
and work with this situation.

Our index is very small, with 100k documents and a light load at the moment.
If I wanted to use the smallest possible amount of RAM on the server, how
would I do this, and what are the issues?

I know that caching would be the biggest loss, but if solr ran with little
to no caching, would the performance still be ok? I know this is a relative
question..
This is the only application using java on this machine; would tuning java
to use less memory help anything?
Should I set the cache settings low in the config?
Basically, what will having a very low cache hit rate do to search speed and
server performance?  I know more is better and it depends on what I'm
comparing it to, but could you just answer in some way, saying that it's
not going to cripple the machine or cause 5-second searches?

It's on a windows server.


Thanks,
Mike
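
As a concrete starting point, a low-memory setup combines a small heap with
small or disabled caches (an illustrative sketch; the numbers are
placeholders to test against your traffic, not recommendations). For the
bundled Jetty example the heap is set on the command line (a Tomcat install
would pass the same -Xms/-Xmx through JAVA_OPTS):

  java -Xms64m -Xmx128m -jar start.jar

and, in solrconfig.xml:

  <filterCache class="solr.LRUCache" size="64" initialSize="0" autowarmCount="0"/>
  <queryResultCache class="solr.LRUCache" size="64" initialSize="0" autowarmCount="0"/>
  <documentCache class="solr.LRUCache" size="64" initialSize="0" autowarmCount="0"/>

Caches can also be disabled outright by removing their entries, per the wiki
link in the follow-up below.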


Re: Running solr on small amounts of RAM

2011-09-09 Thread Mike Austin
or actually disabling caching as mentioned here:
http://wiki.apache.org/solr/SolrCaching#Cache_Sizing

On Fri, Sep 9, 2011 at 11:48 AM, Mike Austin  wrote:

> I'm trying to push to get solr used in our environment. I know I could have
> responses saying WHY can't you get more RAM etc.., but lets just skip those
> and work with this situation.
>
> Our index is very small with 100k documents and a light load at the
> moment.  If I wanted to use the smallest possible RAM on the server, how
> would I do this and what are the issues?
>
> I know that caching would be the biggest lose but if solr ran with no to
> little caching, the performance would still be ok? I know this is a relative
> question..
> This is the only application using java on this machine, would tuning java
> to use less cache help anything?
> I should set the cache settings low in the config?
> Basically, what will having a very low cache hit rate do to search speed
> and server performance?  I know more is better and it depends on what I'm
> comparing it to but if you could just answer in some way saying that it's
> not going to cripple the machine or cause 5 second searches?
>
> It's on a windows server.
>
>
> Thanks,
> Mike
>
>
>
>


Re: Running solr on small amounts of RAM

2011-09-14 Thread Mike Austin
Just wanted to follow up and say thanks for all the valuable replies.  I'm
in the process of testing everything.

Thanks,
Mike

On Mon, Sep 12, 2011 at 1:20 PM, Chris Hostetter
wrote:

>
> Beyond the suggestions already made, i would add:
>
> a) being really aggressive about stop words can help keep the index size
> down, which can help reduce the amount of memory needed to scan the term
> lists
>
> b) faceting w/o any caching is likely going to be too slow to be
> acceptable.
>
> c) don't sort on anything except score.
>
> -Hoss
>


Solr sorting question to boost a certain field first

2012-02-29 Thread Mike Austin
I have content that I index for several different domains.  What I'd like
to do is have all search results found for domainA returned first, and
results for domainB, C, D, etc. returned second.  I could do two different
searches, but I was wondering if there is a way to do only one query and
still return results from a certain domain first, followed by results from
the rest of the domains.

I thought about trying a boost, but I question whether a boost would always
make domainA return first.  Could someone please suggest a way to do this?
Thanks!

Example: Query for "apple" on "domainA" plus give me other domains that
have "apple" in them also

Example results:
1. DomainA, score .85
2. DomainA, score .84
3. DomainA, score .75
4. DomainA, score .65
5. DomainA, score .55
6. DomainA, score .35
--- now network results 
7. DomainC, score .94
8. DomainE, score .75
9. DomainB, score .68
10. DomainG, score .55
11. DomainC, score .35

Thanks,
Mike


Re: Solr sorting question to boost a certain field first

2012-02-29 Thread Mike Austin
Boom!

This works: sort=map(query($qq,-1),0, ,1)+desc,score+desc&qq=domain:domainA

Thanks,
Mike
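
To unpack that function query (a sketch; the 999999 bound is a placeholder
for any value larger than the maximum possible score, standing in for the
argument lost from the archived line above):

  q=apple
  &qq=domain:domainA
  &sort=map(query($qq,-1),0,999999,1) desc,score desc

query($qq,-1) evaluates domain:domainA per document, returning its score for
matches and -1 otherwise; map(...,0,999999,1) collapses every match to 1, so
the first sort key separates domainA from the rest and the second orders
each group by relevance.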

On Wed, Feb 29, 2012 at 3:45 PM, Mike Austin  wrote:

> I have content that I index for several different domains.  What I'd like
> to do is have all search results found for domainA returned first and
> results for domainB,C,D..etc.. returned second.  I could do two different
> searches but was wondering if there was a way to only do one query but
> return results from a certain domain first followed by results from the
> rest of the domains second.
>
> I thought about trying to boost but I question if the boost would always
> make domainA return first?  Could someone please suggest a way to do this?
> Thanks!
>
> Example: Query for "apple" on "domainA" plus give me other domains that
> have "apple" in them also
>
> Example results:
> 1. DomainA, score .85
> 2. DomainA, score .84
> 3. DomainA, score .75
> 4. DomainA, score .65
> 5. DomainA, score .55
> 6. DomainA, score .35
> --- now network results 
> 7. DomainC, score .94
> 8. DomainE, score .75
> 9. DomainB, score .68
> 10. DomainG, score .55
> 11. DomainC, score .35
>
> Thanks,
> Mike
>


What is the latest solr version

2012-03-02 Thread Mike Austin
I've heard some people talk about solr4.. but I only see solr 3.5 available.

Thanks


index size with replication

2012-03-13 Thread Mike Austin
I have a master with two slaves.  For some reason, if I do an optimize
after indexing on the master, the index doubles in size from 42 MB to 90
MB.. however, when the slaves replicate, they get the 42 MB index..

Should the master and slaves always be the same size?

Thanks,
Mike


read only slaves and write only master

2012-03-14 Thread Mike Austin
Is there a way to mark a master as write-only and the slaves as read-only?
I guess I could just remove those handlers from the config?

Is there a benefit from doing this as far as performance or anything else?

Thanks,
Mike
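
One documented way to get this effect (see the SolrReplication wiki) is to
keep one solrconfig.xml and switch roles with system properties (a sketch;
the masterUrl is a placeholder):

  <requestHandler name="/replication" class="solr.ReplicationHandler">
    <lst name="master">
      <str name="enable">${enable.master:false}</str>
      <str name="replicateAfter">commit</str>
    </lst>
    <lst name="slave">
      <str name="enable">${enable.slave:false}</str>
      <str name="masterUrl">http://master-host:8080/solr/replication</str>
    </lst>
  </requestHandler>

A node started with -Denable.master=true acts only as a master, and one
started with -Denable.slave=true only polls; keeping the update handlers
unexposed to clients on the slaves completes the read-only side.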


Solr Memory Usage

2012-03-14 Thread Mike Austin
I'm looking at the solr admin interface site.  On the dashboard right
panel, I see three sections with size numbers like 227MB(light),
124MB(darker), and 14MB(darkest).

I'm on a windows server.

Couple questions about what I see in the solr app admin interface:

- In the top right section of the dashboard, does the lightest color of the
three, with a number of 227MB, come from the Xmx256 heap max setting for java
that I set?  Is this the max limit for all my solr apps running on this
instance of tomcat?
- Is the 124MB in the middle, in slightly darker gray, the Xms128 setting I set?
- Is the 47MB of the darkest section the memory of the current solr app
that I'm viewing details on?
- Do the 227MB and 124MB apply to all solr apps? For example, if I go look
under solrapp1, should it have the same 227MB and 124MB numbers but a
different darker-section number for the memory of the current solrapp that
I'm viewing?
- If the numbers for Xms and Xmx are the same for all solr apps on this
tomcat instance, but the bottom darker memory number is specific to the
app admin that I'm viewing, how do I see the total usage of all solr apps?
Is it under /manager/status, in the JVM section with "Free memory: 110.16 MB
Total memory: 124.31 MB Max memory: 227.56 MB"?
- If Xmx is the max memory allocated to the jvm for tomcat, what is Xms
used for? Is it to hold memory so it isn't allocated often, and what
happens if you go over that number?
- Also, in windows, do the Xmx/Xms settings and memory usage for these solr
apps display in task manager only under the memory usage of the tomcat
application?

Sorry for the many questions, but after google searches and research by a
non-java expert, I have yet to find clear answers to these questions.

Thanks,
Mike
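
For what it's worth, the limits in question are just the JVM flags passed to
Tomcat, e.g. for a console start (a sketch; a Windows service would set the
same values through the Tomcat service configuration tool):

  set JAVA_OPTS=-Xms128m -Xmx256m
  catalina.bat start

They apply to the whole JVM, so every Solr webapp deployed in that Tomcat
instance shares the single heap; the numbers on each app's dashboard are
views of the same pool, not separate budgets.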


Re: index size with replication

2012-03-14 Thread Mike Austin
The odd thing is that if I optimize the index, it doubles in size.. If I
then add one more document to the index, it goes back down to half size?

Is there a way to force this without needing to wait until another document
is added? Or do you have more information on what you think is going on?
I'm using a trunk version of solr4 from 9/12/2011, with a master and two
slaves setup.  Everything besides this is working great!

Thanks,
Mike

On Tue, Mar 13, 2012 at 9:32 PM, Li Li  wrote:

>  optimize will generate new segments and delete old ones. if your master
> also provides searching service during indexing, the old files may be
> opened by old SolrIndexSearcher. they will be deleted later. So when
> indexing, the index size may double. But a moment later, old indexes will
> be deleted.
>
> On Wed, Mar 14, 2012 at 7:06 AM, Mike Austin 
> wrote:
>
> > I have a master with two slaves.  For some reason on the master if I do
> an
> > optimize after indexing on the master it double in size from 42meg to 90
> > meg.. however,  when the slaves replicate they get the 42meg index..
> >
> > Should the master and slaves always be the same size?
> >
> > Thanks,
> > Mike
> >
>


Re: index size with replication

2012-03-14 Thread Mike Austin
Another note.. if I reload the solr app, it goes back down in size.

Here are my replication settings on the master:

<requestHandler name="/replication" class="solr.ReplicationHandler" >
  <lst name="master">
    <str name="replicateAfter">startup</str>
    <str name="replicateAfter">commit</str>
    <str name="replicateAfter">optimize</str>
    <str name="numberToKeep">1</str>
    <str name="confFiles">schema.xml,stopwords.txt,elevate.xml</str>
    <str name="commitReserveDuration">00:00:30</str>
  </lst>
</requestHandler>

On Wed, Mar 14, 2012 at 3:54 PM, Mike Austin  wrote:

> The odd thing is that if I optimize the index it doubles in size.. If I
> then, add one more document to the index it goes back down to half size?
>
> Is there a way to force this without needing to wait until another
> document is added? Or do you have more information on what you think is
> going on?  I'm using a trunk version of solr4 from 9/12/2011 with a master
> with two slaves setup.  Everything besides this is working great!
>
> Thanks,
> Mike
>
>
> On Tue, Mar 13, 2012 at 9:32 PM, Li Li  wrote:
>
>>  optimize will generate new segments and delete old ones. if your master
>> also provides searching service during indexing, the old files may be
>> opened by old SolrIndexSearcher. they will be deleted later. So when
>> indexing, the index size may double. But a moment later, old indexes will
>> be deleted.
>>
>> On Wed, Mar 14, 2012 at 7:06 AM, Mike Austin 
>> wrote:
>>
>> > I have a master with two slaves.  For some reason on the master if I do
>> an
>> > optimize after indexing on the master it double in size from 42meg to 90
>> > meg.. however,  when the slaves replicate they get the 42meg index..
>> >
>> > Should the master and slaves always be the same size?
>> >
>> > Thanks,
>> > Mike
>> >
>>
>
>


Re: index size with replication

2012-03-14 Thread Mike Austin
Thanks.  I might just remove the optimize.  I had it planned for once a
week but maybe I'll just do it and restart the app if performance slows.


On Wed, Mar 14, 2012 at 4:37 PM, Dyer, James wrote:

> SOLR-3033 is related to ReplicationHandler's ability to do backups.  It
> allows you to specify how many backups you want to keep.  You don't seem to
> have any backups configured here so it is not an applicable parameter (note
> that SOLR-3033 was committed to trunk recently but the config param was
> made "maxNumberOfBackups" ... see
> http://wiki.apache.org/solr/SolrReplication#Master )
>
> I can only take a wild guess why you have the temporary increase in index
> size.  Could it be that something is locking the old segment files so they
> do not get deleted on optimize?  Then maybe they are subsequently getting
> cleaned up at your next commit and restart ?
>
> Finally, keep in mind that doing optimizes aren't generally recommended
> anymore.  Everyone's situation is different, but if you have good settings
> for "mergeFactor" and "ramBufferSizeMB", then optimize is (probably) not
> going to do anything helpful.
>
> James Dyer
> E-Commerce Systems
> Ingram Content Group
> (615) 213-4311
>
>
> -Original Message-
> From: Ahmet Arslan [mailto:iori...@yahoo.com]
> Sent: Wednesday, March 14, 2012 4:25 PM
> To: solr-user@lucene.apache.org
> Subject: Re: index size with replication
>
>
> > Another note.. if I reload solr app
> > it goes back down in size.
> >
> > here is my replication settings on the master:
> >
> > <requestHandler name="/replication" class="solr.ReplicationHandler" >
> >   <lst name="master">
> >     <str name="replicateAfter">startup</str>
> >     <str name="replicateAfter">commit</str>
> >     <str name="replicateAfter">optimize</str>
> >     <str name="numberToKeep">1</str>
> >     <str name="confFiles">schema.xml,stopwords.txt,elevate.xml</str>
> >     <str name="commitReserveDuration">00:00:30</str>
> >   </lst>
> > </requestHandler>
>
> Could it be https://issues.apache.org/jira/browse/SOLR-3033 ?
>
>
>


Re: index size with replication

2012-03-14 Thread Mike Austin
Shawn,

Thanks for the detailed answer! I will play around with this information in
hand.  Maybe a second optimize or just a dummy commit after the optimize
will help get me past this.  Both not the best options, but maybe it's a do
it because it's running on windows work-around. If it is indeed a file
locking issue, I think I can probably work around this since my indexing is
scheduled at certain times and not "live" so I could try the optimize again
soon after or do a single commit that seems to fix the issue also.  Or just
not optimize..

Thanks,
Mike

On Wed, Mar 14, 2012 at 6:34 PM, Shawn Heisey  wrote:

> On 3/14/2012 2:54 PM, Mike Austin wrote:
>
>> The odd thing is that if I optimize the index it doubles in size.. If I
>> then, add one more document to the index it goes back down to half size?
>>
>> Is there a way to force this without needing to wait until another
>> document
>> is added? Or do you have more information on what you think is going on?
>> I'm using a trunk version of solr4 from 9/12/2011 with a master with two
>> slaves setup.  Everything besides this is working great!
>>
>
> The not-very-helpful-but-true answer: Don't run on Windows.  I checked
> your prior messages to the list to verify that this is your environment.
>  If you can control index updates so they don't happen at the same time as
> your optimizes, you can also get around this problem by doing the optimize
> twice.  You would have to be absolutely sure that no changes are made to
> the index between the two optimizes, so the second one basically doesn't do
> anything except take care of the deletes.
>
> Nuts and bolts of why this happens: Solr keeps the old files open so the
> existing reader can continue to serve queries.  That reader will not be
> closed until the last query completes, which may not happen until well
> after the time the new reader is completely online and ready.  I assume
> that the delete attempt occurs as soon as the new index segments are
> completely online, before the old reader begins to close.  I've not read
> the source code to find out.
>
> On Linux and other UNIX-like environments, you can delete files while they
> are open by a process.  They continue to exist as in-memory links and take
> up space until those processes close them, at which point they are truly
> gone.  On Windows, an attempt to delete an open file will fail, even if
> it's open read-only.
>
> There are probably a number of ways that this problem could be solved for
> Windows platforms.  The simplest that I can think of, assuming it's even
> possible, would be to wait until the old reader is closed before attempting
> the segment deletion.  That may not be possible - the information may not
> be available to the portion of code that does the deletion.  There are a
> few things standing in the way of me fixing this problem myself: 1) I'm a
> beginning Java programmer.  2) I'm not familiar with the Solr code at all.
> 3) My interest level is low because I run on Linux, not Windows.
>
> Thanks,
> Shawn
>
>


Re: Sort by bayesian function for 5 star rating

2012-03-14 Thread Mike Austin
Why don't you just use that formula and calculate the weighted rating for
each movie and index that value? sort=wrating desc

Maybe I didn't understand your question.

mike

On Mon, Mar 12, 2012 at 1:38 PM, Zac Smith  wrote:

> Does anyone have an example formula that can be used to sort by a 5 star
> rating in SOLR?
> I am looking at an example on IMDB's top 250 movie list:
>
> The formula for calculating the Top Rated 250 Titles gives a true Bayesian
> estimate:
>  weighted rating (WR) = (v ÷ (v+m)) × R + (m ÷ (v+m)) × C
> where:
>R = average for the movie (mean) = (Rating)
>v = number of votes for the movie = (votes)
>m = minimum votes required to be listed in the Top 250 (currently 3000)
>C = the mean vote across the whole report (currently 6.9)
>
>
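
Following the suggestion above, the formula is straightforward to precompute
at indexing time and store in a field (a hypothetical C# sketch; names and
constants are illustrative):

  using System;

  class WeightedRatingExample
  {
      // IMDB-style Bayesian estimate:
      //   WR = (v / (v + m)) * R + (m / (v + m)) * C
      // R = mean rating of the item, v = its vote count,
      // m = minimum votes to qualify, C = mean vote across all items.
      static double WeightedRating(double r, long v, long m, double c)
      {
          return (v / (double)(v + m)) * r + (m / (double)(v + m)) * c;
      }

      static void Main()
      {
          // e.g. a movie rated 8.2 from 5000 votes, with m = 3000 and C = 6.9
          Console.WriteLine(WeightedRating(8.2, 5000, 3000, 6.9)); // ~7.71
      }
  }

Indexed as a float field (say "wrating", a placeholder name), the query side
then reduces to sort=wrating desc.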


Re: index size with replication

2012-03-15 Thread Mike Austin
The problem is that when replicating, the double-size index gets replicated
to the slaves.  I am now doing a dummy commit, always with the same document,
and it works fine.. After the optimize and dummy-commit process I just end up
with numDocs = x and maxDocs = x+1.  I don't get the nice green checkmark
in the admin interface, but I can live with that.

mike

On Thu, Mar 15, 2012 at 8:17 AM, Erick Erickson wrote:

> Or just ignore it if you have the disk space. The files will be cleaned up
> eventually. I believe they'll magically disappear if you simply bounce the
> server (but I work on *nix so can't personally guarantee it). And replication
> won't replicate the stale files, so that's not a problem either
>
> Best
> Erick
>
> On Wed, Mar 14, 2012 at 11:54 PM, Mike Austin 
> wrote:
> > Shawn,
> >
> > Thanks for the detailed answer! I will play around with this information
> in
> > hand.  Maybe a second optimize or just a dummy commit after the optimize
> > will help get me past this.  Both not the best options, but maybe it's a
> do
> > it because it's running on windows work-around. If it is indeed a file
> > locking issue, I think I can probably work around this since my indexing
> is
> > scheduled at certain times and not "live" so I could try the optimize
> again
> > soon after or do a single commit that seems to fix the issue also.  Or
> just
> > not optimize..
> >
> > Thanks,
> > Mike
> >
> > On Wed, Mar 14, 2012 at 6:34 PM, Shawn Heisey  wrote:
> >
> >> On 3/14/2012 2:54 PM, Mike Austin wrote:
> >>
> >>> The odd thing is that if I optimize the index it doubles in size.. If I
> >>> then, add one more document to the index it goes back down to half
> size?
> >>>
> >>> Is there a way to force this without needing to wait until another
> >>> document
> >>> is added? Or do you have more information on what you think is going
> on?
> >>> I'm using a trunk version of solr4 from 9/12/2011 with a master with
> two
> >>> slaves setup.  Everything besides this is working great!
> >>>
> >>
> >> The not-very-helpful-but-true answer: Don't run on Windows.  I checked
> >> your prior messages to the list to verify that this is your environment.
> >>  If you can control index updates so they don't happen at the same time
> as
> >> your optimizes, you can also get around this problem by doing the
> optimize
> >> twice.  You would have to be absolutely sure that no changes are made to
> >> the index between the two optimizes, so the second one basically
> doesn't do
> >> anything except take care of the deletes.
> >>
> >> Nuts and bolts of why this happens: Solr keeps the old files open so the
> >> existing reader can continue to serve queries.  That reader will not be
> >> closed until the last query completes, which may not happen until well
> >> after the time the new reader is completely online and ready.  I assume
> >> that the delete attempt occurs as soon as the new index segments are
> >> completely online, before the old reader begins to close.  I've not read
> >> the source code to find out.
> >>
> >> On Linux and other UNIX-like environments, you can delete files while
> they
> >> are open by a process.  They continue to exist as in-memory links and
> take
> >> up space until those processes close them, at which point they are truly
> >> gone.  On Windows, an attempt to delete an open file will fail, even if
> >> it's open read-only.
> >>
> >> There are probably a number of ways that this problem could be solved
> for
> >> Windows platforms.  The simplest that I can think of, assuming it's even
> >> possible, would be to wait until the old reader is closed before
> attempting
> >> the segment deletion.  That may not be possible - the information may
> not
> >> be available to the portion of code that does the deletion.  There are a
> >> few things standing in the way of me fixing this problem myself: 1) I'm
> a
> >> beginning Java programmer.  2) I'm not familiar with the Solr code at
> all.
> >> 3) My interest level is low because I run on Linux, not Windows.
> >>
> >> Thanks,
> >> Shawn
> >>
> >>
>


Maybe switching to Solr Cores

2012-03-16 Thread Mike Austin
I'm trying to understand the difference between multiple Tomcat indexes
using context fragments versus using one application with multiple cores.
Since I'm currently using tomcat context fragments to run 7 different
indexes, could I get help understanding more about why I would want to use
solr cores instead? Or whether I would?

From reading the documentation, here are the main points that I see..

- manage them as a single application
- create new indexes on the fly by spinning up new SolrCores
- even make one SolrCore replace another SolrCore without ever restarting
your Servlet Container.

It seems that the biggest real-world advantage is the ability to control
core creation and replacement with no downtime.  The negative would be less
isolation; however, cores are still somewhat isolated.  What other benefits
and common real-world situations would you use to talk me into switching to
Solr cores?

I'm guessing the replication works the same..

Thanks,
Mike
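
The "manage as a single application" point boils down to the CoreAdmin HTTP
API, e.g. (a sketch; core names and port are placeholders):

  http://localhost:8080/solr/admin/cores?action=CREATE&name=index8&instanceDir=index8
  http://localhost:8080/solr/admin/cores?action=RELOAD&core=index3
  http://localhost:8080/solr/admin/cores?action=SWAP&core=index1&other=index2

None of these require restarting Tomcat, which is the main operational win
over seven separate context fragments.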


RE: C# API for Solr

2007-04-05 Thread Mike Austin
I would be very interested in this. Any idea on when this will be available?

Thanks

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Monday, April 02, 2007 1:44 AM
To: solr-user@lucene.apache.org
Subject: Re: C# API for Solr


Well, i think there will be a lot of people who will be very happy with 
this C# client.

grts,m 




"Jeff Rodenburg" <[EMAIL PROTECTED]> 
31/03/2007 18:00
Please respond to
solr-user@lucene.apache.org


To
solr-user@lucene.apache.org
cc

Subject
C# API for Solr






We built our first search system architecture around Lucene.Net back in 
2005
and continued to make modifications through 2006.  We quickly learned that
search management is so much more than query algorithms and indexing
choices.  We were not readily prepared for the operational overhead that 
our
Lucene-based search required: always-on availability, fast response times,
batch and real-time updates, etc.

Fast forward to 2007.  Our front-end is Microsoft-based, but we needed to
support parallel development on non-Microsoft architecture, and thus 
needed
a cross-platform search system.  Hello Solr!  We've transitioned our 
search
system to Solr with a Linux/Tomcat back-end, and it's been a champ.  We 
now
use solr not only for standard keyword search, but also to drive queries 
for
lots of different content sections on our site.  Solr has moved beyond
mission critical in our operation.

As we've proceeded, we've built out a nice C# client library to abstract the
interaction from C# to Solr.  It's mostly generic and designed for
extensibility.  With a few modifications, this could be a stand-alone library
that works for others.

I have clearance from the organization to contribute our library to the
community if there's interest.  I'd first like to gauge the interest of
everyone before doing so; please reply if you do.

cheers,
jeff r.




Solr index updating pattern

2007-04-25 Thread Mike Austin
Could someone give advice on a better way to do this?

I have an index of many merchants, and each day I delete merchant products
and re-update my database. After doing this I then re-create the entire
index and move it to production, replacing the current index.

I was thinking about updating the index in real time with only the products
that need updating. My concern is that I might be updating 2 million
products, deleting 1 million, and inserting another 1-2 million all in one
process. I guess I could send batches of files to be sucked in and processed,
but it's just not as clean as creating a new index. Do you see an issue with
these massive updates, deletes, and inserts in solr? The problem now is that
I might be updating just 1/2 or 1/4 of the index, and I don't need to
re-create the entire index again.

How do some of you keep your index updated?  I'm running it off of a windows
server, so I haven't even looked into the snappuller etc.. stuff.

Thanks,
Mike
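
For the incremental route, the update handler accepts deletes and adds in
batches, e.g. (a sketch; the field names are placeholders):

  <delete><query>merchant_id:123</query></delete>

  <add>
    <doc>
      <field name="id">123-4567</field>
      <field name="merchant_id">123</field>
      <field name="name">example product</field>
    </doc>
  </add>

posted to the update endpoint, followed by a single <commit/> once a batch
of merchants has been processed.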



RE: Solrsharp feedback

2007-04-26 Thread Mike Austin
Jeff,

I reviewed your code a few days ago (and again today after seeing your
email) and it looks good. I'm not as interested in the Query and Results
namespaces, since I pushed most of my facet and query code into the solr
servlet. However, I would like to use the Update, Indexing, and Configuration
classes, since they are cleaner and more flexible than mine :).

The problem is that I'm not ready to refactor anything right now since I
have to get some things out soon. I would probably be ready to integrate
SolrSharp in a couple/few weeks though and give some feedback.

Thanks for the work. I might actually be able to contribute some code to
this at some point... maybe in conjunction with my solr servlet code and how
I do faceting and category navigation.

Thanks,
Mike



-Original Message-
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]
Sent: Thursday, April 26, 2007 2:13 PM
To: solr-user@lucene.apache.org
Subject: Re: Solrsharp feedback


Hi Jeff,
Ah, smells like the same problem that Lucene.net is having - the lack of
people with interest in C# here at ASF. :(
I'm BCC-ing somebody who might be interested in looking at your C# client
for Solr.

Otis
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Simpy -- http://www.simpy.com/  -  Tag  -  Search  -  Share

- Original Message 
From: Jeff Rodenburg <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Tuesday, April 24, 2007 11:42:11 PM
Subject: Solrsharp feedback

I sent a few messages to the list about Solrsharp, the C# library for
working with Solr, a couple of weeks ago.  This was the first iteration of
the library and something I expected to see modified as others got a chance
to review it.  I've not heard any feedback since then, though.

For those that have checked out the code, is it working for you?  Does it
make sense?

thanks,
jeff r.





PriceJunkie.com using solr!

2007-05-16 Thread Mike Austin

I just wanted to say thanks to everyone for the creation of solr.  I've been
using it for a while now, and I have recently brought one of my side projects
online.  I have several other projects that will be using solr for their
search and facets.

Please check out www.pricejunkie.com and let us know what you think.. You
can give feedback and/or sign up on the mailing list for future updates.
The site is very basic right now, and many new and useful features plus
merchants and product categories will be coming soon!  I thought it would be
a good idea to at least have a few people use it, to get some feedback early
and often.

Some of the nice things behind the scenes that we did with solr:
- created custom request handlers that have category to facet to attribute
caching built in
- category to facet management
- ability to manage facet groups (attributes within a set facet) and assign
them to categories
- ability to create any category structure and share facet groups
- facet inheritance for any category (a facet group can be defined on a
parent category and pushed down to all children)
- ability to create sub-categories as facets instead of normal sub
categories
- simple xml configuration for the final outputted category configuration
file


I'm sure there are more cool things but that is all for now.  Join the
mailing list to see more improvements in the future.

Also.. how do I get added to the Using Solr wiki page?


Thanks,
Mike Austin



RE: PriceJunkie.com using solr!

2007-05-23 Thread Mike Austin
Thanks Tim.  Yes, the results are transformed with xslt in the .net page
after getting the response xml back from the handler.  So, SOLR is called
via http from the very light .net page.  Besides the transform, SOLR does
all of the work, with the help of some of my custom classes that utilize a
generated xml config file that has category-to-facet-to-query item type
mappings.


 -Original Message-
From: Tim Archambault [mailto:[EMAIL PROTECTED]
Sent: Thursday, May 17, 2007 11:25 AM
To: solr-user@lucene.apache.org; [EMAIL PROTECTED]
Subject: Re: PriceJunkie.com using solr!


  I did a search and noticed pages were executed through aspx. Are you using
.net to parse the xml results from SOLR? Nice site, just trying to figure
out where SOLR fits into this.


  On 5/16/07, Mike Austin <[EMAIL PROTECTED]> wrote:

I just wanted to say thanks to everyone for the creation of solr.  I've
been
using it for a while now and I have recently brought one of my side
projects
online.  I have several other projects that will be using solr for it's
search and facets.

Please check out www.pricejunkie.com and let us know what you think..
You
can give feedback and/or sign up on the mailing list for future updates.
The site is very basic right now and many new and useful features plus
merchants and product categories will be coming soon!  I thought it
would be
a good idea to at least have a few people use it to get some feedback
early
and often.

Some of the nice things behind the scenes that we did with solr:
- created custom request handlers that have category to facet to
attribute
caching built in
- category to facet management
- ability to manage facet groups (attributes within a set facet)
and assign
them to categories
- ability to create any category structure and share facet
groups

- facet inheritance for any category (a facet group can be defined on a
parent category and pushed down to all children)
- ability to create sub-categories as facets instead of normal sub
categories
- simple xml configuration for the final outputted category
configuration
file


I'm sure there are more cool things but that is all for now.  Join the
mailing list to see more improvements in the future.

Also.. how do I get added to the Using Solr wiki page?


Thanks,
Mike Austin





RE: PriceJunkie.com using solr!

2007-05-23 Thread Mike Austin
Just one.

-Original Message-
From: James liu [mailto:[EMAIL PROTECTED]
Sent: Wednesday, May 16, 2007 10:30 PM
To: solr-user@lucene.apache.org
Subject: Re: PriceJunkie.com using solr!


how many solr instance?


2007/5/17, Yonik Seeley <[EMAIL PROTECTED]>:
>
> Congrats, very nice job!
> It's fast too.
>
> -Yonik
>
> On 5/16/07, Mike Austin <[EMAIL PROTECTED]> wrote:
> > I just wanted to say thanks to everyone for the creation of solr.  I've
> been
> > using it for a while now and I have recently brought one of my side
> projects
> > online.  I have several other projects that will be using solr for it's
> > search and facets.
> >
> > Please check out www.pricejunkie.com and let us know what you think..
> You
> > can give feedback and/or sign up on the mailing list for future updates.
> > The site is very basic right now and many new and useful features plus
> > merchants and product categories will be coming soon!  I thought it
> would be
> > a good idea to at least have a few people use it to get some feedback
> early
> > and often.
> >
> > Some of the nice things behind the scenes that we did with solr:
> > - created custom request handlers that have category to facet to
> attribute
> > caching built in
> > - category to facet management
> > - ability to manage facet groups (attributes within a set facet)
> and assign
> > them to categories
> > - ability to create any category structure and share facet
> groups
> >
> > - facet inheritance for any category (a facet group can be defined on a
> > parent category and pushed down to all children)
> > - ability to create sub-categories as facets instead of normal sub
> > categories
> > - simple xml configuration for the final outputted category
> configuration
> > file
> >
> >
> > I'm sure there are more cool things but that is all for now.  Join the
> > mailing list to see more improvements in the future.
> >
> > Also.. how do I get added to the Using Solr wiki page?
> >
> >
> > Thanks,
> > Mike Austin
>



--
regards
jl



RE: Faceted Search!

2007-06-20 Thread Mike Austin
Niraj: What environment are you using? SQL Server/.NET/Windows? or something
else?

-Mike

-Original Message-
From: Chris Hostetter [mailto:[EMAIL PROTECTED]
Sent: Wednesday, June 20, 2007 4:24 PM
To: solr-user@lucene.apache.org
Subject: Re: Faceted Search!



: define the sub-categories.  let's say from the above example, the
: category "price" has different sub-categories like "less than 100"
: ,"100-200"?  I'm guessing, we explicit define this in XML feed file, but
: I could be very wrong.  In any case, can you please give me the short
: example achieve that implementation.  Well, thanks once again.

there's nothing "out of the box" from Solr that will do this; it's
something you would need to implement either in the client or in a custom
request handler ... Solr's "Simple Faceting" support is designed to be just
that: simple.  but the underlying methods/mechanisms of computing DocSet
intersections can be used by any custom request handler to generate
application-specific results.

I've got 3 or 4 indexes that use the out of the box SimpleFacet support
Solr provides, but the major faceting we do (product based facets) all
uses custom request handlers so we can have very exact control on all of
this kind of stuff driven by our data management tools.



-Hoss



solr setup

2006-03-20 Thread Mike Austin
I'm trying to set solr up with CentOS 4.2, Apache 2.0.55, Tomcat 5, and Java
SDK 1.5 for the first time.

I copied the solr.war to the tomcat webapps folder and it created the solr
folders. I then try running the app with
http://localhost:8080/solr/admin and I get an error (I don't have the
error message now; I can get it later tonight if needed). Is there some
other step besides just copying the war file?

BTW, the tomcat example apps work fine.

Thanks,
Mike


Re: solr setup

2006-03-20 Thread Mike Austin
Thanks Yonik,

I fixed the conf issue.. now I get this. Any ideas?

2006-03-20 20:42:09 StandardWrapperValve[jsp]: Servlet.service() for
servlet jsp threw exception
java.lang.NoClassDefFoundError
at org.apache.jsp.admin.index_jsp._jspService(index_jsp.java:67)
at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:94)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
at 
org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:324)
at 
org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:292)
at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:236)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:237)
at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:157)
at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:214)
at 
org.apache.catalina.core.StandardValveContext.invokeNext(StandardValveContext.java:104)
at 
org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:520)
at 
org.apache.catalina.core.StandardContextValve.invokeInternal(StandardContextValve.java:198)
at 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:152)
at 
org.apache.catalina.core.StandardValveContext.invokeNext(StandardValveContext.java:104)
at 
org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:520)
at 
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:137)
at 
org.apache.catalina.core.StandardValveContext.invokeNext(StandardValveContext.java:104)
at 
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:118)
at 
org.apache.catalina.core.StandardValveContext.invokeNext(StandardValveContext.java:102)
at 
org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:520)
at 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at 
org.apache.catalina.core.StandardValveContext.invokeNext(StandardValveContext.java:104)
at 
org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:520)
at org.apache.catalina.core.ContainerBase.invoke(ContainerBase.java:929)
at 
org.apache.coyote.tomcat5.CoyoteAdapter.service(CoyoteAdapter.java:160)
at 
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:799)
at 
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.processConnection(Http11Protocol.java:705)
at 
org.apache.tomcat.util.net.TcpWorkerThread.runIt(PoolTcpEndpoint.java:577)
at 
org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:683)
at java.lang.Thread.run(Thread.java:595)

On 3/20/06, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> On 3/20/06, Mike Austin <[EMAIL PROTECTED]> wrote:
> > I'm trying to set solr up with CentOS 4.2, Apache 2.0.55, Tomcat 5, and
> Java
> > SDK 1.5 for the first time.
> >
> > I copied the solr.war to the tomcat webapps folder and it created the solr
> > folders. I then try running the app with
> > http://localhost:8080/solr/admin and I get an error (I don't have the
> > error message now, I can get it later
> > tonight if needed). Is there some other step besides just copying the war
> > file?
>
> Hi Mike,
> Solr needs to find its config files.  Check out the "example"
> directory of the solr distribution you downloaded.  Solr currently
> checks the ./solrconf/ directory for its config, but that may soon
> change to ./solr/conf due to discussions on solr-dev.
>
> -Yonik
>


Re: solr setup

2006-03-20 Thread Mike Austin
Actually.. it looks like it is still not finding solrconfig.xml, because after I
restart tomcat I get the config file error. Where should this go again? I know
you said ./solrconf, but relative to what?

Also, I still don't know if I deployed the site correctly.. the only thing I did
was copy the solr.war to the tomcat/webapps folder.

Thanks
 


Re: solr setup

2006-03-20 Thread Mike Austin
   at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:585)
at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:287)
at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:425)
- Root Cause -
java.lang.ExceptionInInitializerError
at org.apache.solr.update.SolrIndexConfig.<clinit>(Unknown Source)
at org.apache.solr.core.SolrCore.<init>(Unknown Source)
at org.apache.solr.servlet.SolrServlet.init(Unknown Source)
at javax.servlet.GenericServlet.init(GenericServlet.java:211)
at 
org.apache.catalina.core.StandardWrapper.loadServlet(StandardWrapper.java:1029)
at 
org.apache.catalina.core.StandardWrapper.load(StandardWrapper.java:862)
at 
org.apache.catalina.core.StandardContext.loadOnStartup(StandardContext.java:4013)
at 
org.apache.catalina.core.StandardContext.start(StandardContext.java:4357)
at 
org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:823)
at 
org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:807)
at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:595)
at 
org.apache.catalina.core.StandardHostDeployer.install(StandardHostDeployer.java:277)
at org.apache.catalina.core.StandardHost.install(StandardHost.java:832)
at 
org.apache.catalina.startup.HostConfig.deployWARs(HostConfig.java:625)
at 
org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:431)
at org.apache.catalina.startup.HostConfig.start(HostConfig.java:983)
at 
org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:349)
at 
org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:119)
at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1091)
at org.apache.catalina.core.StandardHost.start(StandardHost.java:789)
at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1083)
at 
org.apache.catalina.core.StandardEngine.start(StandardEngine.java:478)
at 
org.apache.catalina.core.StandardService.start(StandardService.java:480)
at 
org.apache.catalina.core.StandardServer.start(StandardServer.java:2313)
at org.apache.catalina.startup.Catalina.start(Catalina.java:556)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:585)
at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:287)
at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:425)
Caused by: java.lang.RuntimeException: Can't find resource solrconfig.xml
at org.apache.solr.core.Config.openResource(Unknown Source)
at org.apache.solr.core.SolrConfig.<clinit>(Unknown Source)
... 31 more

2006-03-20 22:37:06
StandardContext[/servlets-examples]ContextListener:
contextInitialized()
2006-03-20 22:37:06
StandardContext[/servlets-examples]SessionListener:
contextInitialized()
2006-03-20 22:37:08 StandardContext[/jsp-examples]ContextListener:
contextInitialized()
2006-03-20 22:37:08 StandardContext[/jsp-examples]SessionListener:
contextInitialized()

On 3/20/06, Mike Austin <[EMAIL PROTECTED]> wrote:
> Actually.. it looks like it is still not finding solrconfig.xml because
> after I restart tomcat I get the config file error. Where should this go
> again? I know you said ./solrconf, but relative to what?
>
> Also, I still don't know if I deployed the site correctly..the only thing I
> did was copy the solr.war to the tomcat/webapps folder.
>
> Thanks
>
>


Re: solr setup

2006-03-20 Thread Mike Austin
Ahhh!!  OK.. next time you see me, you can back-slap me. I was doing a cd
into bin and starting tomcat. Now it is working. Sorry to waste your time;
it was my mistake all along. I did install tomcat 5.5, but the issue was the
startup.

Thanks,
Mike


Re: solr setup

2006-03-28 Thread Mike Austin
Try starting Tomcat from your /var/lib/tomcat5/ folder. While in that
folder, run "/etc/init.d/tomcat5 start".  I think I had a similar issue and
it was because I started tomcat in the wrong folder.

mike


Run solr on windows with IIS

2006-04-07 Thread Mike Austin
I know that I asked something like this before, but...

I read that you need cygwin for shell support, but is that just for the cmd
line post.sh support? I would like to run ASP.NET apps that use solr as the
search platform (on the same server for now). So, can I run IIS and
solr/servlets together? Any drawbacks or limitations that I might run into?
What should I use as the servlet engine? Apache?

Thanks!
mike


Re: Run solr on windows with IIS

2006-04-07 Thread Mike Austin
When is the replication part done, and what is it used for? I need to get
more familiar with that.

What do you mean by hard links and rsync?

From what I read, it is ok to run tomcat and IIS together; you just need to
have a connector for certain things. However, if I call the servlets from my
asp page specifying the port (8080), it seems that might be ok. I will try
it tonight.

On 4/7/06, Bill Au <[EMAIL PROTECTED]> wrote:
>
> The Solr server itself requires Java 1.5 and an application server which
> support Servlet 2.4.
> I know nothing about ASP.NET so I don't know if that supports Servlet 2.4or
> not.
>
> Shell support is needed for replication, which is done by a bunch of shell
> scripts.
> The current implemenation for replication also requires an OS with the
> ability to create hard links and rsync.
>
> Bill
>
> On 4/7/06, Erik Hatcher <[EMAIL PROTECTED]> wrote:
> >
> > Mike,
> >
> > I recommend you use either Tomcat or Jetty.  Personally, I am just
> > developing with the example application (thanks again to the Solr
> > creators for a nicely done example app) which has Jetty embedded.  I
> > took the example app directory, copied it to my project, tweaked the
> > configuration, added it to version control, removed the conf/solr/data
> > directory, and run:
> >
> > java -jar start.jar
> >
> > Voila.  You may need to adjust the port that it runs on depending on
> > your system, but probably not.
> >
> > From ASP.NET, simply use RESTful HTTP GET/POST commands to the Solr
> > application server.  What a dream!
> >
> > Erik
> >
> >
> > On Apr 7, 2006, at 3:39 PM, Mike Austin wrote:
> >
> > > I know that I asked something like this before, but...
> > >
> > > I read that you need cygwin for shell support, but is that just for
> > > the cmd
> > > line post.sh support? I would like to run ASP.NET apps that use
> > > solr as the
> > > search platform (on the same server for now). So, can I run IIS and
> > > solr/servlets together? Any drawbacks or limitations that I might
> > > run into?
> > > What should I use as the servlet engine? Apache?
> > >
> > > Thanks!
> > > mike
> >
> >
>
>


Adding xml to SolrQueryResponse

2006-05-01 Thread Mike Austin

Is there a way to add attributes besides name to an XML node returned from
SolrQueryResponse? I've looked at SolrQueryResponse.add, and it looks like a
NamedList is my only option. I know that I can get by with nodes that have
only the name attribute, but it would make life a little easier to throw
some more attributes on a node.

Thanks,
Mike Austin
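
For reference, here is a minimal sketch of what the NamedList-only approach
looks like (class and package locations are assumed from the Solr source
tree of that era; "rsp" is the handler's SolrQueryResponse). Extra
"attributes" can only be modeled as named child nodes, each rendered with
its own name attribute:

  import org.apache.solr.request.SolrQueryResponse;
  import org.apache.solr.util.NamedList;

  void addCategoryNode(SolrQueryResponse rsp) {
      NamedList category = new NamedList();
      category.add("id", "42");             // <str name="id">42</str>
      category.add("label", "Electronics"); // <str name="label">Electronics</str>
      rsp.add("category", category);        // <lst name="category">...</lst>
  }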


Use of caches.

2006-05-14 Thread Mike Austin

I was just reading threads about the use of caches.

OK.. I have about 100 categories that I want to bitwise AND together with
different searches to get category counts. After the user selects a
category, they will be using other facets based on that particular category.
I was planning to keep my own structure of bitsets for the categories and
let solr handle all other caching with the default filterCache. My question
is: should I keep my own structure for the 100 bitsets, since I always want
these to be around, or should I do it another way with a solr-defined user
cache (see the sketch below)?

Thanks
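
For comparison, a rough sketch of the user-cache route (the cache name
"categoryCache" is an assumption and would be declared in solrconfig.xml;
cacheLookup/cacheInsert are the generic user-cache hooks on
SolrIndexSearcher):

  import java.io.IOException;
  import org.apache.lucene.index.Term;
  import org.apache.lucene.search.TermQuery;
  import org.apache.solr.search.DocSet;
  import org.apache.solr.search.SolrIndexSearcher;

  // Fetch one category's bitset, computing and caching it on a miss.
  DocSet categoryDocs(SolrIndexSearcher searcher, String categoryId)
      throws IOException {
    DocSet docs = (DocSet) searcher.cacheLookup("categoryCache", categoryId);
    if (docs == null) {
      docs = searcher.getDocSet(new TermQuery(new Term("category", categoryId)));
      searcher.cacheInsert("categoryCache", categoryId, docs);
    }
    return docs;
  }

The main difference from a hand-rolled structure is that a user cache is
rebuilt (and can be autowarmed) each time a commit opens a new searcher, so
the bitsets never go stale.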


Price facet counts based on price[] query or text field

2006-09-27 Thread Mike Austin

I'm trying to figure out the best way to do filters based on price. I've
been thinking that I would do it with a range query such as price:[10 TO 20],
but I wanted opinions on whether it would be better to pre-filter each item
into a price range and search for the string that represents the price range
in a multivalued text field in the schema (beside other filtering values).
I was thinking that pre-filtering or grouping items into their respective
price ranges would be more work operationally but might pay off with better
performance. It would limit my searches to only the predefined price ranges,
but I don't know if that is a big problem.

Maybe some info on how other people do this and why would be helpful?

Thanks,
Mike
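
For what it's worth, a sketch of the range-query approach against the Lucene
API of that era (the "price" field, the zero-padded sortable encoding, and
the baseDocs variable are all assumptions):

  import java.io.IOException;
  import org.apache.lucene.index.Term;
  import org.apache.lucene.search.Query;
  import org.apache.lucene.search.RangeQuery;
  import org.apache.solr.search.DocSet;
  import org.apache.solr.search.SolrIndexSearcher;

  // Count docs whose price falls in [10, 20], restricted to the current
  // result set. A string range only orders correctly if prices are indexed
  // in a sortable (e.g. zero-padded) form.
  int bucketCount(SolrIndexSearcher searcher, DocSet baseDocs)
      throws IOException {
    Query tenToTwenty = new RangeQuery(
        new Term("price", "0000000010"),
        new Term("price", "0000000020"),
        true); // endpoints inclusive
    return searcher.numDocs(tenToTwenty, baseDocs);
  }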


Re: Sorting

2006-10-11 Thread Mike Austin

Let me back up for a second. I want to create price ranges. I was thinking
that I would do a search with a sort on price and create ranges by sampling
the document price every (docCount / #ofpricerangesIwant) documents,
basically creating: < 10, 10-60, 60-100, etc. If the initial search wasn't
sorted by price, then I would have to do a second search just to figure out
the price ranges.

This was the only way I could think to do it. Maybe I'm going at this the
wrong way?

Thanks

On 10/11/06, Chris Hostetter <[EMAIL PROTECTED]> wrote:



: I need to sort a query two ways. Should I do the search one way:
: s.getDocListAndSet(query, restrictions, sort, req.getStart(),
: req.getLimit(), flags);
: then do the same search again with a different sort value or is there a
: method available to just sort the DocSet (like sortDocSet but it's
: protected)
:
: OR maybe it doesn't  matter because caching will handle it anyway?

check this out from the example solrconfig.xml...

  <useFilterForSortedQuery>true</useFilterForSortedQuery>
...in those conditions, you should be able to just call getDocList (or
getDocListAndSet) with your various Sort options and the cache will take
care of everything.

if you *do* want scores to be included in one of the Sorts, then i would
try doing that search first using getDocListAndSet -- you can ignore the
DocSet, but the next call to getDocList should leverage the filterCache,
and the initial getDocListAndSet call should be faster than two separate
getDocList calls with different sorts...

   ...i think.




-Hoss
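
In code, the pattern Hoss describes comes out roughly like this (variable
names follow the question quoted above; DocList and DocListAndSet live in
org.apache.solr.search). The first call materializes the DocSet once, and
with useFilterForSortedQuery enabled the second, non-score sort should be
satisfied from the filterCache instead of re-running the query:

  // First search: keep the score-based ordering and compute the DocSet.
  DocListAndSet both = s.getDocListAndSet(query, restrictions, sortByScore,
                                          req.getStart(), req.getLimit(), flags);
  DocList byScore = both.docList;

  // Same query, different sort: this one should hit the filterCache.
  DocList byPrice = s.getDocList(query, restrictions, sortByPrice,
                                 req.getStart(), req.getLimit(), flags);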




Re: Sorting

2006-10-13 Thread Mike Austin


that's certainly one way to do it ... it would probably be faster though
to use the TermEnum of the price field directly.



I will look into this.


i've yet to really see a good approach to programmatically
determining (non-trivial) numeric ranges; personally i think that to have
good-looking ranges you pretty much have to have them picked by a person
and stored in metadata.



I have thought about just doing it this way also. I originally did it this
way, but it would be nice to have different buckets depending on the result
set you have in the category,
vs.
using the same buckets for any result set in a category, even when they no
longer make as much sense given the facets selected.

i had some comments on this in a discussion a
little while back...

http://www.nabble.com/forum/ViewPost.jtp?post=3753053&framed=y



I will have to read through this again but it looks like a good discussion.

Thanks.
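
A rough sketch of the TermEnum idea mentioned above, against the Lucene
1.9/2.0-era API (the "price" field and the sampling interval k are
assumptions; weighting each term by docFreq() instead would approximate
per-document rather than per-distinct-value quantiles):

  import java.io.IOException;
  import java.util.ArrayList;
  import java.util.List;
  import org.apache.lucene.index.IndexReader;
  import org.apache.lucene.index.Term;
  import org.apache.lucene.index.TermEnum;

  // Walk the indexed price terms in sorted order, keeping every k-th term
  // as a candidate bucket boundary.
  List priceBoundaries(IndexReader reader, int k) throws IOException {
    List boundaries = new ArrayList();
    TermEnum te = reader.terms(new Term("price", ""));
    try {
      int i = 0;
      do {
        Term t = te.term();
        if (t == null || !"price".equals(t.field())) break;
        if (i++ % k == 0) boundaries.add(t.text());
      } while (te.next());
    } finally {
      te.close();
    }
    return boundaries;
  }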


Re: Sorting

2006-10-13 Thread Mike Austin


that's a very slippery slope .. i suspect a lot of users would be put off
by ranges that kept changing on them as they added/removed other facet
constraints -- one minute you are seeing ranges like 0-10, 11-20, 21-30 and
then you say you are only interested in red products and now your ranges
are 0-12, 13-20, 21-30 ... ? ... that might be a little confusing.



Very good point.




picked by a person and stored in metadata



This is my current design.. I think I will stick with it after your
comments, thinking about it more, and the trouble I'm having getting the
automatic ranges just right. Plus, the automatic approach is inflexible: it
doesn't let me specify the labels in the different ways I can now. I'm
trying to make it as automated as possible, since I have limited human
resources. However, I did make it easy to update and regenerate my facet
config files, so it wouldn't be bad to maintain now, and the ability to use
my own labels for certain buckets is nice.

Thanks


searcher.numDocs OR

2006-10-23 Thread Mike Austin

How can I get a count like I do by using searcher.numDocs(query, docset),
but with it doing an OR operation? I want the number of documents that
match a OR b.

Thanks,
Mike


Re: searcher.numDocs OR

2006-10-23 Thread Mike Austin

I want to know: if I have a main DocSet with many filters already applied,
how many more documents would I get if I added one more OR query to it?

I've been using numDocs to get the count when narrowing my main DocSet down,
since numDocs requires a document to be in the main DocSet AND in the new
query. But I want to know how many docs I would have if I added a new OR
query. Instead of testing how much a query would narrow the facets, I want
to know how many documents a certain OR query would add.

I'm now looking into unionSize, intersectionSize, and andNotSize (see the
sketch at the end of this thread)...
Thanks..

On 10/23/06, Yonik Seeley <[EMAIL PROTECTED]> wrote:


On 10/23/06, Mike Austin <[EMAIL PROTECTED]> wrote:
> How can I get a count like I do by using searcher.numDocs(query, docset)
> but with it doing an OR operation? I want the number of documents that
> match a OR b?

It's not clear to me what you are trying to calculate.
In code, If you just want the size of the union of two sets, use
DocSet.unionSize().
If you want the number of documents in a query that match a or b, then use

numDocs(query, getDocSet("a OR b"))
OR
numDocs(query, getDocSet("a").union(getDocSet("b")))

From the request handlers, the only thing you could do would be to
combine a OR b in a single facet query:

q=foo&facet.query=a OR b

It wouldn't result in optimal caching if a and b have many valid
combinations, but that could possibly be handled by a future
optimization in Solr that could decompose boolean queries into
getDocSet() calls.

-Yonik
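
Putting the pieces together, the "how many docs would an OR clause add"
question reduces to set arithmetic on DocSets (a sketch with assumed
variable names; unionSize and andNotSize are the DocSet methods mentioned
above):

  DocSet main  = searcher.getDocSet(mainQuery); // the already-filtered set
  DocSet extra = searcher.getDocSet(orQuery);   // the candidate OR clause

  int added = extra.andNotSize(main); // docs the OR clause would add
  int total = main.unionSize(extra);  // result size if the OR were applied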



Re: New Feature: ${solr.home}/lib/ dir for "plugins"

2006-11-15 Thread Mike Austin

Very nice. This will help me also.  I will try this out and let you know how
it goes. (Windows XP with a custom request handler and some other custom
classes)