Re: Scaling out/up or a mix

2009-07-01 Thread Marcus Herou
1, 2009 at 1:31 PM, Toke Eskildsen wrote: > On Tue, 2009-06-30 at 22:59 +0200, Marcus Herou wrote: > > The number of concurrent users today is insignificant but once we push > > for the service we will get into trouble... I know that since even one > > simple faceting query (whi

Re: Scaling out/up or a mix

2009-06-30 Thread Marcus Herou
On Tue, Jun 30, 2009 at 11:29 AM, Uwe Schindler wrote: > > On Mon, 2009-06-29 at 09:47 +0200, Marcus Herou wrote: > > > Index size(and growing): 16Gx8 = 128G > > > Doc size (data): 20k > > > Num docs: 90M > > > Num users: Few hundred but most critical i

Re: Scaling out/up or a mix

2009-06-30 Thread Marcus Herou
cool BlogSpace: http://blogsearch.tailsweep.com/showFeed.do?feedId=114799 Sorry not meaning to advertise but I could not help it :) //Marcus On Tue, Jun 30, 2009 at 10:49 AM, Toke Eskildsen wrote: > On Mon, 2009-06-29 at 09:47 +0200, Marcus Herou wrote: > > Index size(and growing): 16

Re: Scaling out/up or a mix

2009-06-29 Thread Marcus Herou
uestion: Based on your findings what is the most challenging part to tune ? Sorting or querying or what else? //Marcus > > > > > > > - Original Message > > From: Marcus Herou > > To: java-user@lucene.apache.org > > Sent: Monday, 29 June, 2009 9:47:

Re: Scaling out/up or a mix

2009-06-29 Thread Marcus Herou
nough tool that there isn't a terribly universal answer to this. We > were a bit surprised to end up cpu-bound instead of disk i/o-bound, for > instance, but we ended up taking an unusual path. YMMV. > > Marcus Herou wrote: > > Hi. I think I need to be more specific. > >

Re: Scaling out/up or a mix

2009-06-28 Thread Marcus Herou
? Please any hints would be appreciated since I am going to invest soon. //Marcus On Sat, Jun 27, 2009 at 12:00 AM, Marcus Herou wrote: > Hi. > > I currently have an index which is 16GB per machine (8 machines = 128GB) > (data is stored externally, not in index) and is growing like c

Scaling out/up or a mix

2009-06-26 Thread Marcus Herou
o get most bang for the buck with a limited (aren't we all limited?) budget. Kindly //Marcus -- Marcus Herou CTO and co-founder Tailsweep AB +46702561312 marcus.he...@tailsweep.com http://www.tailsweep.com/

Re: exponential boosts

2009-04-23 Thread Marcus Herou
Thank you Steve, now it's implementation time... I'll be back :) /M On Fri, Apr 24, 2009 at 3:13 AM, Steven Bethard wrote: > On 4/23/2009 2:42 PM, Marcus Herou wrote: > > So what you basically are saying is that: > > > > 1. You have an index which contains data

Re: exponential boosts

2009-04-23 Thread Marcus Herou
Never mind how to open the ParallelReader stuff (I am an idiot): RTFM: http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/index/ParallelReader.html But the rest is of course interesting :) /M On Thu, Apr 23, 2009 at 11:42 PM, Marcus Herou wrote: > Thanks! (I started my reply

Re: exponential boosts

2009-04-23 Thread Marcus Herou
another update interval than the PR interval. 2. A PR index which is rebuilt (from scratch ?) every X days/weeks/months. Commenting inline. Cheers and thanks for everything. //Marcus On Thu, Apr 23, 2009 at 11:28 PM, Steven Bethard wrote: > On 4/23/2009 2:08 PM, Marcus Herou wrote: >

Re: exponential boosts

2009-04-23 Thread Marcus Herou
But perhaps one could use a FieldCache somehow? /M On Thu, Apr 23, 2009 at 11:07 PM, Marcus Herou wrote: > Yes I have considered it for 30 minutes :) > > How does one apply that in the real world? > > If the only thing I get access to is the actual docId would it not be > r

Re: exponential boosts

2009-04-23 Thread Marcus Herou
owto > implement > > that. Do you ? > > > > Have you considered CustomScoreQuery in o.a.l.search.function ? It should > allow > incorporating external scores. > > Doron > -- Marcus Herou CTO and co-founder Tailsweep AB +46702561312 marcus.he...@tailsweep.com http://www.tailsweep.com/ http://blogg.tailsweep.com/

Re: exponential boosts

2009-04-23 Thread Marcus Herou
_4_1/api/org/apache/lucene/search/package-summary.html > > Also, I seem to have gotten very little response to my questions here, > perhaps because they are asking about the expert interfaces? Is there a > better place to ask such questions? > > Thanks, > > Steve > > -

Re: Change boost of documents / single fields / external scoring ?

2009-04-23 Thread Marcus Herou
Could an ExternalFileField help me ? http://lucene.apache.org/solr/api/org/apache/solr/schema/ExternalFileField.html On Thu, Apr 23, 2009 at 10:01 PM, Marcus Herou wrote: > Hi. > > Confusing subject eh ? Trying to become a little clearer in a few > sentences. > > We have a

Change boost of documents / single fields / external scoring ?

2009-04-23 Thread Marcus Herou
s ? Would there be a possibility to do some kind of join (parallel searches over separate index types)? Or send the result to a separate sorting algorithm? Hmmm, perhaps a subclass of Sort? Grasping at straws here folks... Hope one of the core experts can help us. Cheers //Marcus Herou -- Mar

Re: Group by in Lucene ?

2009-02-01 Thread Marcus Herou
t field, another sort is applied. That's not > ideal IMO, but it's a start. > > - Mark

Re: Group by in Lucene ?

2009-02-01 Thread Marcus Herou
gnitude or more. > > You're loading the document each time through the loop. I think you'd get > much better > performance by making sure that your groupField is indexed, then use > TermDocs (TermEnum?) > to get the value of the field. > > Best > Erick > >

Re: Group by in Lucene ?

2009-01-28 Thread Marcus Herou
Oh btw, faceting is easy; it's the distinct part I think is hard. Example Lucene Facet: http://sujitpal.blogspot.com/2007/04/lucene-search-within-search-with.html On Wed, Jan 28, 2009 at 12:43 PM, Marcus Herou wrote: > Hi. > > This is way too slow I think since what you are

Re: Group by in Lucene ?

2009-01-28 Thread Marcus Herou
ument of this group into the result list. > > > > Regards, > > Nina > > > > -- > View this message in context: > http://www.nabble.com/Group-by-in-Lucene---tp13581760p21702742.html > Sent from the Lucene - Java Users mailing list archive at Nabble.com. > >

Re: Group by in Lucene ?

2009-01-27 Thread Marcus Herou
on top of lucene. > > I was thinking about using a HitCollector to get only one result per group. > > How did you do it? > > Cheers, > Nina > > > > Marcus Herou-2 wrote: > > > > Cool. > > > > I'll do since this is a field which I can spe

Buzz measurement - Aggregate functions

2008-10-10 Thread Marcus Herou
:00:00 |
| 26875 | 2008-08-23 00:00:00 |
| 27356 | 2008-08-24 00:00:00 |
| 33438 | 2008-08-25 00:00:00 |
| 33102 | 2008-08-26 00:00:00 |
| 31720 | 2008-08-27 00:00:00 |
| 26133 | 2008-08-28 00:00:00 |
| 22781 | 2008-08-29 00:00:00 |
| 20198 | 2008-08-30 00:00:00 |
|    20 | 2008-08-31 00:00:00 |
+-

Re: Haloe (Lucene package) released!

2008-09-08 Thread Marcus Herou
to the possibilities, just lack of resources :) Kindly //Marcus On Mon, Sep 8, 2008 at 8:34 PM, Petite Abeille <[EMAIL PROTECTED]> wrote: > > On Sep 8, 2008, at 7:49 PM, Marcus Herou wrote: > > :) Whoof so much high quality info and at the same time a huge amount of >>

Re: Haloe (Lucene package) released!

2008-09-08 Thread Marcus Herou
:) Whoof so much high quality info and at the same time a huge amount of useless data, splogs and spam. /M On Mon, Sep 8, 2008 at 7:38 PM, Petite Abeille <[EMAIL PROTECTED]> wrote: > > On Sep 8, 2008, at 6:43 AM, Marcus Herou wrote: > > the ShardedSolrDocumentIndexer will be

Haloe (Lucene package) released!

2008-09-07 Thread Marcus Herou
Hi guys. Glad to announce that I finally managed to move this package out of the company code and release it to the OS community. The package contains some neat classes which we use, for instance, when indexing and searching through some hundred thousand blogs and the ShardedSolrDocumentIndexer

Re: failed to open an indexer after about 20 queries

2008-08-06 Thread Marcus Herou
ion e)
>> {
>>     e.printStackTrace();
>>     error = true;
>> }
>> .
>> if (reader != null)
>>     reader.close();
>> if (searcher != null)
>>     searcher.close();
>>
>> BR,
>> Shawn
>>
>>
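The quoted fragment closes the reader and searcher after the catch block. A minimal sketch of the same cleanup moved into a `finally` block, so that it also runs if something between the catch and the close calls exits early or throws. This is plain Java, not Lucene API: `Resource`, `runAndClose`, and `closeQuietly` are hypothetical stand-ins for illustration, and nothing here is a diagnosis of the problem in the thread.

```java
import java.io.Closeable;
import java.io.IOException;

class SafeClose {
    /** Hypothetical stand-in for a reader or searcher that holds file handles. */
    static final class Resource implements Closeable {
        boolean closed = false;

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Runs an action and always closes both resources; returns true if the action failed. */
    static boolean runAndClose(Resource reader, Resource searcher, Runnable action) {
        boolean error = false;
        try {
            action.run();
        } catch (RuntimeException e) {
            error = true;
        } finally {
            // Runs on success, on a caught failure, and on an uncaught one.
            closeQuietly(searcher);
            closeQuietly(reader);
        }
        return error;
    }

    /** Null-safe close that does not let a failing close mask the original error. */
    static void closeQuietly(Closeable c) {
        if (c == null) {
            return;
        }
        try {
            c.close();
        } catch (IOException e) {
            // Nothing more can be done for this resource.
        }
    }
}
```

The null checks from the quoted code are kept in `closeQuietly`, so a resource that failed to open does not cause a second error during cleanup.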

Re: failed to open an indexer after about 20 queries

2008-08-05 Thread Marcus Herou
-- Marcus Herou CTO and co-founder Tailsweep AB +46702561312 [EMAIL PROTECTED] http://www.tailsweep.com/ http://blogg.tailsweep.com/

Re: Using lucene as a database... good idea or bad idea?

2008-08-03 Thread Marcus Herou
Ah well some test classes could be appropriate: http://dev.tailsweep.com/svn/abstractcache/trunk/src/test/java/org/tailsweep/abstractcache/test/ On Sun, Aug 3, 2008 at 5:07 PM, Marcus Herou <[EMAIL PROTECTED]> wrote: > And for the heck of it I implemented a berkeleydb "java.util.Map

Re: Using lucene as a database... good idea or bad idea?

2008-08-03 Thread Marcus Herou
>> > I am in the designing phase and I have time to explore and prototype any > >>> other >>> > products. Please do suggest me a good one. >>> > >>> > Regards >>> > Ganesh >>> > >>> > >>> > >>> >>>

Re: Using lucene as a database... good idea or bad idea?

2008-08-03 Thread Marcus Herou
> > > > I am in the designing phase and I have time to explore and prototype any > other > > > products. Please do suggest me a good one. > > > > > > Regards > > > Ganesh > > > > > > > > > > > > > -- > > View this message in context: > > > http://www.nabble.com/Using-lucene-as-a-database...-good-idea-or-bad-idea--tp18703473p18754258.html > > Sent from the Lucene - Java Users mailing list archive at Nabble.com. > > > > -- Marcus Herou CTO and co-founder Tailsweep AB +46702561312 [EMAIL PROTECTED] http://www.tailsweep.com/ http://blogg.tailsweep.com/

Re: Using lucene as a database... good idea or bad idea?

2008-08-03 Thread Marcus Herou
k each). We > > actually would NOT be expecting/needing Lucene's normal extreme fast text > > search times for this, but we would need reasonable times for adding new > > documents to the index, retrieving documents by ID (for iterating over > all > > docum

Re: Group by in Lucene ?

2007-11-06 Thread Marcus Herou
Cool. I'll do since this is a field which I can spend time in. Kindly //Marcus On 11/5/07, Grant Ingersoll <[EMAIL PROTECTED]> wrote: > > > On Nov 5, 2007, at 7:49 AM, Marcus Herou wrote: > > > Thanks. They seem to have got real far in the dev cycle on this. >

Re: Group by in Lucene ?

2007-11-05 Thread Marcus Herou
e/SOLR-236 > > -Grant > > On Nov 5, 2007, at 12:57 AM, Marcus Herou wrote: > > > Hi. > > > > I have a situation where I'm searching amongst some 100K feeds and > > only want > > one result per site in return. I have developed a really simple >
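A minimal sketch of the "one result per site" idea asked about here, in plain Java with no Lucene dependency. It assumes the hits have already been retrieved in ranking order and that each one carries a group key such as a site id. The `Hit` class, its fields, and `firstPerGroup` are hypothetical names for illustration; this is neither Lucene API nor the code discussed in the thread.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class OnePerGroup {
    /** Hypothetical stand-in for a search hit: a document id plus its group key (e.g. a site id). */
    static final class Hit {
        final int docId;
        final String group;

        Hit(int docId, String group) {
            this.docId = docId;
            this.group = group;
        }
    }

    /** Keeps the first hit seen for each group, preserving the incoming (ranked) order. */
    static List<Hit> firstPerGroup(List<Hit> rankedHits, int limit) {
        Set<String> seenGroups = new HashSet<>();
        List<Hit> result = new ArrayList<>();
        for (Hit hit : rankedHits) {
            if (result.size() >= limit) {
                break; // enough distinct groups collected
            }
            if (seenGroups.add(hit.group)) {
                result.add(hit); // first (best-ranked) hit for this group
            }
        }
        return result;
    }
}
```

Because the input is already ranked, the first hit seen for a group is also that group's best hit, so a single pass with a set of seen keys is enough. The cost of obtaining the group key for each hit is a separate concern that this sketch does not address.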

Re: How do we limit the growth of a Lucene Index?

2007-11-05 Thread Marcus Herou
. Does Lucene > > provide > > some sort of API to handle the above scenarios? > > > > Regards, > > Sandeep. > > -- > Grant Ingersoll > http://lucene.grantingersoll.com > > Lucene Boot Camp Training: > ApacheCon Atla

Re: Does someone know how to sort the hits list by a specified document field?

2007-11-05 Thread Marcus Herou
ailing list archive at Nabble.com. > > -- Marcus Herou Solution Architect & Core Java developer Tailsweep AB +46702561312 [EMAIL PROTECTED] http://www.tailsweep.com

Group by in Lucene ?

2007-11-04 Thread Marcus Herou
2. Anyone? Kindly //Marcus -- Marcus Herou Solution Architect & Core Java developer Tailsweep AB +46702561312 [EMAIL PROTECTED] http://www.tailsweep.com