1, 2009 at 1:31 PM, Toke Eskildsen wrote:
> On Tue, 2009-06-30 at 22:59 +0200, Marcus Herou wrote:
> > The number of concurrent users today is insignificant but once we push
> > for the service we will get into trouble... I know that since even one
> > simple faceting query (whi
On Tue, Jun 30, 2009 at 11:29 AM, Uwe Schindler wrote:
> > On Mon, 2009-06-29 at 09:47 +0200, Marcus Herou wrote:
> > > Index size(and growing): 16Gx8 = 128G
> > > Doc size (data): 20k
> > > Num docs: 90M
> > > Num users: Few hundred but most critical i
cool BlogSpace:
http://blogsearch.tailsweep.com/showFeed.do?feedId=114799
Sorry, not meaning to advertise, but I could not help it :)
//Marcus
On Tue, Jun 30, 2009 at 10:49 AM, Toke Eskildsen
wrote:
> On Mon, 2009-06-29 at 09:47 +0200, Marcus Herou wrote:
> > Index size(and growing): 16
uestion:
Based on your findings, what is the most challenging part to tune? Sorting,
querying, or something else?
//Marcus
>
>
>
>
>
>
> - Original Message
> > From: Marcus Herou
> > To: java-user@lucene.apache.org
> > Sent: Monday, 29 June, 2009 9:47:
nough tool that there isn't a terribly universal answer to this. We
> were a bit surprised to end up cpu-bound instead of disk i/o-bound, for
> instance, but we ended up taking an unusual path. YMMV.
>
> Marcus Herou wrote:
> > Hi. I think I need to be more specific.
> >
?
Any hints would be appreciated, since I am going to invest soon.
//Marcus
On Sat, Jun 27, 2009 at 12:00 AM, Marcus Herou
wrote:
> Hi.
>
> I currently have an index which is 16GB per machine (8 machines = 128GB)
> (data is stored externally, not in index) and is growing like c
o get most bang for the buck
with a limited (aren't we all limited?) budget.
Kindly
//Marcus
--
Marcus Herou CTO and co-founder Tailsweep AB
+46702561312
marcus.he...@tailsweep.com
http://www.tailsweep.com/
Thank you Steve, now it's implementation time...
I'll be back :)
/M
On Fri, Apr 24, 2009 at 3:13 AM, Steven Bethard wrote:
> On 4/23/2009 2:42 PM, Marcus Herou wrote:
> > So what you basically are saying is that:
> >
> > 1. You have an index which contains data
Never mind how to open the ParallelReader stuff (I am an idiot). RTFM:
http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/index/ParallelReader.html
But the rest is of course interesting :)
/M
On Thu, Apr 23, 2009 at 11:42 PM, Marcus Herou
wrote:
> Thanks! (I started my reply
another update interval than the PR interval.
2. A PR index which is rebuilt (from scratch ?) every X days/weeks/months.
Commenting inline.
Cheers and thanks for everything.
//Marcus
On Thu, Apr 23, 2009 at 11:28 PM, Steven Bethard wrote:
> On 4/23/2009 2:08 PM, Marcus Herou wrote:
>
But perhaps one could use a FieldCache somehow ?
/M
On Thu, Apr 23, 2009 at 11:07 PM, Marcus Herou
wrote:
> Yes I have considered it for 30 minutes :)
>
> How does one apply that in the real world?
>
> If the only thing I get access to is the actual docId would it not be
> r
owto
> implement
> > that. Do you ?
> >
>
> Have you considered CustomScoreQuery in o.a.l.search.function ? It should
> allow
> incorporating external scores.
>
> Doron
>
--
Marcus Herou CTO and co-founder Tailsweep AB
+46702561312
marcus.he...@tailsweep.com
http://www.tailsweep.com/
http://blogg.tailsweep.com/
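A minimal sketch of the idea behind Doron's suggestion, assuming the external score (for example the PR value discussed in this thread) fits in a float array indexed by docId. This is plain Java, not Lucene's CustomScoreQuery API: the class ExternalScoreRanker, its Hit type, and the choice to multiply the two scores are all hypothetical, picked only for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch: re-rank text hits with an externally maintained per-document score. */
final class ExternalScoreRanker {

    /** A search hit: internal document id plus its text relevance score. */
    static final class Hit {
        final int docId;
        final float textScore;

        Hit(int docId, float textScore) {
            this.docId = docId;
            this.textScore = textScore;
        }
    }

    // externalScores[docId] holds the external value, loaded once into memory
    // (similar in spirit to a FieldCache array; an assumption of this sketch).
    private final float[] externalScores;

    ExternalScoreRanker(float[] externalScores) {
        this.externalScores = externalScores;
    }

    /** Combined score; multiplying is an arbitrary choice made for this sketch. */
    float combined(Hit hit) {
        return hit.textScore * externalScores[hit.docId];
    }

    /** Returns a new list ordered by combined score, highest first. */
    List<Hit> rerank(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator.comparingDouble((Hit h) -> combined(h)).reversed());
        return sorted;
    }
}
```

With external scores {0.5, 2.0, 1.0} and text scores 3.0, 1.0 and 1.8 for docs 0, 1 and 2, the combined scores are 1.5, 2.0 and 1.8, so the order becomes doc 1, doc 2, doc 0. The external array can be refreshed on its own schedule without touching the text index, which is the point of keeping it separate.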
_4_1/api/org/apache/lucene/search/package-summary.html
>
> Also, I seem to have gotten very little response to my questions here,
> perhaps because they are asking about the expert interfaces? Is there a
> better place to ask such questions?
>
> Thanks,
>
> Steve
>
> -
Could an ExternalFileField help me ?
http://lucene.apache.org/solr/api/org/apache/solr/schema/ExternalFileField.html
On Thu, Apr 23, 2009 at 10:01 PM, Marcus Herou
wrote:
> Hi.
>
> Confusing subject eh ? Trying to become a little clearer in a few
> sentences.
>
> We have a
s ? Would there be a
possibility to do some kind of join (parallel searches across separate index types)?
Or send the result to a separate sorting algorithm? Hmmm... Perhaps a
subclass of Sort? Grasping at straws here, folks...
I hope one of the core experts can help us.
Cheers
//Marcus Herou
--
Mar
t field, another sort is applied. That's not
> ideal IMO, but its a start.
>
> - Mark
>
>
> -
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.ap
gnitude or more.
>
> You're loading the document each time through the loop. I think you'd get
> much better
> performance by making sure that your groupField is indexed, then use
> TermDocs (TermEnum?)
> to get the value of the field.
>
> Best
> Erick
>
>
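A rough sketch of Erick's point, assuming the group value of every document has been read once into an array indexed by docId, so the hit loop never loads a stored document. GroupCounter and countByGroup are hypothetical names, and the array stands in for whatever TermDocs/TermEnum or a field cache would give you in Lucene; none of this is Lucene API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: count hits per group from a preloaded docId -> group array. */
final class GroupCounter {

    /**
     * groupOfDoc[docId] is the group value of each document, read once up front.
     * The loop below is a plain array lookup per hit; no document is loaded.
     */
    static Map<String, Integer> countByGroup(int[] hitDocIds, String[] groupOfDoc) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (int docId : hitDocIds) {
            // Add 1 to the running count of this document's group.
            counts.merge(groupOfDoc[docId], 1, Integer::sum);
        }
        return counts;
    }
}
```

The trade-off is memory: one array slot per document in the index, paid once, in exchange for a cheap lookup per hit.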
Oh btw, faceting is easy; it's the distinct part I think is hard.
Example Lucene Facet:
http://sujitpal.blogspot.com/2007/04/lucene-search-within-search-with.html
On Wed, Jan 28, 2009 at 12:43 PM, Marcus Herou
wrote:
> Hi.
>
> This is way too slow I think since what you are
ument of this group into the result list.
> >
> > Regards,
> > Nina
> >
>
> --
> View this message in context:
> http://www.nabble.com/Group-by-in-Lucene---tp13581760p21702742.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>
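For the "one result per group" question in this thread, here is a minimal sketch of a collector that keeps only the best-scoring hit of each group. It is plain Java and only mimics the shape of a HitCollector callback: OnePerGroupCollector, its collect method, and the idea of passing the group value in directly are assumptions made for illustration, not Lucene's API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: keep only the best-scoring hit of each group. */
final class OnePerGroupCollector {

    /** A collected hit with the group it belongs to. */
    static final class Hit {
        final int docId;
        final String group;
        final float score;

        Hit(int docId, String group, float score) {
            this.docId = docId;
            this.group = group;
            this.score = score;
        }
    }

    private final Map<String, Hit> bestByGroup = new LinkedHashMap<>();

    /** Called once per matching document, in any order. */
    void collect(int docId, String group, float score) {
        Hit current = bestByGroup.get(group);
        // Replace the stored hit only when this one scores strictly higher.
        if (current == null || score > current.score) {
            bestByGroup.put(group, new Hit(docId, group, score));
        }
    }

    /** One hit per group, highest score first. */
    List<Hit> results() {
        List<Hit> hits = new ArrayList<>(bestByGroup.values());
        hits.sort((a, b) -> Float.compare(b.score, a.score));
        return hits;
    }
}
```

Memory grows with the number of distinct groups rather than with the number of matching documents, which matters when many hits share a site.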
on top of lucene.
>
> I was thinking about using a HitCollector to get only one result per group.
>
> How did you do it?
>
> Cheers,
> Nina
>
>
>
> Marcus Herou-2 wrote:
> >
> > Cool.
> >
> > I'll do since this is a field which I can spe
:00:00 |
| 26875 | 2008-08-23 00:00:00 |
| 27356 | 2008-08-24 00:00:00 |
| 33438 | 2008-08-25 00:00:00 |
| 33102 | 2008-08-26 00:00:00 |
| 31720 | 2008-08-27 00:00:00 |
| 26133 | 2008-08-28 00:00:00 |
| 22781 | 2008-08-29 00:00:00 |
| 20198 | 2008-08-30 00:00:00 |
|    20 | 2008-08-31 00:00:00 |
+-
to the possibilities, just lack of resources :)
Kindly
//Marcus
On Mon, Sep 8, 2008 at 8:34 PM, Petite Abeille <[EMAIL PROTECTED]> wrote:
>
> On Sep 8, 2008, at 7:49 PM, Marcus Herou wrote:
>
> :) Whoof so much high quality info and at the same time a huge amount of
>>
:) Whoof so much high quality info and at the same time a huge amount of
useless data, splogs and spam.
/M
On Mon, Sep 8, 2008 at 7:38 PM, Petite Abeille <[EMAIL PROTECTED]> wrote:
>
> On Sep 8, 2008, at 6:43 AM, Marcus Herou wrote:
>
> the ShardedSolrDocumentIndexer will be
Hi guys.
I'm glad to announce that I finally managed to move this package out of the
company code and release it to the open-source community. The package contains some
neat classes which we use, for instance, when indexing and searching through
several hundred thousand blogs and the ShardedSolrDocumentIndexer
ion e)
>> {
>>     e.printStackTrace();
>>     error = true;
>> }
>> .
>> if (reader != null)
>>     reader.close();
>> if (searcher != null)
>>     searcher.close();
>>
>> BR,
>> Shawn
>>
>>
-------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>
--
Marcus Herou CTO and co-founder Tailsweep AB
+46702561312
[EMAIL PROTECTED]
http://www.tailsweep.com/
http://blogg.tailsweep.com/
Ah well some test classes could be appropriate:
http://dev.tailsweep.com/svn/abstractcache/trunk/src/test/java/org/tailsweep/abstractcache/test/
On Sun, Aug 3, 2008 at 5:07 PM, Marcus Herou <[EMAIL PROTECTED]> wrote:
> And for the heck of it I implemented a berkeleydb "java.util.Map
>>> > I am in the designing phase and I have time to explore and prototype any other
>>> > products. Please do suggest me a good one.
>>> >
>>> > Regards
>>> > Ganesh
>>> >
>>> >
>>> >
>>>
>>>
>
> > > I am in the designing phase and I have time to explore and prototype any other
> > > products. Please do suggest me a good one.
> > >
> > > Regards
> > > Ganesh
> > >
> > >
> > >
> >
> > --
> > View this message in context:
> >
> http://www.nabble.com/Using-lucene-as-a-database...-good-idea-or-bad-idea--tp18703473p18754258.html
> > Sent from the Lucene - Java Users mailing list archive at Nabble.com.
> >
> >
> > -
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]
> >
> >
>
--
Marcus Herou CTO and co-founder Tailsweep AB
+46702561312
[EMAIL PROTECTED]
http://www.tailsweep.com/
http://blogg.tailsweep.com/
k each). We
> > actually would NOT be expecting/needing Lucene's normal extremely fast text
> > search times for this, but we would need reasonable times for adding new
> > documents to the index, retrieving documents by ID (for iterating over all
> > docum
Cool.
I'll do since this is a field which I can spend time in.
Kindly
//Marcus
On 11/5/07, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
>
>
> On Nov 5, 2007, at 7:49 AM, Marcus Herou wrote:
>
> > Thanks. They seem to have got real far in the dev cycle on this.
>
e/SOLR-236
>
> -Grant
>
> On Nov 5, 2007, at 12:57 AM, Marcus Herou wrote:
>
> > Hi.
> >
> > I have a situation where I'm searching amongst some 100K feeds and only want
> > one result per site in return. I have developed a really simple
>
. Does Lucene
> > provide
> > some sort of API to handle the above scenarios?
> >
> > Regards,
> > Sandeep.
>
> --
> Grant Ingersoll
> http://lucene.grantingersoll.com
>
> Lucene Boot Camp Training:
> ApacheCon Atla
ailing list archive at Nabble.com.
>
>
> -
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>
--
Marcus Herou Solution Architect & Core Java developer Tailsweep AB
+46702561312
[EMAIL PROTECTED]
http://www.tailsweep.com
2.
Anyone?
Kindly
//Marcus
--
Marcus Herou Solution Architect & Core Java developer Tailsweep AB
+46702561312
[EMAIL PROTECTED]
http://www.tailsweep.com