Thanks Hoss,
I probably did not formulate the question properly, but you gave me an answer.
I already do this in a SearchComponent; I just wanted to centralise this
control of the depth and width of the response in a single place in
the code [style={minimal, verbose, full...}].
It just sounds logical t
Transformer is great for augmenting Documents before shipping them to the response,
but what would be a way to prevent a document from being delivered?
I have some search components that draw some conclusions after the search
(duplicate removal, clustering) and one Augmenter (Solr Transformer)
to shape the response
I have been a Lucene user since the project started, even before
Solr came to life, many, many years ago. I have always used the trunk
version for pretty big customers and *never* experienced any serious
problems. The worst thing that can happen is to notice a bug somewhere,
and if you have some rea
Hmm, looks like you are facing exactly the phenomenon I asked about.
See my question here:
http://comments.gmane.org/gmane.comp.jakarta.lucene.solr.user/61326
On Sun, Mar 4, 2012 at 9:24 PM, Markus Jelsma
wrote:
> Hi,
>
> With auto-committing disabled we can now index many millions of documents in
that gets
> flushed depending on the requests coming through and the buffer size.
>
> - Mark Miller
> lucidimagination.com
>
> On Feb 28, 2012, at 3:38 AM, eks dev wrote:
>
>> SolrCloud is going to be great, NRT feature is really huge step
>> forward, as well as centra
SolrCloud is going to be great; the NRT feature is a really huge step
forward, as well as central configuration, elasticity ...
The only thing I do not yet understand is treatment of cases that were
traditionally covered by Master/Slave setup. Batch update
If I get it right (?), updates to replicas are
thin.
On Thu, Feb 23, 2012 at 8:47 AM, eks dev wrote:
> thanks Mark, I will give it a go and report back...
>
> On Thu, Feb 23, 2012 at 1:31 AM, Mark Miller wrote:
>> Looks like an issue around replication IndexWriter reboot, soft commits and
>> hard commits.
>>
>>
te our commit point to the right dir
> solrCore.getUpdateHandler().commit(new CommitUpdateCommand(req, false));
>
> That should allow the searcher that the following commit command prompts to
> see the *new* IndexWriter.
>
> On Feb 22, 2012, at 10:56 AM, eks dev wrote:
>
>> W
out of curiosity, trying to see if new cloud features can replace what
I use now...
how is this (batch) update forwarding solved at cloud level?
imagine simple one shard and one replica case, if I fire up DIH
update, is this going to be replicated to replica shard?
If yes,
- is it going to be sen
Davon, you ought to try updating from many threads (I do not know if
DIH can do it, check it); Lucene does a great job if fed from many
update threads...
It depends on where your time gets lost, but it is usually a) the analysis chain
or b) the database.
If it is a) and your server has spare CPU cores, you
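A minimal sketch of feeding an indexer from several threads, as suggested above. The Indexer interface and the feed helper are hypothetical stand-ins for illustration, not Solr or Lucene API; the real call would go to whatever update client is in use.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ParallelFeeder {
    /** Hypothetical stand-in for whatever actually sends a document to the index. */
    interface Indexer {
        void add(String doc);
    }

    /** Submits every document to a fixed pool so per-document work runs on several cores. */
    static void feed(List<String> docs, Indexer indexer, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (String doc : docs) {
            pool.submit(() -> indexer.add(doc));
        }
        pool.shutdown(); // accept no new tasks, let the queued ones finish
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("indexing did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for indexing", e);
        }
    }

    public static void main(String[] args) {
        AtomicInteger indexed = new AtomicInteger();
        feed(Arrays.asList("doc1", "doc2", "doc3", "doc4"), doc -> indexed.incrementAndGet(), 4);
        System.out.println("indexed " + indexed.get() + " documents");
    }
}
```

The pool size is the knob to experiment with: if the analysis chain is the bottleneck, something close to the number of spare cores is a reasonable starting point.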
We started observing strange failures from ReplicationHandler when we
commit on the master (trunk version, 4-5 days old).
It works sometimes and sometimes not; I didn't dig deeper yet.
Looks like the real culprit hides behind:
org.apache.lucene.store.AlreadyClosedException: this IndexWriter is clos
with the master.
>
> What are you expecting a BeforeCommitListener could do for you, if one
> would exist?
>
> Kind regards,
> Em
>
> Am 21.02.2012 21:10, schrieb eks dev:
>> Thanks Mark,
>> Hmm, I would like to have this information asap, not to wait until the
And drinks on me to those who decoupled implicit commit from close...
this was a tricky trap
On Tue, Feb 21, 2012 at 9:10 PM, eks dev wrote:
> Thanks Mark,
> Hmm, I would like to have this information asap, not to wait until the
> first search gets executed (depends on user) . Is solr
licates can appear) are there any "IndexWriter"
listeners around?
Thanks again,
eks.
On Tue, Feb 21, 2012 at 8:03 PM, Mark Miller wrote:
> Post commit calls are made before a new searcher is opened.
>
> Might be easier to try to hook in with a new searcher listener?
>
Hi all,
I am a bit confused with IndexSearcher refresh lifecycles...
In a master slave setup, I override postCommit listener on slave
(solr trunk version) to read some user information stored in
userCommitData on master
--
@Override
public final void postCommit() {
// This returns "stale"
Thanks Robert,
I've missed LUCENE-3490... Awesome!
On Sun, Dec 11, 2011 at 6:37 PM, Robert Muir wrote:
> On Sun, Dec 11, 2011 at 11:34 AM, eks dev wrote:
>> on the latest trunk, my schema.xml with field type declaration
>> containing //codec="Pulsing"//
On the latest trunk, my schema.xml with a field type declaration
containing //codec="Pulsing"// does not work any more (it throws an
exception from FieldType). It used to work with an approx. month-old
trunk version.
I didn't dig deeper; it may be that the old schema.xml was broken and
worked by accident.
--
Re. "I have little experience with VM servers for search."
We had a huge performance penalty on VMs; the CPU was the bottleneck.
We couldn't freely run measurements to figure out what the problem really
was (hosting was contracted by customer...), but it was something pretty
scary, kind of 8-10 times slowe
Just to bring closure on this one, we were slurping data from the
wrong DB (hardly a desktop-class machine)...
Solr did not cough on 41 million records at 34k updates/sec, single threaded.
Great!
On Sat, Sep 24, 2011 at 9:18 PM, eks dev wrote:
> just looking for hints where to look for...
>
g locally
>
> Out of curiosity, how big is your ramBufferSizeMB and your -Xmx?
> And on that 8-core box you have ~8 indexing threads going?
>
> Otis
>
> Sematext is Hiring -- http://sematext.com/about/jobs.html
>
>
>
>
>>
Just looking for hints on where to look...
We were testing the single-threaded ingest rate on Solr, trunk version, on
an atypical collection (a lot of small documents), and we noticed
something we are not able to explain.
Setup:
We use defaults for index settings, windows 64 bit, jdk 7 U2. on SSD,
machi
Probably a stupid question:
which Directory implementation is best suited for an index
mounted on ramfs/tmpfs? I guess plain old FSDirectory (or mmap/nio?)
Watch out, "running 10 hours" != "idling 10 seconds" and trying again.
Those are different cases.
It is not dropping *used* connections (good to know it works that
well, thanks for reporting!), just not reusing connections that have been
idle for more than 10 seconds
On Fri, Sep 2, 2011 at 10:26 PM, Gora Mohanty
Take care, "running 10 hours" != "idling 10 seconds" and trying again.
Those are different cases.
It is not dropping *used* connections (good to know it works that
well, thanks for reporting!), just not reusing connections that have been
idle for more than 10 seconds
On Fri, Sep 2, 2011 at 10:26 PM, Gora Mohanty
I am not sure if current version has this, but DIH used to reload
connections after some idle time
if (currTime - connLastUsed > CONN_TIME_OUT) {
  synchronized (this) {
    Connection tmpConn = factory.call();
    clos
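A minimal, self-contained sketch of the idle-timeout pattern the fragment above hints at: if the connection has been idle longer than a timeout, open a fresh one and close the old one. This is not the DIH source; the class name IdleReconnect, the injected clock and the generic AutoCloseable connection are all assumptions made for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.LongSupplier;

class IdleReconnect<C extends AutoCloseable> {
    private final Callable<C> factory;  // opens a new connection
    private final long timeoutMillis;   // maximum idle time before reconnecting
    private final LongSupplier clock;   // injected so the behaviour can be tested
    private C conn;
    private long lastUsed;

    IdleReconnect(Callable<C> factory, long timeoutMillis, LongSupplier clock) {
        this.factory = factory;
        this.timeoutMillis = timeoutMillis;
        this.clock = clock;
    }

    /** Returns the current connection, replacing it first if it idled too long. */
    synchronized C get() {
        long now = clock.getAsLong();
        if (conn == null || now - lastUsed > timeoutMillis) {
            C fresh;
            try {
                fresh = factory.call();
            } catch (Exception e) {
                throw new IllegalStateException("could not open connection", e);
            }
            if (conn != null) {
                try {
                    conn.close();
                } catch (Exception ignored) {
                    // the old connection is being discarded anyway
                }
            }
            conn = fresh;
        }
        lastUsed = now;
        return conn;
    }

    public static void main(String[] args) {
        AtomicInteger opened = new AtomicInteger();
        long[] now = {0L};
        IdleReconnect<AutoCloseable> pool = new IdleReconnect<AutoCloseable>(
                () -> { opened.incrementAndGet(); return () -> { }; },
                10_000L, () -> now[0]);
        pool.get();                 // first use opens a connection
        now[0] = 5_000L;
        pool.get();                 // idle 5 s: reused
        now[0] = 20_000L;
        pool.get();                 // idle 15 s: reopened
        System.out.println("connections opened: " + opened.get());
    }
}
```

Note that a connection in constant use never idles past the timeout, which is the distinction between "running 10 hours" and "idling 10 seconds" made elsewhere in this thread.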
Thinking aloud and grateful for sparring...
I need to support a high commit rate (low update latency) in a master
slave setup and I have a bad feeling about it, even with disabling
warmup and stripping down everything that slows down refresh.
I will try it anyway, but I started thinking about "back
ne to do, but I really do not
know simple and fast way...
cheers,
eks
On Sat, Aug 6, 2011 at 8:32 PM, Shawn Heisey wrote:
> On 8/6/2011 8:49 AM, eks dev wrote:
>>
>> I would appreciate some clarifications about DIH
>>
>> I do not have reliable timestamp, but I do have ato
I would appreciate some clarifications about DIH.
I do not have a reliable timestamp, but I do have an atomic sequence that
only grows on inserts/changes.
You can understand it as a timestamp in some funky timezone not
related to wall clock time; it is an integer type.
Is DIH keeping track of the MAX(comm
hey have yet to
> bump that to trunk/4.x; it was only recently updated to 3.2.
>
> On Aug 2, 2011, at 5:26 PM, eks dev wrote:
>
>> Well, Lucid released "LucidWorks Enterprise"
>> with " Complete Apache Solr 4.x Release Integrated and tested with
>> po
Well, Lucid released "LucidWorks Enterprise"
with "Complete Apache Solr 4.x Release Integrated and tested with
powerful enhancements".
Whatever that means for Solr 4.0
On Tue, Aug 2, 2011 at 11:10 PM, David Smiley (@MITRE.org)
wrote:
> My best guess (and it is just a guess) is between December
On Wed, Jun 29, 2011 at 4:32 PM, eks dev wrote:
>> req.getSearcher().getFirstMatch(t) != -1;
>
> Yep, this is currently the fastest option we have.
>
> -Yonik
> http://www.lucidimagination.com
>
t 2:01 AM, eks dev wrote:
>
>> Quick question,
>> Is there a way with solr to conditionally update document on unique
>> id? Meaning, default, add behavior if id is not already in index and
>> *not to touch index" if already there.
>>
>> Deletes are no
011-06-29 at 09:35 +0200, eks dev wrote:
>> In MMAP, you need to have really smart warm up (MMAP) to beat IO
>> quirks, for RAMDir you need to tune gc(), choose your poison :)
>
> Other alternatives are operating system RAM disks (avoids the GC
> problem) and using SSDs (nearly
Wed, 2011-06-29 at 09:35 +0200, eks dev wrote:
>> In MMAP, you need to have really smart warm up (MMAP) to beat IO
>> quirks, for RAMDir you need to tune gc(), choose your poison :)
>
> Other alternatives are operating system RAM disks (avoids the GC
> problem) and using SSDs (nea
...Using RAMDirectory really does not help performance...
I kind of agree, but in my experience with lucene, there are cases
where RAMDirectory helps a lot, with all its drawbacks (huge heap and
gc() tuning).
We had very good experience with MMAP on average, but moving to
RAMDirectory with prop
Quick question:
is there a way with Solr to conditionally update a document on its unique
id? Meaning the default add behavior if the id is not already in the index, and
*not to touch the index* if it is already there.
Deletes are not important (no sync issues).
I am asking because I noticed with deduplication turned on,
i
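A minimal sketch of the "add only if the id is not there yet" semantics asked about above. AddIfAbsentIndex is a hypothetical in-memory stand-in backed by a map, not Solr API; it only illustrates the intended behavior, where an existing document is left untouched.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class AddIfAbsentIndex {
    // Hypothetical stand-in for the index: unique id -> document body.
    private final Map<String, String> docsById = new ConcurrentHashMap<>();

    /** Adds the document only when the id is unknown; returns true if it was added. */
    boolean addIfAbsent(String id, String doc) {
        return docsById.putIfAbsent(id, doc) == null;
    }

    /** Returns the stored document, or null if the id is unknown. */
    String get(String id) {
        return docsById.get(id);
    }

    public static void main(String[] args) {
        AddIfAbsentIndex index = new AddIfAbsentIndex();
        boolean first = index.addIfAbsent("1", "dummy text 1");
        boolean second = index.addIfAbsent("1", "a later version");
        System.out.println("first add accepted: " + first);
        System.out.println("second add accepted: " + second);
        System.out.println("stored: " + index.get("1"));
    }
}
```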
Your best bet is MMapDirectoryFactory; you can come very close to the
performance of the RAMDirectory. Unfortunately, a
Master_on_disk->Slaves_in_ram type of setup is not possible using
Solr.
We are moving our architecture to solr at the moment, and this is one
of "missings" we have
Thanks Hoss,
Externalizing this part is exactly the path we are exploring now, not
only for this reason.
We already started testing Hadoop SequenceFile as a write-ahead log for
updates/deletes.
SequenceFile supports append now (simply great!). It was a pain to
have to add hadoop into mix for "
Q1. Is it possible to pass *analyzed* content to the
public abstract class Signature {
  public void init(SolrParams nl) { }
  public abstract String calculate(String content);
}
Q2. Method calculate() is using concatenated fields from name,features,cat
Is there any mechanism I could build "fi
Hi,
The use case I am trying to figure out is about preserving IDs without
re-indexing on a duplicate, and instead adding the new ID to a list of
document id "aliases".
Example:
Input collection:
"id":1, "text":"dummy text 1", "signature":"A"
"id":2, "text":"dummy text 1", "signature":"A"
I add the first
If your index is read-only in production, can you add a mapping
unique_id-Lucene docId in your kv store and build filters externally?
That would make the unique key obsolete in your production index, as you would
work at the Lucene doc id level.
That way, you move the problem offline, to the update/optimize phase