On Nov 17, 2009, at 2:48 AM, Jay Hill wrote:
The replication admin page on slaves used to have an auto-reload set to
reload every few seconds. In the official 1.4 release this doesn't seem to
be working, but it does in a nightly build from early June. Was this changed
on purpose or is this
If an index fits in memory, I am guessing you'll see the speed change roughly
proportionally to the size of the index. If an index does not fit into memory
(i.e. disk head has to run around the disk to look for info), then the
improvement will be even greater. I haven't explicitly tested this
Not that I know. It's not in contrib, but if you apply that patch from
http://wiki.apache.org/solr/SpatialSearch I am guessing it puts things in
contrib/spatial.
Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
The replication admin page on slaves used to have an auto-reload set to
reload every few seconds. In the official 1.4 release this doesn't seem to
be working, but it does in a nightly build from early June. Was this changed
on purpose or is this a bug? I looked through CHANGES.txt to see if anything
I'm planning out a system with large indexes and wondering what kind
of performance boost I'd see if I split out documents into many cores
rather than using a single core and splitting by a field. I've got about
500GB worth of indexes ranging from 100MB to 50GB each.
I'm assuming if we split
Sorry, I did not answer the question. Yes, that's right. SolrJ can
only change the documents in the index. It has no power over the
metadata.
On Mon, Nov 16, 2009 at 4:00 PM, yz5od2 wrote:
> thanks, so there is no way to create custom documents/fields via the SolrJ
> client API @ runtime?
thanks, so there is no way to create custom documents/fields via the
SolrJ client API @ runtime?
On Nov 16, 2009, at 4:49 PM, Lance Norskog wrote:
here is no way to create custom documents/fields
via the SolrJ client @ runtime.
If you search a 'date' field you have to give it a correctly UTC-formatted string.
You can copy the date field into a string field, and search
the string. These will have to be pure UTC-formatted strings also. The
<copyField> directive can do this copying for you. The target field
does not have to be
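A schema-level sketch of the copy described above (the field names `timestamp` and `timestamp_s` are made up for illustration):

```xml
<!-- Hypothetical field names: the date field is duplicated into a plain
     string field so it can also be searched as an exact string. -->
<field name="timestamp"   type="date"   indexed="true" stored="true"/>
<field name="timestamp_s" type="string" indexed="true" stored="true"/>

<!-- copyField duplicates the incoming value at index time; the value
     must already be a UTC-formatted string, e.g. 1995-12-31T23:59:59Z -->
<copyField source="timestamp" dest="timestamp_s"/>
```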
Oh well. There is no direct feature for controlling what is copied.
If you use the DataImportHandler, you can include Java plugins or
Javascript/JRuby/Groovy code to do the copying.
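For reference, a minimal data-config.xml sketch of the Javascript route Lance mentions, using DIH's ScriptTransformer (the entity, column, and function names here are all invented):

```xml
<dataConfig>
  <!-- copyDate is a hypothetical per-row Javascript function -->
  <script><![CDATA[
    function copyDate(row) {
      // copy the raw value into a second, string-typed column
      row.put('timestamp_s', row.get('timestamp'));
      return row;
    }
  ]]></script>
  <document>
    <entity name="item" transformer="script:copyDate"
            query="select id, timestamp from items"/>
  </document>
</dataConfig>
```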
On Sun, Nov 15, 2009 at 9:37 PM, Vicky_Dev
wrote:
>
> Thanks for response
>
> Defining the field is not working :(
>
>
Solr includes a feature for wild-carding field names. Look for "*_s"
in schema.xml. You can create all of the fields you want but they need
a specific string pattern in the name. There are no restrictions on
the values.
It is a bad practice to create other indexes by hand and import them.
In fact,
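The wildcard pattern mentioned above looks like this in the stock example schema.xml (the `*_s` declaration ships with Solr; `author_s` below is a made-up field name):

```xml
<!-- Any field whose name ends in "_s" is indexed as a plain string,
     with no per-field declaration needed. -->
<dynamicField name="*_s" type="string" indexed="true" stored="true"/>
```

With that in place, a document can carry a field such as `author_s` without any further schema change.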
Nice to learn a new word for the day!
But to answer your question, or at least part of it, I don't really think
you want a configuration like:

<maxDocs>1</maxDocs>
<maxTime>10</maxTime>

Committing every doc, and every 10 milliseconds? That's just asking for
problems. How about starting with 1000 docs, and five minut
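In solrconfig.xml terms, the suggested starting point would look something like this (five minutes expressed in milliseconds):

```xml
<!-- Commit after 1000 buffered docs or five minutes, whichever
     comes first, instead of after every single document. -->
<autoCommit>
  <maxDocs>1000</maxDocs>
  <maxTime>300000</maxTime> <!-- 5 minutes in ms -->
</autoCommit>
```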
I think it would be useful for members of this list to realize that not
everyone uses the same metrology and terms.
It is very easy for "Americans" to use the imperial system and presume
everyone does the same, for Europeans to use the metric system, etc. Hopefully
members on this list will be persuad
On Mon, Nov 16, 2009 at 5:22 PM, Walter Underwood wrote:
> Probably "lakh": 100,000.
>
> So, 900k qpd and 3M docs.
>
> http://en.wikipedia.org/wiki/Lakh
>
> wunder
>
> On Nov 16, 2009, at 2:17 PM, Otis Gospodnetic wrote:
>
> > Hi,
> >
> > Your autoCommit settings are very aggressive. I'm guessing
Hi,
Lakh or Lac - 100,000
Crore - 1,00,00,000 (ten million)
Commonly used in India
Sincerely,
Sithu D Sudarsan
-----Original Message-----
From: Walter Underwood [mailto:wun...@wunderwood.org]
Sent: Monday, November 16, 2009 5:22 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr - Loa
Probably "lakh": 100,000.
So, 900k qpd and 3M docs.
http://en.wikipedia.org/wiki/Lakh
wunder
On Nov 16, 2009, at 2:17 PM, Otis Gospodnetic wrote:
> Hi,
>
> Your autoCommit settings are very aggressive. I'm guessing that's what's
> causing the CPU load.
>
> btw. what is "laks"?
>
> Otis
>
Hi,
Your autoCommit settings are very aggressive. I'm guessing that's what's
causing the CPU load.
btw. what is "laks"?
Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
- Original Message
> From: kal
I'd have to verify this to be sure, but I *believe* deleted docs data is
expunged during index segment merges.
See
https://issues.apache.org/jira/browse/SOLR-1275
Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
Hi Erik,
I didn't look at the source code, and I think the javadoc for SUSS doesn't
mention it, but I am under the impression that the number of threads to use
should roughly match the number of CPU cores on the master. The
maxWarmingSearchers setting should only be relevant to slaves, not masters, no
On Mon, Nov 16, 2009 at 2:49 PM, Pablo Ferrari wrote:
> Hello,
>
> I have an already working Solr service based on full imports, connected via
> PHP to a Zend Framework MVC (I connect it directly to the Controller).
> I use the SolrClient class for php which is great:
> http://www.php.net/manual/en
Hello,
I have an already working Solr service based on full imports, connected via
PHP to a Zend Framework MVC (I connect it directly to the Controller).
I use the SolrClient class for php which is great:
http://www.php.net/manual/en/class.solrclient.php
From now on, every time I want to edit a doc
There is a "text_rev" field type in the example schema.xml file in the
official release of 1.4. It uses the ReversedWildcardFilterFactory to reverse
a field. You can do a copyField from the field you want to use for leading
wildcard searches to a field using the text_rev type, and then do a regular
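A schema sketch of that setup (the `content` field name is invented; the `text_rev` type is the one shipped in the 1.4 example schema):

```xml
<!-- Leading-wildcard searches go against the reversed copy. -->
<field name="content"     type="text"     indexed="true" stored="true"/>
<field name="content_rev" type="text_rev" indexed="true" stored="false"/>
<copyField source="content" dest="content_rev"/>
```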
On Mon, 2009-11-02 at 19:49 -0500, Paul Tomblin wrote:
> Here's what I'm thinking
>
> final static int MAX_ROWS = 100;
> int start = 0;
> query.setRows(MAX_ROWS);
> while (true)
> {
>QueryResponse resp = solrChunkServer.query(query);
>SolrDocumentList docs = resp.getResults();
>if (doc
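The quoted loop is cut off above; its usual shape can be sketched in plain Java, with the Solr call stubbed out by a list so the paging and termination logic is visible and runnable (the `fetch` helper below stands in for `solrChunkServer.query(query)` with `query.setStart(start)` / `query.setRows(MAX_ROWS)`):

```java
import java.util.ArrayList;
import java.util.List;

public class PagingSketch {
    static final int MAX_ROWS = 100;

    // Stand-in for the Solr query: return up to MAX_ROWS items from 'start'.
    static List<Integer> fetch(List<Integer> all, int start) {
        if (start >= all.size()) return new ArrayList<>();
        int end = Math.min(start + MAX_ROWS, all.size());
        return new ArrayList<>(all.subList(start, end));
    }

    public static void main(String[] args) {
        List<Integer> all = new ArrayList<>();
        for (int i = 0; i < 250; i++) all.add(i);

        int start = 0, seen = 0;
        while (true) {
            List<Integer> docs = fetch(all, start);
            seen += docs.size();
            // A short page means there are no more results to fetch.
            if (docs.size() < MAX_ROWS) break;
            start += MAX_ROWS; // advance to the next page
        }
        System.out.println(seen); // prints 250
    }
}
```

The stop condition (page shorter than MAX_ROWS) saves one extra round trip compared with looping until an empty page comes back.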
My application updates the master index frequently, sometimes very frequently.
Is there a good rule of thumb for configuring:
1) maxWarmingSearchers in the master
2) the SUSS thread pool size (and perhaps queue length) to match the server
settings?
Hi,
I have added a deleted field in my database, and am using the
Dataimporthandler to add rows to the index...
I am using solr 1.4
I have added the deleted field to the query and the RegexTransformer...
and the field definition below
When I run the deltaImport command... I see the below o
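For what it's worth, the usual way to propagate database-side deletes through DIH is the deletedPkQuery attribute on the entity; a minimal data-config.xml sketch (the table, column, and entity names are invented):

```xml
<entity name="item" pk="id"
        query="select id, title, deleted from items"
        deltaQuery="select id from items
                    where last_modified &gt; '${dataimporter.last_index_time}'"
        deltaImportQuery="select id, title from items
                          where id = '${dataimporter.delta.id}'"
        deletedPkQuery="select id from items where deleted = 1"/>
```

During a delta-import, rows returned by deletedPkQuery are removed from the index rather than re-added.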
Localsolr is not in contrib yet. I am interested in knowing whether
currently there is a better solution for setting up a local search.
Cheers.
On Sun, Nov 15, 2009 at 9:25 PM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:
> Nota bene:
> My understanding is the external versions of Loc
Hi,
I had working index time boosting on documents like so:
Everything was great until I made some changes that I thought were not
related to the doc boost, but after that my doc boosting appears to be
missing.
I'm having a tough time debugging this and didn't have the sense to version
control t
William Pierce wrote:
> Folks:
>
> For those of you experienced linux-solr hands, I am seeking recommendations
> for which file system you think would work best with Solr. We are currently
> running with Ubuntu 9.04 on an amazon ec2 instance. The default file system
> I think is ext3.
>
>
On Fri, Nov 13, 2009 at 11:45 PM, Lance Norskog wrote:
> I would go with polling Solr to find what is not yet there. In
> production, it is better to assume that things will break, and have
> backstop janitors that fix them. And then test those janitors
> regularly.
Good idea, Lance. I certainly
On Fri, Nov 13, 2009 at 11:02 PM, Otis Gospodnetic
wrote:
> So I think the question is really:
> "If I stop the servlet container, does Solr issue a commit in the shutdown
> hook in order to ensure all buffered docs are persisted to disk before the
> JVM exits".
Exactly right, Otis.
> I don't
Folks:
For those of you experienced linux-solr hands, I am seeking recommendations
for which file system you think would work best with solr. We are currently
running with Ubuntu 9.04 on an amazon ec2 instance. The default file system I
think is ext3.
I am, of course, seeking t
On Fri, Nov 13, 2009 at 4:09 PM, Chris Hostetter
wrote:
> please don't kill -9 ... it's grossly overkill, and doesn't give your
[ ... snip ... ]
> Alternately, you could take advantage of the "enabled" feature from your
> client (just have it test the enabled url ever N updates or so) and when
> i
Otis Gospodnetic wrote on 11/13/2009 11:15:43
PM:
> Let's take a step back. Why do you need to optimize? You said: "As
> long as I'm not optimizing, search and indexing times are
satisfactory." :)
>
> You don't need to optimize just because you are continuously adding
> and deleting documents
On Mon, Nov 16, 2009 at 5:55 PM, Mauricio Scheffer
wrote:
> Yep, I think I mostly nailed the unmarshalling. Need more tests though. And
> then integrate it to SolrNet.
> Is there any way (or are there any plans) to have an update handler that
> accepts javabin?
There is already one. Look at Binar
On Mon, Nov 16, 2009 at 6:25 PM, amitj wrote:
>
> Is there also a way we can include some kind of annotation on the schema
> field and send the data retrieved for that field to an external application.
> We have a requirement where we require some data fields (out of the fields
> for an entity def
Hi All.
My Solr server box's CPU utilization swings between 60 and 90%, and
sometimes Solr goes down and we restart it manually.
No. of documents in Solr: 30 laks.
No. of add/update requests to Solr: 30 thousand/day. On average, every 30
minutes there are around 500 writes.
No. of search re
> By that I mean that the java/tomcat
> process just disappears.
I had a similar problem when I started Tomcat via SSH and then improperly
closed SSH without the "exit" command.
In some cases (OutOfMemory) there is not enough memory to generate a log (or the
CPU can be overloaded by the Garbage Collector to su
We'd like to share with the solr users a recent news item from http://sesat.no
Sesam has spent some three months migrating all its indexes from FAST to
Solr+Lucene.
It was a joyful experience and allowed us to implement a number of improvements
we never could under FAST.
We've written a review
Hi,
I'm a newbie using Solr and I'd like to run some tests against our data set. I
have successfully tested Solr + Cell using the standard HTTP Solr server,
and now we need to test the Embedded solution, but when I try to start the
embedded server I get this exception:
INFO: registering core:
Exception
Is there also a way we can include some kind of annotation on the schema
field and send the data retrieved for that field to an external application.
We have a requirement where we require some data fields (out of the fields
for an entity defined in data-config.xml) to act as entities for entity
e
Thank you for your reply.
I had assumed Tika could also extract text content from various
document types instead of only metadata. I'll use the CLI tools from
http://www.foolabs.com/xpdf/ to extract text manually.
-
Markus Jelsma Buyways B.V.
Technisch Architect
Yep, I think I mostly nailed the unmarshalling. Need more tests though. And
then integrate it to SolrNet.
Is there any way (or are there any plans) to have an update handler that
accepts javabin?
2009/11/16 Noble Paul നോബിള് नोब्ळ्
> start with a JavabinDecoder only so that the class is simple
Hi,
the problem you've described -- an integration of DataImportHandler (to
traverse the XML file and get the document urls) and Solr Cell (to
extract content afterwards) -- is already addressed in issue SOLR-1358
(https://issues.apache.org/jira/browse/SOLR-1358).
Best,
Sascha
Kerwin wrote:
What I'm trying to say is that if you want to index a PDF, then you should
use a PDF extractor. A PDF extractor is able to extract the text content and
the metadata of the files. I suppose you have just opened and indexed the
PDF as-is, so you stored binary data and stopped. For my application I've u
Anyone have a clue?
> List,
>
>
> I somehow fail to index certain PDF files using the
> ExtractingRequestHandler in Solr 1.4 with the default solrconfig.xml but a
> modified schema. I have a very simple schema for this case using only
> an ID field, a timestamp field and two dynamic fields; ignored_
Hi,
I am new to this forum and would like to know if the function described
below has been developed or exists in Solr. If it does not exist, is it a
good idea and can I contribute?
We need to index multiple documents with different formats, so we use Solr
with Tika (Solr Cell).
Question:
Can yo