query.setRows(Integer.MAX_VALUE);
Cheers
Avlesh
On Thu, Jul 23, 2009 at 8:15 AM, shb suh...@gmail.com wrote:
When I use
SolrQuery query = new SolrQuery();
query.set("q", "issn:0002-9505");
query.setRows(10);
QueryResponse response = server.query(query);
if I use query.setRows(Integer.MAX_VALUE);
the query will become very slow, because the searcher will go
to fetch the field values from the index for all the returned
documents.
So if I set query.setRows(10), is there any other ways to
get all the ids? thanks
2009/7/23 Avlesh Singh avl...@gmail.com
Have you tried limiting the fields that you're requesting to just the
ID?
Something along the line of:
query.setRows(Integer.MAX_VALUE);
query.setFields("id");
Might speed the query up a little.
On 23 Jul 2009, at 09:11, shb wrote:
Here id is indeed the uniqueKey of a document.
I want to get
I have tried the following code:
query.setRows(Integer.MAX_VALUE);
query.setFields("id");
When it returns 1,000,000 records, it takes about 22s.
This is very slow. Is there any other way?
2009/7/23 Toby Cole toby.c...@semantico.com
Have you tried limiting the fields that you're requesting
Hi,
I am new to Solr and I want to get a quick hint if it is suitable for
what we want to use it for.
We are building an e-mail platform and we want to provide our users with
full-text search functionality.
We are not willing to use a single index file for all users, as we want
to be able to migrate
On Thu, Jul 23, 2009 at 3:06 PM, Łukasz Osipiuk luk...@osipiuk.net wrote:
I am new to Solr and I want to get a quick hint if it is suitable for
what we want to use it for.
We are building an e-mail platform and we want to provide our users with
full-text search functionality.
We are not
On Tue, 21 Jul 2009 14:25:52 +0200, Anders Melchiorsen wrote:
On Fri, 17 Jul 2009 16:04:24 +0200, Anders Melchiorsen wrote:
However, in the normal highlighter, I am using usePhraseHighlighter and
highlightMultiTerm and it seems that there is no way to turn these on in
Chantal,
You might consider LuSql[1].
It has much better performance than Solr DIH. It runs 4-10 times faster on a
multicore machine, and can run in 1/20th the heap size Solr needs. It
produces a Lucene index.
See slides 22-25 in this presentation comparing Solr DIH with LuSql:
https://issues.apache.org/jira/browse/SOLR-920
Where would the shared schema.xml be located (same as solr.xml?), and how
would dynamic schema play into this? Would each core's dynamic schema still be
independent?
shareSchema checks whether a schema.xml with the same file path and
timestamp is already loaded; if so, the existing object is reused.
All cores that load the same file share a single schema object.
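For reference, shareSchema is switched on in solr.xml on the cores element; a minimal sketch (core names and instanceDirs illustrative, assuming both cores point at the same schema.xml):

```xml
<solr persistent="true">
  <cores adminPath="/admin/cores" shareSchema="true">
    <!-- both instanceDirs contain (or link to) the same conf/schema.xml,
         so a single schema object is loaded and shared -->
    <core name="core0" instanceDir="core0"/>
    <core name="core1" instanceDir="core1"/>
  </cores>
</solr>
```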
On Thu, Jul 23, 2009 at 3:32 PM, Brian Klippel br...@theport.com wrote:
https://issues.apache.org/jira/browse/SOLR-920
and how would dynamic schema play into this? Would each core's dynamic
schema still be independent?
I guess you mean dynamic fields. If so, then yes, you will still be
On Thu, Jul 23, 2009 at 4:30 PM, Łukasz Osipiuk luk...@osipiuk.net wrote:
See https://issues.apache.org/jira/browse/SOLR-1293
We're planning to put up a patch soon. Perhaps we can collaborate?
What is your estimate for having these patches ready? We have quite
tight deadlines,
and
Hi Paul, hi Glen, hi all,
thank you for your answers.
I have followed Paul's solution (as I received it earlier). (I'll keep
your suggestion in mind, though, Glen.)
It looks good, except that it's not creating any documents... ;-)
It is most probably some misunderstanding on my side, and
Hi,
I am new to Solr and need help with the following use case:
I want to provide faceted browsing. For a given product, there are multiple
descriptions (feeds, the description being 100-1500 words) that my
application gets. I want to check for the presence of a fixed number of
terms or attributes
Try out this with SolrJ
SolrQuery query = new SolrQuery();
query.setQuery(q);
// query.setQueryType("dismax");
query.setFacet(true);
query.addFacetField("id");
query.addFacetField("text");
query.setFacetMinCount(2);
On Thu, Jul 23, 2009 at 5:12 PM, Nishant Chandra
Is there a uniqueKey in your schema? Are you returning a value
corresponding to that key name?
probably you can paste the whole data-config.xml
On Thu, Jul 23, 2009 at 4:59 PM, Chantal Ackermann
chantal.ackerm...@btelligent.de wrote:
Hi Paul, hi Glen, hi all,
thank you for your answers.
I
Hi Paul,
no, I didn't return the unique key, though there is one defined. I added
that to the nextRow() implementation, and I am now returning it as part
of the map.
But it is still not creating any documents, and now that I can see the
ID I have realized that it is always processing the
Note that the statement about LuSql (or really any other tool, LuSql is just an
example because it was mentioned) is true only if Solr is underutilized because
DIH uses a single thread to talk to Solr (is this correct?) vs. LuSql using
multiple (I'm guessing that's the case because of the
You could pull the IDs directly from the Lucene index; that may be a little
faster.
You can also use Lucene's TermEnum to get to this.
And you should make sure that the id field is the first field in your documents
(when you index them).
But no matter what you do, this will not be subsecond for
Hallo...
I have a problem...
I want to sort a field.
At the moment the field type is text, but I have tested it with string and
date too.
The content of the field looks like 22.07.09; it is a date.
When I sort, I get:
failed to open stream: HTTP request failed! HTTP/1.1 500
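Sorting a text field that holds a dd.MM.yy string will not order dates correctly; the usual fix is to convert the value into Solr's ISO 8601 date format at index time and use a date field type. A minimal, self-contained sketch of that conversion (class and method names are illustrative, not from the thread):

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class DateFieldConverter {

    // Turn a "22.07.09" style value into the format Solr's date
    // fields expect, e.g. "2009-07-22T00:00:00Z".
    public static String toSolrDate(String raw) throws ParseException {
        SimpleDateFormat in = new SimpleDateFormat("dd.MM.yy");
        in.setTimeZone(TimeZone.getTimeZone("UTC"));
        SimpleDateFormat out = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'");
        out.setTimeZone(TimeZone.getTimeZone("UTC"));
        Date d = in.parse(raw);
        return out.format(d);
    }

    public static void main(String[] args) throws ParseException {
        System.out.println(toSolrDate("22.07.09")); // 2009-07-22T00:00:00Z
    }
}
```

With the value indexed into a date-typed field, sort=fieldname asc works without the 500 error.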
On Jul 23, 2009, at 11:03 AM, Jörg Agatz wrote:
Hallo...
I have a problem...
i want to sort a field
at the Moment the field type is text, but i have test it with string or date
the content of the field looks like 22.07.09 it is a Date.
when i sort, i get :
failed to open stream: HTTP
Rather than trying to get all document id's in one call to Solr,
consider paging through the results. Set rows=1000 or probably
larger, then check the numFound and continue making requests to Solr
incrementing start parameter accordingly until done.
Erik
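Erik's loop is runnable as a sketch below; the index is simulated with a plain counter so the paging arithmetic can be executed here, and the SolrJ calls each step stands in for are shown in comments (server and query as in the earlier snippets):

```java
import java.util.ArrayList;
import java.util.List;

public class PagingSketch {
    public static void main(String[] args) {
        // Stand-in for the index; with SolrJ, numFound comes from
        // rsp.getResults().getNumFound() on each response.
        int numFound = 2500;
        int rows = 1000;                  // query.setRows(rows);
        List<Integer> ids = new ArrayList<Integer>();
        for (int start = 0; start < numFound; start += rows) {
            // With SolrJ: query.setStart(start);
            //             QueryResponse rsp = server.query(query);
            //             for (SolrDocument doc : rsp.getResults())
            //                 ids.add(doc.getFieldValue("id"));
            int fetched = Math.min(rows, numFound - start);
            for (int i = 0; i < fetched; i++) {
                ids.add(start + i);
            }
        }
        System.out.println(ids.size());   // 2500 -- collected in 3 requests
    }
}
```

Each request stays small, so no single response has to materialize a million stored documents.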
On Jul 23, 2009, at 5:35
Hi Otis,
Yes, you are right: LuSql is heavily optimized for multi-thread/multi-core.
It also performs better on a single core with multiple threads, due to
the heavily I/O-bound nature of Lucene indexing.
So if the DB is the bottleneck, well, yes, then LuSql and any other
tool are not going
I want to exclude a very small number of terms which will be different for
each query. So I think my best bet is to use localParam.
Bill
On Wed, Jul 22, 2009 at 4:16 PM, Chris Hostetter
hossman_luc...@fucit.orgwrote:
: I am faceting based on the indexed terms of a field by using facet.field.
Given it is a small number of terms, it seems like just excluding them
from use/visibility on the client would be reasonable.
Erik
On Jul 23, 2009, at 11:43 AM, Bill Au wrote:
I want to exclude a very small number of terms which will be
different for
each query. So I think my best
That's actually what we have been doing. I was just wondering if there is
any way to move this work from the client back into Solr.
Bill
On Thu, Jul 23, 2009 at 11:47 AM, Erik Hatcher
e...@ehatchersolutions.comwrote:
Given it is a small number of terms, it seems like just excluding them from
And if I may add another thing - if you are using Solr in this fashion, have a
look at your caches, esp. document cache. If your queries of this type are
repeated, you may benefit from large cache. Or, if they are not, you may
completely disable some caches.
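For reference, the document cache lives in solrconfig.xml; a typical entry looks like this (sizes are illustrative, tune to your query patterns, or remove the element to disable the cache):

```xml
<!-- solrconfig.xml: enlarge the document cache for repeated
     large-result queries; autowarmCount stays 0 because document
     cache entries cannot be autowarmed -->
<documentCache class="solr.LRUCache"
               size="16384"
               initialSize="4096"
               autowarmCount="0"/>
```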
Otis
--
Sematext is hiring:
Thanks for the response, Eric.
We have seen that the size of the index has a direct impact on search
speed, especially when the index size is in GBs, so we are trying all
possible ways to keep the index size as low as we can.
We thought solr.ExternalFileField type would help to keep the index
I'm not sure if there is a lot of benefit from storing the literal values in
that external file vs. directly in the index. There are a number of things one
should look at first, as far as performance is concerned - JVM settings, cache
sizes, analysis, etc.
For example, I have one index here
Hi,
I noticed that the backup request
http://master_host:port/solr/replication?command=backup
works only if there are committed index data, i.e.
core.getDeletionPolicy().getLatestCommit() is not null. Otherwise, no backup
is created. It sounds
On Jul 21, 2009, at 11:57 AM, JCodina wrote:
Hello, Grant,
there are two ways to implement this: one is payloads, and the
other one is
multiple tokens at the same position.
Each of them can be useful; let me explain the way I think they can
be used.
Payloads: every token has extra
I'm trying to do some filtering on the facet count list returned by Solr
when doing a faceting query.
I'm wondering how I can use facet.prefix to get something like this:
Query
facet.field=foo&facet.prefix=A OR B
Response
<lst name="facet_fields">
  <lst name="foo">
    <int name="A">12560</int>
    <int
: Here id is indeed the uniqueKey of a document.
: I want to get all the ids for some other useage.
http://people.apache.org/~hossman/#xyproblem
XY Problem
Your question appears to be an XY Problem ... that is: you are dealing
with X, you are assuming Y will help you, and you are asking about
Another option is to make backups more directly, without using the Solr backup
mechanism.
Check the green link on http://www.manning.com/hatcher3/
Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
- Original
Thanks for the quick response, Otis.
We have been able to achieve a ratio of 2 with different settings.
However, considering the huge volume of data that we need to deal
with - 600 GB of data per day, kept in the index
for 3 days - we're looking at all possible
Jibo,
Well, there is always field compression, which lets you trade the index
size/disk space for extra CPU time and thus some increase in indexing and
search latency.
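In the 1.x schema, field compression is the compressed attribute on a stored field (field name and threshold illustrative; it applies to string/text field types):

```xml
<!-- schema.xml: store the large description field compressed;
     values shorter than compressThreshold bytes are left as-is -->
<field name="description" type="text" indexed="true" stored="true"
       compressed="true" compressThreshold="500"/>
```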
Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP,
Found my own answer, use the literal parameter. Should have dug
around before asking. Sorry.
Thanks,
Matt Weber
eSr Technologies
http://www.esr-technologies.com
On Jul 23, 2009, at 2:26 PM, Matt Weber wrote:
Is it possible to supply addition metadata along with the binary
file when
Hi Ryan,
Thanks for the information.
Is this expected to be implemented?
Regards,
--
Daniel Cassiano
_
http://www.apontador.com.br/
http://www.maplink.com.br/
On Wed, Jul 22, 2009 at 10:08 PM, Ryan McKinley ryan...@gmail.com wrote:
ya... 'expected', but perhaps
Hi,
I'm attempting to setup a simple joined index of some tables with the following
structure...
EMPLOYEE             ORGANIZATION
--------             ------------
employee_id          organization_id
first_name           organization_name
last_name
edr_party_id
organization_id
When
Sure Otis, and in fact I can narrow it down to just exactly that query,
but with user queries I don't think it is right to throw an exception
out of the phonetic filter factory if the user enters a number. What I am
asking is: am I going to have to filter the user queries for numerics
before using it
Hey I just noticed that this only happens when I enable debug. If
debugQuery=true is on the URL then it goes through the debugging
component and that is throwing this exception. It must be getting an
empty field object from the phonetic filter factory for numbers or
something similar
Actually my first question should be: is this a known bug, or am I doing
something wrong?
The only one thing I can find on this topic is the following statement
on the solr-dev group when discussing adding the maxCodeLength, see
point two below:
Ryan McKinley updated SOLR-813:
I've downloaded solr-2009-07-21.tgz and followed the instructions at http://drupal.org/node/343467
including retrieving the solrconfig.xml and schema.xml files from the Drupal apachesolr module.
The server seems to start properly with the original solrconfig.xml and
schema.xml files
When I
I think the problem is CharStreamAwareWhitespaceTokenizerFactory, which used to
live in Solr (when Drupal schema.xml for Solr was made), but has since moved to
Lucene. I'm half guessing. :)
Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta,