On Tue, Nov 4, 2008 at 1:31 AM, Lance Norskog <[EMAIL PROTECTED]> wrote:
> Thank you for the "rootEntity" tip. Does this mean that the inner loop only
> walks the first item and breaks out of the loop? This is very good because it
> allows me to drill down a few levels without downloading 10,000 feeds. (Public
> API sites tend to dislike this behavior :)
I am trying to decide if this is a Solr or a Lucene problem, using Solr 1.3.
Take this example:
(-productName:"whatever") OR (anotherField:"Johnny")
I would expect to get back records that have anotherField=Johnny, but
also any records that don't have 'whatever' as the productName.
However
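(If it helps: in Lucene, a clause that contains only negative terms matches
nothing on its own, so the left half of that OR returns no documents. A common
workaround, sketched here, is to anchor the negation to all documents with *:*

    (*:* -productName:"whatever") OR (anotherField:"Johnny")

which should return the Johnny records plus everything that doesn't have
'whatever' as the productName.)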
From the data-config.xml it is obvious that your indexing will
take a lot of time. MySQL has very poor join performance. It is not a
very good idea to run this on a production database.
I would suggest you configure another MySQL server, set up MySQL
replication to it, and run the import f
The attribute name is batchSize="-1" (it is case sensitive). This
ensures that the MySQL driver fetches rows one by one.
http://wiki.apache.org/solr/DataImportHandlerFaq
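For reference, a minimal data-config.xml sketch (driver, url and credentials
here are placeholders, not your actual values):

    <dataSource type="JdbcDataSource"
                driver="com.mysql.jdbc.Driver"
                url="jdbc:mysql://localhost/mydb"
                user="..." password="..."
                batchSize="-1"/>

With batchSize="-1" the JdbcDataSource asks the MySQL driver to stream results
row by row instead of buffering the whole result set in memory.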
On Mon, Nov 3, 2008 at 9:17 PM, sunnyfr <[EMAIL PROTECTED]> wrote:
>
> Hi Shalin,
> *
> I would like to know if you just used batchsize
The logistics of handling giant index files hit us before search
performance did. We switched to a set of indexes running inside one server
(Tomcat) instance with the Multicore + Distributed Search tools, with a frozen
old index and a new index actively taking updates. The smaller new index
takes much le
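For anyone curious, querying across the two cores is just the normal
distributed-search request; a sketch, with hypothetical core names core_old
and core_new:

    http://localhost:8983/solr/core_new/select?q=foo&shards=localhost:8983/solr/core_old,localhost:8983/solr/core_new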
Hi,
You could look at the scoring explanation with &debugQuery=true, and I think
you'd see that this is because of the TF (term frequency) for the terms "blues"
and "brothers". You can think/visualize this as "two for two" for that first hit:
the field has 2 terms and both of them match your search terms.
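For example, using the content0 field from the query in this thread:

    http://localhost:8983/solr/select?q=content0:(blues brothers)&debugQuery=true

The "explain" section of the response shows the tf, idf and fieldNorm
contributions for each hit.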
Chris Hostetter wrote:
: > I'm not sure if there's any reason for solr-core to declare a maven
: > dependency on solr-solrj.
: When creating the POMs, I had (incorrectly) assumed that the core jar does
: not contain SolrJ classes, hence the dependency.
I consider it a totally justifiable assumption.
If you never execute any queries, a gig should be more than enough.
Of course, I've never played around with a 0.8-billion-doc corpus on
one machine.
-Mike
On 3-Nov-08, at 2:16 PM, Alok Dhir wrote:
in terms of RAM -- how to size that on the indexer?
---
Alok K. Dhir
Symplicity Corporation
We have one field that is a simple text field, not multivalued:
content0
We are populating music, artist, song, etc. in one string.
content0:(blues brothers)
Returns (default descending score order):
BluesBrothers01.mp3
Breaux_Brothers_Tiger_Rag_Blues.mp3
Blues Brothers - Theme From Rawhide V1.mp3
That depends largely on your ramBufferSizeMB setting in solrconfig.xml and the
memory you are willing to give to the JVM via -Xmx.
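Roughly, that means two knobs; a sketch with arbitrary values:

    <!-- solrconfig.xml, in the <indexDefaults>/<mainIndex> section -->
    <ramBufferSizeMB>64</ramBufferSizeMB>

    # JVM heap for the indexing instance
    java -Xmx1024m -jar start.jar

ramBufferSizeMB bounds how much Lucene buffers before flushing a segment;
-Xmx has to leave room for that buffer plus everything else in the JVM.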
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: Alok Dhir <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.o
in terms of RAM -- how to size that on the indexer?
---
Alok K. Dhir
Symplicity Corporation
www.symplicity.com
(703) 351-0200 x 8080
[EMAIL PROTECTED]
On Nov 3, 2008, at 4:07 PM, Walter Underwood wrote:
The indexing box can be much smaller, especially in terms of CPU.
It just needs one fast thread and enough disk.
Seeking a SOLR consultant..
We have a working model of a web-based search engine that uses Solr/Java/Apache.
However, the relevance isn't exactly what we would like...
The system was built by a contractor who is no longer available for work, and we
have tried to hack around but would prefer to hire someone w
The indexing box can be much smaller, especially in terms of CPU.
It just needs one fast thread and enough disk.
wunder
On 11/3/08 2:58 PM, "Alok Dhir" <[EMAIL PROTECTED]> wrote:
> I was afraid of that. Was hoping not to need another big fat box like
> this one...
>
> ---
> Alok K. Dhir
> Symp
Hi,
I'm using CoreContainer in JRuby. I'd like my data directory to be the
standard solr-home/data. But since CoreContainer == multi-core, I need to
supply a core name. Is it possible to use CoreContainer without a "core"? Is
it possible to set the dataDir? Also, it seems that no matter what I set
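In case it's useful, a single-core sketch (the core name core0 and the paths
are illustrative, not required values): solr.xml can declare just one core, and
the data directory can be set in that core's solrconfig.xml:

    <!-- solr.xml -->
    <solr persistent="false">
      <cores adminPath="/admin/cores">
        <core name="core0" instanceDir="." />
      </cores>
    </solr>

    <!-- solrconfig.xml -->
    <dataDir>${solr.data.dir:./solr/data}</dataDir>

You would then pass "core0" as the core name when wrapping the CoreContainer
(e.g. in EmbeddedSolrServer). I don't believe CoreContainer works without at
least one named core.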
I was afraid of that. Was hoping not to need another big fat box like
this one...
---
Alok K. Dhir
Symplicity Corporation
www.symplicity.com
(703) 351-0200 x 8080
[EMAIL PROTECTED]
On Nov 3, 2008, at 4:53 PM, Feak, Todd wrote:
I believe this is one of the reasons that a master/slave configuration comes in handy.
I believe this is one of the reasons that a master/slave configuration
comes in handy. Commits to the Master don't slow down queries on the
Slave.
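In Solr 1.3 the usual way to wire that up is the script-based replication: the
master snapshots the index after commits and the slaves pull the snapshots with
snappuller/snapinstaller. A sketch of the master side, roughly as in the example
solrconfig.xml (paths are illustrative):

    <listener event="postCommit" class="solr.RunExecutableListener">
      <str name="exe">snapshooter</str>
      <str name="dir">solr/bin</str>
      <bool name="wait">true</bool>
    </listener>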
-Todd
-Original Message-
From: Alok Dhir [mailto:[EMAIL PROTECTED]
Sent: Monday, November 03, 2008 1:47 PM
To: solr-user@lucene.apache.org
Su
We've moved past this issue by reducing date precision -- thanks to
all for the help. Now we're at another problem.
There is relatively constant updating of the index -- new log entries
are pumped in from several applications continuously. Obviously, new
entries do not appear in searches
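(If the issue is that new entries only show up after a commit, one option is
autoCommit in solrconfig.xml; a sketch, values are arbitrary:

    <updateHandler class="solr.DirectUpdateHandler2">
      <autoCommit>
        <maxDocs>10000</maxDocs>
        <maxTime>60000</maxTime> <!-- milliseconds -->
      </autoCommit>
    </updateHandler>

Each commit still opens and warms a new searcher, so there is a trade-off
between freshness and commit cost.)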
Hi,
I'm new to Solr. Here is a query on distributed search.
I have a huge volume of log files which I would like to search. Apart from
generic text search I would also like to get statistics: say each record has a
field giving the request processing time, and I would like to get the average of
processi
Hey, you are right.
I'm trying to migrate my app to Solr. For the moment I am using Solr for the
searching part of the app, but I am using my own Lucene app for indexing.
I should have posted in the Lucene forum for this trouble. Sorry about that.
I am trying to use TermDocs properly now.
Thanks for your a
On 28-Oct-08, at 5:36 AM, Jérôme Etévé wrote:
Hi all,
In my code, I'd like to keep a subset of my 14M docs which is around
100k documents.
What is, in your opinion, the best option in terms of speed and
memory usage?
Some basic thought tells me that BitDocSet should be the fastest for
lookup
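Back-of-the-envelope: a BitDocSet over 14M docs costs about 14M/8 ≈ 1.75 MB
regardless of how many docs are in the subset, with O(1) lookup; a HashDocSet
holding 100k ids is smaller (roughly 100k ints plus hashing overhead) but each
lookup is a hash probe. For comparison, Solr's own cutoff between the two
representations, for the DocSets it builds internally, is configured in
solrconfig.xml (example value shown):

    <HashDocSet maxSize="3000" loadFactor="0.75"/>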
Thank you for the "rootEntity" tip. Does this mean that the inner loop only
walks the first item and breaks out of the loop? This is very good because it
allows me to drill down a few levels without downloading 10,000 feeds. (Public
API sites tend to dislike this behavior :)
The URL is wrong be
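For anyone following along, a rough JDBC-flavoured data-config.xml shape for
nested entities (entity and column names here are made up):

    <entity name="outer" rootEntity="false"
            query="select feed_id from feeds">
      <entity name="item"
              query="select * from items where feed_id='${outer.feed_id}'">
        <field column="title" name="title"/>
      </entity>
    </entity>

With rootEntity="false" on the outer entity, each row of the inner entity
becomes a Solr document, and the outer entity just drives the iteration.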
On Mon, Nov 3, 2008 at 2:49 PM, Otis Gospodnetic
<[EMAIL PROTECTED]> wrote:
> Is this your code or something from Solr?
> That indexSearcher = new IndexSearcher(path_index) ; is very suspicious
> looking.
Good point... if this is a Solr plugin, then get the SolrIndexSearcher
from the request object.
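As a sketch (assuming a custom handler or component on Solr 1.3's APIs, where
req is the SolrQueryRequest handed to you):

    import org.apache.solr.request.SolrQueryRequest;
    import org.apache.solr.search.SolrIndexSearcher;

    // Reuse the searcher Solr already manages for this request instead of
    // opening a new IndexSearcher yourself (which is what leaks memory).
    SolrIndexSearcher searcher = req.getSearcher();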
Is this your code or something from Solr?
That indexSearcher = new IndexSearcher(path_index) ; is very suspicious looking.
Are you creating a new IndexSearcher for every search request? If so, that's
the cause of your memory problem.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nu
On Mon, Nov 3, 2008 at 2:40 PM, Marc Sturlese <[EMAIL PROTECTED]> wrote:
> As Hits is deprecated I tried to use TermDocs and TopDocs...
Try using searcher.getFirstMatch(t) as Jonathan is. It should be
faster than Hits.
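A minimal sketch of that, assuming searcher is the SolrIndexSearcher obtained
from the request and "id_field" is the unique key field:

    import org.apache.lucene.document.Document;
    import org.apache.lucene.index.Term;

    // getFirstMatch returns the internal Lucene doc id of the first document
    // containing the term, or -1 if nothing matches.
    int docId = searcher.getFirstMatch(new Term("id_field", "some-id-value"));
    if (docId != -1) {
        Document doc = searcher.doc(docId);  // stored fields for that document
    }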
> but the memory
> problem never disappeared...
> If I call the garbage colle
Hey there,
I never run out of memory, but I think the app always runs close to the limit... The
problem seems to be in here (searching by term):
try {
    indexSearcher = new IndexSearcher(path_index);
    QueryParser queryParser = new QueryParser("id_field",
        getAnalyzer(stop
On Mon, Nov 3, 2008 at 12:37 PM, George <[EMAIL PROTECTED]> wrote:
> Ok Yonik, thank you.
>
> I've tried to execute the following query: "{!boost b=log(myrank)
> defType=dismax}q" and it works great.
>
> Do you know if I can do the same (combine a DisjunctionMaxQuery with a
> BoostedQuery) in solrc
Have you looked into the "bf" and "bq" arguments on the
DisMaxRequestHandler?
http://wiki.apache.org/solr/DisMaxRequestHandler?highlight=(dismax)#head-6862070cf279d9a09bdab971309135c7aea22fb3
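A sketch of what that can look like in solrconfig.xml (the field names in qf
are placeholders; note that bf is added to the dismax score rather than
multiplied into it, unlike the {!boost ...} form):

    <requestHandler name="dismax_rank" class="solr.DisMaxRequestHandler">
      <lst name="defaults">
        <str name="qf">name^2 description</str>
        <str name="bf">log(myrank)</str>
      </lst>
    </requestHandler>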
-Todd
-Original Message-
From: George [mailto:[EMAIL PROTECTED]
Sent: Monday, November 03, 200
Ok Yonik, thank you.
I've tried to execute the following query: "{!boost b=log(myrank)
defType=dismax}q" and it works great.
Do you know if I can do the same (combine a DisjunctionMaxQuery with a
BoostedQuery) in solrconfig.xml?
George
On Sun, Nov 2, 2008 at 3:01 PM, Yonik Seeley <[EMAIL PROTEC
Hi,
We're indexing a lot of dirty OCR, so the index is really huge due to
the size of the position file. We still get OK response times, though,
with a median of 100ms. Phrase queries are a different matter,
obviously. But we're seeing some really large increases in index size
as we add a cou
Hi,
I've set the batchSize parameter to -1 and it works fine; the problem is that it
will monopolize the MySQL database for 10 hours.
Other requests on it, like updates or other processes, will be stuck. And if
I don't use batchSize -1 I will have an OOM error like below. I tried to put
batchSize 1000 or 1 bu
Hi Shalin,
*
I would like to know if you just used batchSize = -1.
When I do that it uses all of MySQL's memory, and it's a problem for the database
and other processes on it like updates and inserts...
It will keep my database busy for 10 hours, which is too much. Is there a way to
manage it differently?
Thank
Hi,
I tried batchSize=-1, but when I do that it uses all of MySQL's memory
and it's a problem for the MySQL database.
:s
Noble Paul നോബിള് नोब्ळ् wrote:
>
> I've moved the FAQ to a new Page
> http://wiki.apache.org/solr/DataImportHandlerFaq
> The DIH page is too big and editing has become
On Sun, Nov 2, 2008 at 8:09 PM, Marc Sturlese <[EMAIL PROTECTED]> wrote:
> I am doing the same and I am experiencing some trouble. I get the document
> data by searching by term. The problem is that when I do it several times
> (inside a huge for loop) the app's memory use keeps increasing until I use
On Mon, Nov 3, 2008 at 6:21 AM, Kraus, Ralf | pixelhouse GmbH
<[EMAIL PROTECTED]> wrote:
> I have a "little" performence problem with SOLR (again).
> My searches delivering 30 rows per site (for my webpage) BUT I need to get
> about 500 primary keys (IDs).
> Right now I am searching for my 30 compl
Do you have an idea?
sunnyfr wrote:
>
> Sorry, I wasn't clear.
> The stuck requests are not on the Solr database or index queries; the stuck
> requests are on our main MySQL database.
> When I do a full import to create indexes for Solr, MySQL handles it and
> won't be driven OOM, but with batchSize -1 it uses My
Hello,
I have a "little" performence problem with SOLR (again).
My searches delivering 30 rows per site (for my webpage) BUT I need to
get about 500 primary keys (IDs).
Right now I am searching for my 30 complete rows (all fields) and
another search with the setting "fl=primaryID".
Unfortunat
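One thing that sometimes helps in this situation (a hedged suggestion, not a
drop-in fix): make the 500-key query as cheap as possible by requesting only the
key field, and turn on lazy field loading so the large stored fields aren't read
for that request:

    /select?q=...&rows=500&fl=primaryID

    <!-- solrconfig.xml, <query> section -->
    <enableLazyFieldLoading>true</enableLazyFieldLoading>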