Yes, the xml files are in complete add format.
This is my code:
#!/usr/bin/perl
if (($#ARGV + 1) == 0) {
    print "Usage: perl prod.pl dir\n\n";
    exit(1);
}
## -- CHANGE accordingly
$timeout = 300;
$topdir = "/opt/Test/xml-file/";
#$topdir = "/opt/Test/";
$dir =
I also commit too many times, I guess: since we have 1000 folders, each loop
executes a load and a commit.
So 1000 loops means 1000 commits. I think it would help if I only committed
once after the 1000 loops complete.
Any inputs?
Thanks
Francis
-Original Message-
From: Francis
Done. Unfortunately with the same result. :confused:
Thanks, Jun.
Isn't it really strange? Again, I'm not the first person using Solr. I
wonder if the matter might be just local, due to some not-so-obvious reason
manifesting itself only on my machine (which is, of course, very unlikely, but
still
Thank you Chris.
I've found out how to implement my faceted search.
I don't index any metadata document, but I create my in-memory faceting data
structure from the database in my request handler's init method, compute facet
counts on each request, and write them to the response as a NamedList of NamedLists.
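A minimal sketch of the kind of in-memory faceting structure described above, built once at init time and consulted per request. The names and the dict-from-DB shape are assumptions for illustration, not the poster's actual code:

```python
# Sketch: facet data loaded once (e.g. at init), then facet counts
# computed per request over the matching doc ids.
from collections import Counter

doc_facets = {            # doc id -> facet value, loaded from the DB at init
    1: "books",
    2: "music",
    3: "books",
}

def facet_counts(matching_doc_ids):
    """Count facet values over the docs matched by the current request."""
    return Counter(doc_facets[d] for d in matching_doc_ids if d in doc_facets)

counts = facet_counts([1, 2, 3])
```

The resulting Counter maps each facet value to its count, which is easy to serialize into a NamedList-of-NamedLists style response.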
Is it the best way to implement my own locking mechanism here?
Thanks
/Renz
2009/7/10 Renz Daluz renz052...@gmail.com
Hi all,
I have 2 workers running (apps that build the index) and both are
pointing to the same Solr (1.3.0) master instance when updating/committing
documents. I'm using SolrJ
Yes, it works :-) Thanks Erik!
I am using the dismax query parser syntax for the fq param:
.../select?qt=dismax&rows=30&q.alt=*:*&qf=content&fq={!dismax
qf=contentKeyword^1.0 mm=0%}Foo&fq=+date:[2009-03-11T00:00:00Z TO
2009-07-09T16:41:50Z]&fl=id,date,content
Now, I want to add one
On Fri, Jul 10, 2009 at 11:50 AM, Francis Yakin fya...@liquid.com wrote:
I also commit too many times, I guess: since we have 1000 folders, each loop
executes a load and a commit.
So 1000 loops means 1000 commits. I think it would help if I only committed
once after the 1000 loops complete.
How are you batching all documents in one curl call? Do you have a sample, so I can
modify my script and try it again?
Right now I curl each document (I have 1000 docs in each folder and I
have 1000 folders) using:
curl http://localhost:7001/solr/update --data-binary @abc.xml -H
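Not from the thread, just a sketch of the batching idea, assuming the standard Solr XML update format: concatenate the per-file `<doc>` elements into a single `<add>` payload, post that once, and follow with one `<commit/>`. The helper name and document layout are illustrative:

```python
# Sketch: combine many per-document <doc> snippets into one <add>
# payload, so a whole batch goes over in a single update request,
# with one commit at the end instead of one per document.

def build_batch(doc_xml_snippets):
    """Wrap individual <doc> snippets in a single <add> element."""
    return "<add>\n" + "\n".join(doc_xml_snippets) + "\n</add>"

docs = ["<doc><field name='id'>1</field></doc>",
        "<doc><field name='id'>2</field></doc>"]
payload = build_batch(docs)
# POST `payload` once to /solr/update, then POST "<commit/>" once at the end.
```

You could then POST the payload once with `curl --data-binary` and send `<commit/>` as a separate, final request after all batches are loaded.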
Hi.
Apologies for bumping this one, but another question occurred to me... is
there a limit to the number of ext.literal components I can put in my curl
command... if so, i will definitely need to find another way to get this
data in, as I am building up relationships between documents, and
I'm in the same situation, but I'm not getting what this ant example is about.
I can't find anything in Solr about it. Could anyone write a little
more specifically what one has to do to get rid of the Error loading class
'org.apache.solr.handler.extraction.ExtractingRequestHandler' exception?
On Fri, Jul 10, 2009 at 1:17 PM, Francis Yakin fya...@liquid.com wrote:
How are you batching all documents in one curl call? Do you have a sample, so I
can modify my script and try it again?
Right now I curl each document (I have 1000 docs in each folder and
I have 1000 folders) using :
Tushar:
Is it necessary to do the optimize on each iteration? When you run an
optimize, the entire index is rewritten. Thus each index file can have at
most one hard link and each snapshot will consume the full amount of space
on your disk.
Asir
On Thu, Jul 9, 2009 at 3:26 AM, tushar kapoor
Hello,
I've got a stored, indexed field that contains some actual text, and some
metainfo, like this:
one two three four [METAINFO] oneprime twoprime threeprime fourprime
I have written a Tokenizer that skips past the [METAINFO] marker and uses
the last four words as the tokens for the field,
On Jul 9, 2009, at 5:37 PM, A. Steven Anderson wrote:
A simple example would be if a schema included a phoneNum multiValued field
and I wanted to return all docs that contained more than 1 phoneNum field
value.
all docs that contain more than one phone number - regardless of
matching a
On Jul 9, 2009, at 5:37 PM, A. Steven Anderson wrote:
A simple example would be if a schema included a phoneNum multiValued field
and I wanted to return all docs that contained more than 1 phoneNum field
value.
all docs that contain more than one phone number - regardless of
matching a
all docs that contain more than one phone number - regardless of matching a
particular query?
Exactly.
knowing that was a useful query, i'd change my indexer to also provide
either a field with the count of phone number values, or a boolean field
saying whether there are more than one or
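That indexer change might look like the following sketch; the field names (`phoneNumCount`, `hasMultiplePhoneNums`) are illustrative, not standard Solr fields:

```python
# Sketch: at index time, derive a count (or boolean) field from the
# multiValued phoneNum field, so "more than one phone number" becomes
# a simple range or term query instead of a per-query computation.

def enrich(doc):
    """Add derived fields to a document dict before indexing."""
    phones = doc.get("phoneNum", [])
    doc["phoneNumCount"] = len(phones)
    doc["hasMultiplePhoneNums"] = len(phones) > 1
    return doc

doc = enrich({"id": "42", "phoneNum": ["555-0100", "555-0101"]})
# Query side could then use fq=phoneNumCount:[2 TO *]
# or fq=hasMultiplePhoneNums:true
```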
On Fri, Jul 10, 2009 at 5:56 PM, Michael _ solrco...@gmail.com wrote:
Hello,
I've got a stored, indexed field that contains some actual text, and some
metainfo, like this:
one two three four [METAINFO] oneprime twoprime threeprime fourprime
I have written a Tokenizer that skips past the
Shalin Shekhar Mangar wrote:
Can't you have two fields like this?
f1 (indexed, not stored) - one two three four [METAINFO] oneprime twoprime threeprime fourprime
f2 (not indexed, stored) - one two three four
Perhaps I don't understand highlighting, but won't that prevent snippets
markrmiller wrote:
Coming soon. First step was here:
http://issues.apache.org/jira/browse/LUCENE-1699
Trunk doesn't have that version of Lucene yet though (I believe that's still
the case).
Replacing the RunUpdateProcessor gives you full control of the Lucene
document creation.
Is
On Fri, Jul 10, 2009 at 2:02 PM, solrcoder solrco...@gmail.com wrote:
markrmiller wrote:
Coming soon. First step was here:
http://issues.apache.org/jira/browse/LUCENE-1699
Trunk doesn't have that version of Lucene yet though (I believe that's still
the case).
Replacing the
I have noticed some weird behaviour doing score testing. I do a search using
the dismax request handler with no extra boosting on an index of a million docs,
searching in five fields.
Printing the scores of the 3rd, 4th, 5th and 6th docs, I can see they are the same.
If I build the index with my own Lucene indexer
Why do you care? I'm not being too much of a jerk here, because scores
between separate queries are irrelevant. See:
http://wiki.apache.org/lucene-java/ScoresAsPercentages
So, the scores aren't important, the important thing is whether the
Hi all,
If I have two fields that are copied into a copyField, and I index
data in these fields using different index-time boosts, are those
boosts propagated into the copyField?
Thanks!
Mat
On Jul 9, 2009, at 11:58 PM, Sumit Aggarwal wrote:
Hi,
1. Calls made to multiple shards are made in some concurrent fashion
or
serially?
Concurrent
2. Any idea of algorithm followed for merging data? I mean how
efficient it
is?
Not sure, but given that Yonik implemented it, I
Well, I was asking because I have a custom FieldComparatorSource that uses
the Lucene score among other params to calculate the sorting. The thing is that
with my own Lucene servlet I am getting different results than using Solr
now (because score values are different and Solr is giving me back the
markrmiller wrote:
When you specify a custom UpdateProcessor chain, you will normally make the
RunUpdateProcessor the last processor in the chain, as it will add the doc
to Solr.
Rather than using the built in RunUpdateProcessor though, you could simply
specify your own UpdateProcessor
Thanks Bill. Couple of questions,
1) Would the function query load all unique terms (for that field) in
memory the way sort (field cache) does? If so, that wouldn't work for
us as we can have over 5 billion records spread across multiple shards
(up to 10 indexer instances), that would surely kill
Hi,
I'm building an application that dynamically instantiates a large number of
solr cores on a single machine (large would ideally be as high as I can get
it, in the millions, if it is possible to do so without significant
performance degradation and/or system failure). I already tried this
Does the facet aggregation take place on the Solr search server, or
the Solr client?
It's pretty slow for me -- on a machine with 8 cores/ 8 GB RAM, 50
million document index (about 36M unique values in the author
field), a query that returns 131,000 hits takes about 20 seconds to
calculate the
Marc Sturlese wrote:
I have been able to create my custom field. The problem is that I have
loaded in the Solr core a couple of HashMap<id_doc, value_influence_sort>
from a DB with values that will influence the sort. My problem is that
I don't know how to let my custom sort have access
I have found that the stemming in Solr 1.2 and 1.3 is different for
"communication". We have an index built in Solr 1.2 and the index is being
queried by 1.3. Is there any way to adjust it?
Jae joo
I am investigating the possibilities of preprocessing my data before it is
indexed. Specifically, I would like to add fields or modify field values
based on other fields in the XML I am injecting.
I am a little confused on where this is supposed to happen; whether as part
of the
On Fri, Jul 10, 2009 at 6:40 PM, jonarino jonathan.h...@verizonwireless.com
wrote:
I am investigating the possibilities of preprocessing my data before it is
indexed. Specifically, I would like to add fields or modify field values
based on other fields in the XML I am injecting.
I am a
Sorry. From the CHANGES for 1.3:
{quote}
The Porter snowball based stemmers in Lucene were updated (LUCENE-1142),
and are not guaranteed to be backward compatible at the index level
(the stem of certain words may have changed). Re-indexing is recommended.
{/quote}
Would have been nice to leave a
Does the facet aggregation take place on the Solr search server, or the
Solr client?
Solr server.
Faceting is an expensive operation by nature, especially when the hits are
large in number. Solr caches these values once computed. You might want to
tweak cache related parameters in your solr
On Fri, Jul 10, 2009 at 3:42 PM, solrcoder solrco...@gmail.com wrote:
markrmiller wrote:
When you specify a custom UpdateProcessor chain, you will normally make the
RunUpdateProcessor the last processor in the chain, as it will add the doc
to Solr.
Rather than using the built in
Hi everybody,
Let's say we have 10,000 traveling sales-people spread throughout the
country. Each of them has has their own territory, and most of the
territories overlap (eg. 100 sales-people in a particular city alone). Each
of them also has a maximum distance they can travel. Some can
I am guessing that the field is actually just a string or a really long
word. Solr looks for occurrences of the term/token. It does not however
search within a given token without the *. So in your example the system
will not match thisisavery with thisisaverylongtesttitle even though they
have
Hi,
I'm experimenting with Solr components. I'd like to be able to use a
nice-high-level querying interface like the DirectSolrConnection or
EmbeddedSolrServer provides. Would it be considered absolutely insane to use
one of those *within a component* (using the same core instance)?
Matt
It could be that you should be providing an implementation of
SortComparatorSource
I have missed the earlier part of this thread, I assume you're trying to
implement some form of custom search?
B
dontthinktwice wrote:
Marc Sturlese wrote:
I have been able to create my custom field. The
The easiest modification is to use:
calc_square_of_distance(CLIENT_LAT, CLIENT_LONG, lat, long) < maxSquareOfTravelDist
This has the same ordering as before, but is much cheaper to calculate. You can
then calculate the actual distance in the GUI, where you're only showing a
handful of values.
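A minimal sketch of that squared-distance trick, using plain Euclidean distance on the raw coordinates for brevity (a real geo filter would use haversine or similar; the function names follow the pseudocode above):

```python
# Sketch: comparing squared distances against a squared maximum
# preserves the ordering while avoiding a sqrt per document.

def square_of_distance(lat1, lon1, lat2, lon2):
    """Squared Euclidean distance; same ordering as the real distance."""
    return (lat1 - lat2) ** 2 + (lon1 - lon2) ** 2

def within_travel_dist(client_lat, client_lon, lat, lon, max_travel_dist):
    """Filter test: compare against the square of the max distance."""
    return square_of_distance(client_lat, client_lon, lat, lon) <= max_travel_dist ** 2
```

The actual (square-rooted) distance is then only computed in the GUI for the handful of hits you display.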
okobloko wrote:
It could be that you should be providing an implementation of
SortComparatorSource
I have missed the earlier part of this thread, I assume you're trying to
implement some form of custom search?
B
Yes, exactly. What I'm trying to do is sort the results of an
: :(
:
: that is all we have in there!!!
:
: Is there any way I can raise the logging level for it?
it's not an issue of logging level -- that just affects which types of
messages get logged, this message is getting logged so the level is fine.
The problem is the log formatting. if this is
: data). When I give the prepared URL to the URL class I get HTTP Version
: Not Supported, error code 505.
Solr never generates that error code. what servlet container are you
using?
: String urlStr = solrUrl + "/update?stream.body=" + strToAdd;
:
Mat Brown wrote:
Hi all,
If I have two fields that are copied into a copyField, and I index
data in these fields using different index-time boosts, are those
boosts propagated into the copyField?
Thanks!
Mat
No, but the norms of source fields of copyField are propagated
into the
Hello,
I have a SOLR JMX connection issue. I am running my JMX MBeanServer through
Tomcat, meaning I am using Tomcat's MBeanServer rather than any other
MBeanServer implementation.
I am having a hard time trying to figure out the correct JMX Service URL on my
localhost for accessing the