For anyone interested, my issue (I think) was that I had specified
the url field as a multivalued field. I wasn't able to create a test
case that emulated my problem. This guess is based on gradual fiddling
with my configs.
My concern is no longer pressing but I do have a couple
On 13.03.2010, at 08:01, blargy wrote:
I was actually able to accomplish (although not pretty) what I wanted using
a regex transformer.
<entity name="item"
        transformer="RegexTransformer"
        query="select *, 'valueA, valueB' values from items">
  <field
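Filled in, the (truncated) entity above might look like the following sketch; the splitBy pattern is an assumption, and it is what tells the RegexTransformer to break the literal string into multiple values for a multivalued field:

```xml
<entity name="item"
        transformer="RegexTransformer"
        query="select *, 'valueA, valueB' values from items">
  <!-- split the literal string on commas into a multivalued field -->
  <field column="values" splitBy=",\s*"/>
</entity>
```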
OK, let me try to explain what I am hoping to achieve at a higher level:
I want to aggressively remove stop words to reduce the size of my index, but
there are certain domain specific multiword phrases which include stop words
that I need to retain in the index.
So I want to stop out words
Christopher,
maybe the SynonymFilter can help you to solve your problem.
Let me try to explain:
If you create an extra field in the index for your use-case, you can boost
matches on it in a special way.
The next step is creating an extra synonym-file.
as much as = SpecialPhrase1
in amount
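A sketch of how the suggestion above could be wired up (the file and type names here are assumptions): the SynonymFilter runs before the StopFilter, so a matched phrase is collapsed into a single token before its stop words can be removed.

```xml
<fieldType name="text_phrases" class="solr.TextField">
  <analyzer>
    <!-- phrase-synonyms.txt contains lines like:
         as much as => SpecialPhrase1 -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="phrase-synonyms.txt"
            ignoreCase="true" expand="false"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
  </analyzer>
</fieldType>
```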
Hi,
How do we combine the clustering component with the DisMax query handler?
Regards,
allahbaksh
Hi,
I am using CachedSqlEntityProcessor in my DIH data-config to reduce the
number of queries executed against the database,
<entity name="entity1" query="select * from x"
        processor="CachedSqlEntityProcessor"/>
<entity name="entity2" query="select * from y"
        processor="CachedSqlEntityProcessor"
        cacheKey="id" cacheLookup="x.id"/>
I
How can I enable logging of all the xml posted to my Solr server? Is this
possible? As of right now all I see in the logs are the request params when
querying.
While I am on the topic of logging I have one other question too. Is it
possible to use custom variables in the logging.properties file
Is there any documentation on this screen? (And don't point me to
http://wiki.apache.org/solr/DataImportHandler)
When using the Full-import, Status, Reload-Config, Document-Count and Full
Import With Cleaning everything works as expected but when I use any of the
following I get an exception: Debug
Sorry forgot to attach the error log,
Error Log:
-
org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.OutOfMemoryError: Java heap space
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:650)
        at
Also how would one auto-commit after a delta-import?
I click on the commit, clean and verbose checkboxes but those seem to have
no effect.
blargy wrote:
Is there any documentation on this screen? (And don't point me to
http://wiki.apache.org/solr/DataImportHandler)
When using the
Have you searched the users' list? This question has come up multiple times
and you'll find your question has probably already been answered. Let us
know if you come up blank...
Best
Erick
On Sat, Mar 13, 2010 at 3:56 PM, JavaGuy84 bbar...@gmail.com wrote:
Sorry forgot to attach the error
HTMLStripCharFilter is only in the analyzer: it creates searchable
terms from the HTML input, while the raw HTML is stored and fetched as-is.
There are some bugs in term positions and highlighting. An
EntityProcessor wrapping the HTMLStripCharFilter would be really
useful.
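On the analysis-chain side, a field type using the char filter might look like this (the type name is an assumption):

```xml
<fieldType name="html_text" class="solr.TextField">
  <analyzer>
    <!-- strips markup before tokenization; the stored value keeps the raw HTML -->
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```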
On Tue, Mar 9, 2010 at 5:31 AM,
Erik,
I have seen many posts regarding out-of-memory errors, but I am not sure
whether they are using CachedSqlEntityProcessor.
I want to know if there is a way to flush out the buffer of cache instead of
storing everything in cache.
I can clearly see the heap size growing like anything if I use
One way is to add magic 'beginning' and 'end' terms, then do phrase
searches with those terms.
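To illustrate the idea (the sentinel token names here are made up): if every field value is indexed with a marker prepended and appended, a phrase query spanning both markers only matches values that contain nothing else.

```text
indexed value:    _BEGIN_ New York Yankees _END_
exact match:      title:"_BEGIN_ New York Yankees _END_"
starts-with:      title:"_BEGIN_ New York"
```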
On Wed, Mar 10, 2010 at 7:51 AM, Jan Høydahl / Cominvent
jan@cominvent.com wrote:
Hi,
Sometimes you need to anchor your search to start/end of field.
Example:
1. title=New York Yankees
2.
It is usually a limitation in the servlet container. You could try
using embedded Solr or using an HTTP POST instead of an HTTP GET.
However, in this case it is probably not possible.
If these long filter queries never change, you could embed these in
the solrconfig.xml declaration for a request
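Embedding a fixed filter query in a request handler declaration might look like this (the handler name and fq value are made up):

```xml
<requestHandler name="/filtered" class="solr.SearchHandler">
  <lst name="appends">
    <!-- this fq is added to every request hitting this handler -->
    <str name="fq">category:books OR category:music</str>
  </lst>
</requestHandler>
```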
You might also try using CDATA blocks to wrap your Unicode text. It is
usually much easier to view the text while debugging these problems.
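For example, wrapping a field value in a CDATA section keeps markup-like characters literal and easier to eyeball:

```xml
<add>
  <doc>
    <field name="body"><![CDATA[Unicode text with <brackets> & ampersands, passed through verbatim]]></field>
  </doc>
</add>
```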
On Thu, Mar 11, 2010 at 12:13 AM, Eric Pugh
ep...@opensourceconnections.com wrote:
So I am using Sunspot to post over, which means an extra layer of
I don't really follow DataImportHandler, but it looks like it's using an
unbounded cache (simple HashMap).
Perhaps we should make the cache size configurable?
The impl seems a little odd: the caching occurs in the base class, so
caching impls that extend it don't really have full control -
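A configurable bound could be as simple as swapping the HashMap for a LinkedHashMap with an eviction override. This is only a sketch of the idea, not DIH's actual code:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of a size-bounded cache: once the configurable limit is reached,
// LinkedHashMap evicts the least-recently-accessed entry instead of
// growing without bound like a plain HashMap.
public class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public BoundedCache(int maxEntries) {
        super(16, 0.75f, true); // access-order for LRU behavior
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // called after each put(); returning true drops the eldest entry
        return size() > maxEntries;
    }
}
```
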
What is timing out? The external HTTP request? Commit times are a
sawtooth and slowly increase. My record is 59 minutes, but I was doing
benchmarking.
On Thu, Mar 11, 2010 at 1:46 AM, Frederico Azeiteiro
frederico.azeite...@cision.com wrote:
Hi,
I'm having timeouts committing on a 125 GB index
Yes, the http request is timing out even when using values of 10m.
Normally the commit takes about 10s. I did an optimize (it took 6h) and it
looks good for now...
59m? Well, I didn't wait that long; I restarted the solr instance and tried
again.
I'll try to use autocommit on a near
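For reference, autocommit lives in the updateHandler section of solrconfig.xml; the thresholds below are just example values:

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxDocs>10000</maxDocs>   <!-- commit after this many pending docs -->
    <maxTime>60000</maxTime>   <!-- or after this many milliseconds -->
  </autoCommit>
</updateHandler>
```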
How are you guys solving the problem of managing all of your configuration
differences between development and production?
For example when deploying to production I need to change the
data-config.xml (DataImportHandler) database settings. I also have some ant
scripts to start/stop tomcat as
You can use MySQL: select *, 'staticdata' as staticdata from table x.
As long as your field name is staticdata, this should add it there.
On 3/12/10 8:39 AM, Tommy Chheng tommy.chh...@gmail.com wrote:
Haven't tried this myself but try adding a default value and don't
specify it during the
DIH has special handling for upper/lower-case field names. It is
possible your config is running afoul of this.
Try using different names for the Solr fields than the database fields.
On 3/11/10, James Ostheimer james.osthei...@gmail.com wrote:
Hi-
I can't seem to make any of the
Commit actions are in the jetty log. I don't have a script to pull
them out in a spreadsheet-able form, but that would be useful.
On 3/13/10, Frederico Azeiteiro frederico.azeite...@cision.com wrote:
Yes, the http request is timing out even when using values of 10m.
Normally the commit takes
On 03/12/2010 09:44 AM, Shawn Heisey wrote:
Does SolrCloud's notion of a collection, which appears to use
cores, override normal multi-core usage for building an offline index
and quickly swapping it into production?
A collection will normally be composed of multiple cores. By default
On 03/09/2010 04:28 PM, Shawn Heisey wrote:
I attended the Webinar on March 4th. Many thanks to Yonik for
putting that on. That has led to some questions about the best way
to bring fault tolerance to our distributed search. High level
question: Should I go with SolrCloud, or stick with
Thank you for the idea Mitch, but it just doesn't seem right that I should
have to resort to scoring when what I really need seems so fundamental.
Logically, what I want is a phrase filter factory that would match on
phrases listed in a file, like stopwords, but in this case index the match
and
CommonGrams is a tool for this. It makes "is a" into a single token, but then
"is" and "a" are still removed as stopwords.
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.CommonGramsFilterFactory
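A field type combining CommonGrams with stop word removal might look like this (the type name is an assumption; both filters read the same stopwords.txt):

```xml
<fieldType name="text_cg" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- emits bigrams like "is_a" for tokens adjacent to common words -->
    <filter class="solr.CommonGramsFilterFactory" words="stopwords.txt" ignoreCase="true"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
  </analyzer>
</fieldType>
```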
On 3/13/10, Christopher Ball christopher.b...@metaheuristica.com wrote:
Thank you for the idea
You can usually raise the header size limit by editing the config of
your servlet container. That can only get you so far though, and
different browsers have their own limits.
Your best bet, as Lance said, is either POSTing or sticking them in
solrconfig.xml.
You can post by using the
On Wed, Mar 3, 2010 at 7:51 AM, Marc Sturlese marc.sturl...@gmail.com wrote:
I am testing date facets in trunk with a huge index. Apparently, as the default
solrconfig.xml shows, the fastest way to run date facet queries is to index
the field with this data type:
<!-- A Trie based date field for
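The trie date type from the example schema looks roughly like this (precisionStep trades index size for range/facet speed; exact attributes vary by version, and the field name is an assumption):

```xml
<fieldType name="tdate" class="solr.TrieDateField"
           precisionStep="6" omitNorms="true" positionIncrementGap="0"/>
<field name="pub_date" type="tdate" indexed="true" stored="true"/>
```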
My response to this was mangled by my email client - sorry - hopefully
this one comes through a little easier to read ;)
On 03/09/2010 04:28 PM, Shawn Heisey wrote:
I attended the Webinar on March 4th. Many thanks to Yonik for putting
that on. That has led to some questions about the best