On Thu, Jan 27, 2011 at 1:25 AM, cyang2010 ysxsu...@hotmail.com wrote:
Is Field Collapsing a new feature for Solr 4.0 (not yet released)?
That's at least what the Wiki tells you, yes.
Hi All
I want to integrate the Lucene Surround query parser with Solr 1.4.1, and for that
I am writing a custom query parser plugin. To accomplish this task I should write a
subclass of org.apache.solr.search.QParserPlugin and implement its two
methods
public void init(NamedList nl)
public
Why is converting documents to UTF-8 not feasible?
Nowadays any platform offers such services.
Can you give a detailed failure description (maybe with the URL to a sample
document you post)?
paul
On 27 Jan 2011 at 07:31, prasad deshpande wrote:
I am able to successfully index/search
Simone,
It's good that you did so! I had found this three days ago while googling,
and I am starting to make sense of it. It works well.
Two little comments:
- you say that it packages a standalone multicore and a standalone app,
but it actually also packs a webapp.
At first, I had
The size of docs can be huge; suppose there is an 800MB PDF file to index.
I need to translate it to UTF-8 and then send this file for indexing. Now
suppose there can be any number of clients who can upload files; at that time
it will affect performance. And already our product support
Hi all,
The query for standard request handler is as follows
field1:(keyword1 OR keyword2) OR field2:(keyword1 OR keyword2) OR
field3:(keyword1 OR keyword2) AND field4:(keyword3 OR keyword4) AND
field5:(keyword5)
How can the same query above be written for the dismax request handler?
--
Thanks
At least in Java, UTF-8 transcoding is done on a stream basis. No issue there.
paul
On 27 Jan 2011 at 09:51, prasad deshpande wrote:
The size of docs can be huge, like suppose there are 800MB pdf file to index
it I need to translate it in UTF-8 and then send this file to index. Now
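Paul's stream-based approach can be sketched with just the JDK (my own illustration, not code from the thread; the class and method names are invented):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class StreamTranscoder {
    // Re-encode a stream to UTF-8 one buffer at a time, so even an 800MB
    // file never has to be held in memory as a whole.
    public static void transcode(InputStream in, Charset from, OutputStream out)
            throws IOException {
        try (Reader r = new InputStreamReader(in, from);
             Writer w = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            char[] buf = new char[8192];
            int n;
            while ((n = r.read(buf)) != -1) {
                w.write(buf, 0, n);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] latin1 = "café".getBytes(StandardCharsets.ISO_8859_1);
        ByteArrayOutputStream utf8 = new ByteArrayOutputStream();
        transcode(new ByteArrayInputStream(latin1),
                  StandardCharsets.ISO_8859_1, utf8);
        System.out.println(utf8.toString("UTF-8"));
    }
}
```

The point is that memory use is bounded by the buffer size, not the file size.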
The wiki page for the ExtractingRequestHandler says that I can add the
following configuration:
<str name="tika.config">/my/path/to/tika.config</str>
I have tried to google for an example of such a Tika config file, but
haven't found anything.
Erlend
--
Erlend Garåsen
Center for Information
Hi
We are trying to post some PDF documents to solr for indexing using ASP.net
but cannot find any documentation or a library that will allow posting of
binary data.
Has anyone done this and if so, how?
Regards
Andrew McCombe
iWeb Solutions Ltd.
hi all. My range query on a multivalued date field works incorrectly.
My schema: there is a field requestDate that has the multiValued attribute:
<fields>
<field name="id" type="string" indexed="true" stored="true"
required="true" />
<field name="keyword" type="text" indexed="true" stored="true" />
<field name="count"
use dismax q for first three fields and a filter query for the 4th and 5th
fields
so
q=keyword1 keyword2
qf=field1,field2,field3
pf=field1,field2,field3
mm=something sensible for you
defType=dismax
fq= field4:(keyword3 OR keyword4) AND field5:(keyword5)
take a look at the dismax docs for
but q=keyword1 keyword2 does an AND operation, not OR
On 27 January 2011 16:22, lee carroll lee.a.carr...@googlemail.com wrote:
use dismax q for first three fields and a filter query for the 4th and 5th
fields
so
q=keyword1 keyword2
qf=field1,field2,field3
pf=field1,field2,field3
Hi,
Is there a way to avoid duplicate content in an index? At the moment I'm
uploading my XML feed via DIH.
I would like to have only one entry for a given description. I mean, if
the description of one product already exists in the index, don't import this
new product.
Is there a built-in function?
the default operator can be set in your config to be OR, or on the query with
something like q.op=OR
On 27 January 2011 11:26, Isan Fulia isan.fu...@germinait.com wrote:
but q=keyword1 keyword2 does AND operation not OR
On 27 January 2011 16:22, lee carroll lee.a.carr...@googlemail.com
wrote:
The DisMax query parser internally hard-codes its operator to OR.
This is quite unlike the Lucene query parser, for which the default operator
can be configured using the solrQueryParser in schema.xml
Regards,
Bijeet Singh
On Thu, Jan 27, 2011 at 4:56 PM, Isan Fulia
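For reference, the schema.xml setting mentioned above looks like this (it applies to the Lucene query parser only; dismax ignores it):

```xml
<solrQueryParser defaultOperator="OR"/>
```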
sorry, ignore that - we are on dismax here - look at the mm param in the docs;
you can set this to achieve what you need
On 27 January 2011 11:34, lee carroll lee.a.carr...@googlemail.com wrote:
the default operator can be set in your config to be OR, or on the query with
something like q.op=OR
On 27
http://wiki.apache.org/solr/Deduplication
On Thursday 27 January 2011 12:32:29 Rosa (Anuncios) wrote:
Is there a way to avoid duplicate content in a index at the moment i'm
uploading my xml feed via DIH?
I would like to have only one entry for a given description. I mean if
the
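For what it's worth, that wiki page configures an update processor chain along these lines (a sketch based on the wiki; adjust the field names to your schema, and note that the signature field must exist there):

```xml
<updateRequestProcessorChain name="dedupe">
  <processor class="solr.processor.SignatureUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <str name="signatureField">signature</str>
    <bool name="overwriteDupes">true</bool>
    <str name="fields">description</str>
    <str name="signatureClass">solr.processor.Lookup3Signature</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```

With overwriteDupes=true, a new document whose description hashes to an existing signature replaces the earlier one instead of being added alongside it.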
Hi Paul,
thanks a lot for your feedbacks, much more than appreciated! :)
Going through your comments:
* Yes, it also packs a Solr webapp; it is needed to embed it in
Tomcat. Do you think it could be a useful feature to have also the webapp
.war as output? If it helps, I'm open to adding it as well.
*
Looks like you are connecting to Tomcat's AJP port, not the HTTP one.
Connect to the Tomcat HTTP port and I suspect you'll have greater
success.
Upayavira
On Wed, 26 Jan 2011 22:45 -0800, Darniz rnizamud...@edmunds.com
wrote:
Hello,
I uploaded the solr.war file to my hosting provider and added
On 27 Jan 2011 at 12:42, Simone Tripodi wrote:
thanks a lot for your feedbacks, much more than appreciated! :)
Good time sync. I need it right now.
* Yes, it also packs a Solr webapp; it is needed to embed it in
Tomcat. Do you think it could be a useful feature to have also the webapp
.war as
Range queries work on multivalued fields. I suspect the date math
conversion is fooling you. For instance, NOW/HOUR-1HOUR first rounds down to
the current hour, *then* subtracts one hour.
If you attach debugQuery=on (or check the debug checkbox
in the admin full search page), you'll see the exact
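To make the rounding order concrete (my own example, not one from the thread), a range such as:

```
requestDate:[NOW/DAY-7DAYS TO NOW/DAY+1DAY]
```

first truncates NOW to midnight, then applies the day offsets, which is why the endpoints may not land where you expect if you read the expression left to right.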
It worked by setting mm=0 (it acted as an OR operator),
but how do I handle this:
field1:((keyword1 AND keyword2) OR (keyword3 AND keyword4)) OR
field2:((keyword1 AND keyword2) OR (keyword3 AND keyword4)) OR
field3:((keyword1 AND keyword2) OR (keyword3 AND keyword4))
On 27 January 2011 17:06, lee
Markus,
The problem here is that if I call the two URLs below immediately after
replication, then I get the same index version from both. In my python
script I have added code to swap the online core on the master with the offline
core on the master, and the online core on the slave with the offline core on the slave, if
On Thu, Jan 27, 2011 at 3:44 PM, Andrew McCombe eupe...@gmail.com wrote:
Hi
We are trying to post some PDF documents to solr for indexing using ASP.net
but cannot find any documentation or a library that will allow posting of
binary data.
[...]
Do not have much idea of ASP.net, but SolrNet
(
with dismax you get to say things like match all terms if fewer than 3 terms are
entered, else match term-x.
it produces highly flexible and relevant matches and works very well in lots
of common search use cases. field boosting
allows further tuning.
if you have rigid rules like the last one you quote
Hi,
Pretty new to Solr coding, but looking for hints about how (if not already
done) to implement a PatternTokenizer that would index this into multivalued
fields of solr.StrField for faceting. Ex.
Water -- Irrigation ; Water -- Sewage
should be tokenized into
Water
Irrigation
Let's back up a moment and ask why you are doing this from scripts,
because this feels like an XY problem, see:
http://people.apache.org/~hossman/#xyproblem
What are you trying to accomplish by swapping cores on the master
and slave?
Solr 1.4 has
What version of Solr are you using, and could you consider either 3x or
applying a patch to 1.4.1? Because eDismax (extended dismax) handles the
full Lucene query language and probably works here. See the Solr
JIRA 1553 at https://issues.apache.org/jira/browse/SOLR-1553
Best
Erick
On Thu, Jan
It may also be an option to mix the query parsers?
Something like this (not tested):
q={!lucene}field1:test OR field2:test2 _query_:"{!dismax qf=fields}+my dismax
-bad"
So you have the benefits of lucene and dismax parser
-----Original Message-----
From: Erick Erickson
Hi,
is there a way by which I could detect out-of-memory errors in Solr, so
that I could implement some functionality such as restarting Tomcat or
alerting me via email whenever such an error is detected?
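One common approach (an assumption on my part, not something Solr itself provides) is to let the JVM react to the error, e.g. in Tomcat's startup configuration; the script path here is hypothetical:

```
CATALINA_OPTS="$CATALINA_OPTS -XX:+HeapDumpOnOutOfMemoryError \
  -XX:OnOutOfMemoryError='/path/to/restart-and-alert.sh'"
```

-XX:OnOutOfMemoryError runs an arbitrary command (your restart/email script) the first time the JVM throws OutOfMemoryError, and the heap dump helps diagnose the cause afterwards.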
Anyone?
On Thu, Jan 27, 2011 at 1:27 PM, Ahson Iqbal mianah...@yahoo.com wrote:
Hi All
I want to integrate lucene Surround Query Parser with solr 1.4.1, and for
that I
am writing Custom Query Parser Plugin, To accomplish this task I should
write a
sub class of
Yes, you need to create both a QParserPlugin and a QParser implementation.
Look at Solr's own source code for the LuceneQParserPlugin/LuceneQParser and
build it like that.
Baking the surround query parser into Solr out of the box would be a useful
contribution, so if you care to give it a
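As a skeleton (untested, written against the 1.4-era API, so verify the signatures against your Solr source; the class name and the TODO body are mine):

```java
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.search.Query;
import org.apache.solr.common.params.SolrParams;
import org.apache.solr.common.util.NamedList;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.search.QParser;
import org.apache.solr.search.QParserPlugin;

public class SurroundQParserPlugin extends QParserPlugin {

  public void init(NamedList args) {
    // read any configuration passed from solrconfig.xml here
  }

  public QParser createParser(String qstr, SolrParams localParams,
                              SolrParams params, SolrQueryRequest req) {
    return new QParser(qstr, localParams, params, req) {
      public Query parse() throws ParseException {
        // TODO: run qstr through the surround parser
        // (org.apache.lucene.queryParser.surround.parser.QueryParser)
        // and translate the resulting SrndQuery into a Lucene Query.
        throw new ParseException("surround parsing not implemented yet");
      }
    };
  }
}
```

It would then be registered in solrconfig.xml with a queryParser element pointing at the fully-qualified class name, and invoked with defType or {!surround}.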
On 27 Jan 2011 at 12:42, Simone Tripodi wrote:
thanks a lot for your feedbacks, much more than appreciated! :)
One more anomaly I found: the license ends up in the output pom.xml.
I think this should not be the case:
*my* license should be there, not the license of the archetype. Or?
paul
I believe that as long as Tika is included in a folder that is
referenced by solrconfig.xml you should be good. Solr will
automatically hand documents to Tika for parsing based on MIME type. Can anyone else
add to this?
Thanks,
Adam
On Thu, Jan 27, 2011 at 5:06 AM, Erlend Garåsen e.f.gara...@usit.uio.no wrote:
Hi Paul,
sorry I'm late, but I've been in the middle of a conf call :( On which
IRC server is the #solr channel? I'll reach you ASAP.
Thanks a lot!
Simo
http://people.apache.org/~simonetripodi/
http://www.99soft.org/
On Thu, Jan 27, 2011 at 4:00 PM, Paul Libbrecht p...@hoplahup.net wrote:
Le
Simo, it's freenode.net
On Thu, Jan 27, 2011 at 4:16 PM, Simone Tripodi simonetrip...@apache.orgwrote:
Hi Paul,
sorry I'm late but I've been in the middle of a conf call :( On which
IRC server the #solr channel is? I'll reach you ASAP.
Thanks a lot!
Simo
Yes, I think nested queries are the only way to do that, and yes, nested
queries like Daniel's example work (I've done it myself). I haven't really
tried to get into understanding/demonstrating _exactly_ how the relevance ends
up working on the overall master query in such a situation, but it
If this configuration file is the same as the tika-mimetypes.xml file
inside Nutch's conf folder, I have an example.
I was trying to implement language detection for Solr and thought I had
to invoke some Tika functionality by this configuration file in order to
do so, but found out that I
Tokenization is fine with facets, that caution is about, say, faceting
on the tokenized body of a document where you have potentially
a huge number of unique tokens.
But if there is a controlled number of distinct values, you shouldn't have
to do anything except index to a tokenized field. I'd
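A field type along these lines would do it (a sketch; the semicolon pattern is an assumption based on the "Water -- Irrigation ; Water -- Sewage" example, so adjust it to your data):

```xml
<fieldType name="subjectFacet" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.PatternTokenizerFactory" pattern="\s*;\s*"/>
  </analyzer>
</fieldType>
```

This splits the incoming string on semicolons, so each subject becomes one facet value instead of the whole string being a single term.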
Beyond what Erick said, I'll add that it is often better to do this from the
outside and send in multiple actual end-user displayable facet values. When
you send in a field like Water -- Irrigation ; Water -- Sewage, that is what
will get stored (if you have it set to stored), but what you
Thanks for the hints!
Sorry about stealing the thread "query range in multivalued date field";
I mistakenly responded to it.
cheers,
:-Dennis
On 27/01/2011, at 16.48, Erik Hatcher wrote:
Beyond what Erick said, I'll add that it is often better to do this from the
outside and send in multiple
Hi,
Am getting the following messages while using EmbeddedSolr to retrieve
the Term Vectors. I also happened to go through
https://issues.apache.org/jira/browse/SOLR-914 . Should I ignore these
messages and proceed or should I make any changes?
Let me describe the question using an example:
If I search Lee on the name field as an exact term match,
the returned results can be:
Lee Jamie
Jamie Lee
Will Solr grant a higher score to Lee Jamie vs Jamie Lee based on the
position of the term in the name field of each document?
From what I know, the score
Hi Cyang,
usually Solr isn't looking at the position of a term. However, there are
solutions out there for considering the term's position when calculating a
doc's score.
Furthermore: if two docs get the same score, I think they are ordered the
way they were found in the index.
Does this
Hi,
excuse me for pushing this for a second time, but I can't figure it out by
looking at the source code...
Thanks!
Hi Lance,
thanks for your explanation.
As far as I know in distributed search i have to tell Solr what other
shards it has to query. So, if I want to query a
I am using JMX to monitor my replication status and am finding that my
MBeans are disappearing. I turned on debugging for JMX and found that
solr seems to be deleting the mbeans.
Is this a bug? Some trace info is below..
here's me reading the mbean successfully:
Jan 27, 2011 5:00:02 PM
Thanks, exactly. I asked my domain hosting provider and he provided me with
some other port.
I am wondering, can I specify credentials without the port?
I mean, when I open the browser and type
www.mydomainmame/solr I get the Tomcat auth login screen.
In the same way, can I configure the http
In general, patches are applied to the source tree and it's re-compiled.
See: http://wiki.apache.org/solr/HowToContribute#Working_With_Patches
This is pretty easy, and I do know that some people have applied the
eDismax
patch to the 1.4 code line, but I haven't done it myself.
Best
Erick
On
Hi Em,
Thanks for the reply.
Basically you are saying there is no built-in solution that cares about the
position of the term to impact the relevancy score. In my scenario, I will
get those two documents with the same score. The order depends on the
sequence of indexing.
Thanks,
Cyang
Just a little clarification: when I say position of the term, I mean the
position of the term within the field.
For example,
Jamie Lee -- Lee is in the second position of the name field.
Lee Jamie -- Lee is in the first position of the name field in this case.
If I do
qt=dismax
fq=uid:1
(or any other positive number) then queries are as quick as normal - in
the 20ms range.
However, any of
fq=uid:\-1
or
fq=uid:[* TO -1]
or
fq=uid:[-1 TO -1]
or
fq=-uid:[0 TO *]
then queries are incredibly slow - in the 9
On Tue, Jan 25, 2011 at 01:28:16PM +0100, Markus Jelsma said:
Are you sure you need CMS incremental mode? It's only advised when running on
a machine with one or two processors. If you have more you should consider
disabling the incremental flags.
I'll test again but we added those to get
: Subject: Import Handler for tokenizing facet string into multi-valued
: solr.StrField..
: In-Reply-To: 1296123345064-2361292.p...@n3.nabble.com
: References: 1296123345064-2361292.p...@n3.nabble.com
-Hoss
: Then for clean=false, my understanding is that it won't blow off existing
: index. For data that exist in index and db table (by the same uniqueKey)
: it will update the index data regardless if there is actual field update.
: For existing index data but not existing in table (by comparing
Hi,
Do we have a data import handler to quickly read data from a noSQL database,
specifically MongoDB, which I am thinking of using?
Or a more general question: how does Solr work with noSQL databases?
Thanks.
Jianbin
On Thu, Jan 27, 2011 at 11:32:26PM +, me said:
If I do
qt=dismax
fq=uid:1
(or any other positive number) then queries are as quick as normal - in
the 20ms range.
For what it's worth uid is a TrieIntField with precisionStep=0,
omitNorms=true, positionIncrementGap=0
This should help:

import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.UsernamePasswordCredentials;
import org.apache.commons.httpclient.auth.AuthScope;

HttpClient client = new HttpClient();
// Send credentials with the first request instead of waiting for a 401 challenge
client.getParams().setAuthenticationPreemptive(true);
// Apply the credentials to any host and port
AuthScope scope = new AuthScope(AuthScope.ANY_HOST, AuthScope.ANY_PORT);
client.getState().setCredentials(scope,
    new UsernamePasswordCredentials(user, password));
Regards,
Jayendra
The tika.config file is obsolete. I don't know what replaces it.
On 1/27/11, Erlend Garåsen e.f.gara...@usit.uio.no wrote:
If this configuration file is the same as the tika-mimetypes.xml file
inside Nutch' conf file, I have an example.
I was trying to implement language detection for Solr
There are no special connectors available to read from key-value
stores like memcache/cassandra/mongodb. You would have to get a Java
client library for the DB and code your own DataImportHandler
datasource. I cannot recommend this; you should make your own program
to read data and upload it to Solr.
Hello-
I have not used SolrCloud.
On 1/27/11, Em mailformailingli...@yahoo.de wrote:
Hi,
excuse me for pushing this for a second time, but I can't figure it out by
looking at the source code...
Thanks!
Hi Lance,
thanks for your explanation.
As far as I know in distributed search i
Hi all,
I am currently using Solr 1.4.1. Do I need to apply a patch for the extended
dismax parser?
On 28 January 2011 03:42, Erick Erickson erickerick...@gmail.com wrote:
In general, patches are applied to the source tree and it's re-compiled.
See:
Why not make one's own DIH handler, Lance?
Dennis Gearon
Signature Warning
It is always a good idea to learn from your own mistakes. It is usually a
better
idea to learn from others’ mistakes, so you do not have to make them yourself.
from
Do we have performance measurements? Would it be much slower compared to other
DIH datasources?
There are no special connectors available to read from the key-value
stores like memcache/cassandra/mongodb. You would have to get a Java
client library for the DB and code your own dataimporthandler
datasource.
i have a field in the xml file: <DeviceType>Accessory Data / Memory</DeviceType>
solr schema field declared as <field name="deviceType" type="text"
indexed="true" stored="true" />
I am trying to eliminate results by using NOT. For example, I want all
devices for a term except where DeviceType is Accessory*
--- On Fri, 1/28/11, abhayd ajdabhol...@hotmail.com wrote:
From: abhayd ajdabhol...@hotmail.com
Subject: NOT operator not working
To: solr-user@lucene.apache.org
Date: Friday, January 28, 2011, 8:45 AM
i have a field in the xml file: <DeviceType>Accessory Data
/ Memory</DeviceType>
solr schema
On Fri, Jan 28, 2011 at 6:00 AM, Jianbin Dai j...@huawei.com wrote:
[...]
Do we have data import handler to fast read in data from noSQL database,
specifically, MongoDB I am thinking to use?
[...]
Have you tried the links that a Google search turns up? Some of
them look like pretty good
Hi,
no, you misunderstood me. I only said that Solr does not care about the
positions *usually*.
Lucene has SpanNearQuery, which considers the position of the query's terms
relative to each other.
Furthermore there exists a SpanFirstQuery, which boosts occurrences of a term
at the beginning of a
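To make that concrete (an untested sketch against the Lucene API of that era; the field and term are taken from the "Lee Jamie" example above):

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.spans.SpanFirstQuery;
import org.apache.lucene.search.spans.SpanTermQuery;

// Match "lee" only within the first position of the name field, so
// "Lee Jamie" matches but "Jamie Lee" does not.
SpanFirstQuery leeFirst = new SpanFirstQuery(
    new SpanTermQuery(new Term("name", "lee")), 1);
```

Such a query would have to be produced by a custom query parser plugin, since the standard and dismax parsers do not emit span queries.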