Actually, you may be able to get by using PatternReplaceCharFilterFactory -
copy the source value to two fields, one that treats <d2>.*</d2> as the
delimiter pattern to delete and the other uses <d1>.*</d1> as the
delimiter pattern to delete, so the first field has only d1 and the
second has only d2.
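A sketch of what that could look like in schema.xml (the field type and field names here are assumptions, not from the original thread; the pattern must be XML-escaped inside the attribute):

```xml
<!-- hypothetical type that deletes <d2>...</d2> sections, keeping only d1 content -->
<fieldType name="text_d1_only" class="solr.TextField">
  <analyzer>
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="&lt;d2&gt;.*?&lt;/d2&gt;" replacement=""/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
  </analyzer>
</fieldType>

<!-- copy the single source value into both specialized fields -->
<copyField source="body" dest="body_d1_only"/>
<copyField source="body" dest="body_d2_only"/>
```

A mirror-image type with pattern `&lt;d1&gt;.*?&lt;/d1&gt;` would back the second field.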
Hello,
We have a similar requirement where a large list of IDs needs to be sent to
SOLR in filter query.
Could someone please help understand if this feature is now supported in the
new versions of SOLR?
Thanks
Shawn,
Thanks for the suggestion, but experimentally, in my case the same query with
facet.method=enum returns in almost the same amount of time.
Regards
David
On Tuesday, January 13, 2015 12:02 PM, Shawn Heisey apa...@elyograg.org
wrote:
On 1/13/2015 10:35 AM, David Smith wrote:
I have a query against a single 50M doc index (175GB) using Solr 4.10.2, that
exhibits the following response times (via the debugQuery option in Solr Admin):
process: {
time: 24709,
query: { time: 54 }, facet: { time: 24574 },
The query time of 54ms is great and exactly as expected -- this
The suggester is not working for me with Solr 4.10.2
Can anyone shed light over why I might be getting the exception below when
I build the dictionary?
<response>
<lst name="responseHeader">
<int name="status">500</int>
<int name="QTime">26</int>
</lst>
<lst name="error">
<str name="msg">len must be <= 32767; got 35680</str>
Mikhail,
Thanks - it works now. The script transformer was really not needed, a
template transformer is clearer, and the log transformer is now working.
On Mon, Dec 8, 2014 at 1:56 AM, Mikhail Khludnev mkhlud...@griddynamics.com
wrote:
Hello Dan,
Usually it works well. Can you describe
Hi,
hl.usePhraseHighlighter is valid for the standard highlighter. Maybe you are
using one of the other highlighters?
Maybe you have omitTermFreqAndPositions=true in the definition of the
text_general field type?
Ahmet
On Tuesday, January 13, 2015 5:52 PM, meena.sri...@mathworks.com
Thanks Michael and Hoss,
assuming I've written the subclass of the postings format, I need to tell
Solr to use it.
Do I just do something like:
<fieldType name="ocr" class="solr.TextField" postingsFormat="MySubclass"/>
Is there a way to set this for all fieldtypes or would that require writing
a
I am having some trouble getting the suggester to work. The spell
requestHandler is working, but I didn't like the results I was getting from
the word breaking dictionary and turned them off.
So some basic questions:
- How can I check on the status of a dictionary?
- How can I see what is
On 1/13/2015 10:35 AM, David Smith wrote:
I have a query against a single 50M doc index (175GB) using Solr 4.10.2, that
exhibits the following response times (via the debugQuery option in Solr
Admin):
process: {
time: 24709,
query: { time: 54 }, facet: { time: 24574 },
The query time
Range Faceting won't use the DocValues even if they are there set, it
translates each gap to a filter. This means that it will end up using the
FilterCache, which should cause faster followup queries if you repeat the
same gaps (and don't commit).
You may also want to try interval faceting, it
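Interval faceting is requested with parameters along these lines (the field name `event_date` and the interval bounds are placeholders, not from the thread):

```
facet=true
&facet.interval=event_date
&f.event_date.facet.interval.set=[2015-01-01T00:00:00Z,2015-01-02T00:00:00Z)
&f.event_date.facet.interval.set=[2015-01-02T00:00:00Z,2015-01-03T00:00:00Z)
```

Each `facet.interval.set` defines one bucket; unlike range faceting, every bucket must be listed explicitly.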
Maybe I can use grouping, but my understanding of the feature is not up to
figuring that out :)
I tried something like
http://localhost:8983/solr/collection/select?q=childhood+cancer&group=on&group.query=childhood+cancer
Because the group.limit=1, I get a single result, and no other results.
If I
I think you are probably getting bitten by one of the issues addressed
in LUCENE-5889
I would recommend against using buildOnCommit=true - with a large index
this can be a performance-killer. Instead, build the index yourself
using the Solr spellchecker support (spellcheck.build=true)
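Building on demand rather than on commit is just a request parameter; a hedged sketch (the core, handler, and dictionary names here are placeholders):

```
http://localhost:8983/solr/collection1/suggest?spellcheck=true&spellcheck.build=true&spellcheck.dictionary=suggest
```

This triggers a one-time build of the named dictionary, which you can schedule yourself instead of paying the cost on every commit.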
Just a side question. In your first example you have dates set with time
but in the second (where you set intervals) time is not set.
Is this something that can be resolved having a field that only sets date
(without time), and then use regular field faceting and facet.sort=index?
If that's
We are experiencing unexpected recovery events when a leader is sending
updates to a replica. A java.net.SocketException: "Connection reset" is
encountered when updating the replica which triggers the recovery.
In our previous Solr 4.6.1 installation, update errors triggered retry
logic in the
Tomás,
Thanks for the response -- the performance of my query makes perfect sense in
light of your information.
I looked at Interval faceting. My required interval is 1 day. I cannot change
that requirement. Unless I am mis-reading the doc, that means to facet a 10
year range, the query
No, you are not misreading, right now there is no automatic way of
generating the intervals on the server side similar to range faceting... I
guess it won't work in your case. Maybe you should create a Jira to add
this feature to interval faceting.
Tomás
On Tue, Jan 13, 2015 at 10:44 AM, David
: assuming I've written the subclass of the postings format, I need to tell
: Solr to use it.
:
: Do I just do something like:
:
: <fieldType name="ocr" class="solr.TextField" postingsFormat="MySubclass"/>
the postingFormat xml tag in schema.xml just refers to the name of the
postingFormat in SPI --
What is stumping me is that the search result has 3 hits, yet faceting those 3
hits takes 24 seconds. The documentation for facet.method=fc is quite explicit
about how Solr does faceting:
fc (stands for Field Cache) The facet counts are calculated by iterating over
documents that match the
fc, fcs and enum only apply for field faceting, not range faceting.
Tomás
On Tue, Jan 13, 2015 at 11:24 AM, David Smith dsmiths...@yahoo.com.invalid
wrote:
What is stumping me is that the search result has 3 hits, yet faceting
those 3 hits takes 24 seconds. The documentation for
bq: My question is for an indexed=false, stored=true field. What is the
optimized way to get unique values in such a field?
There isn't any. To do this you'll have to read the doc from disk,
it'll be decompressed
along the way and then the field is read. Note that this happens
automatically when
you call
Thanks, we will supress it for now!
M.
-Original message-
From:Mark Miller markrmil...@gmail.com
Sent: Monday 12th January 2015 19:25
To: solr-user@lucene.apache.org
Subject: Re: Distributed unit tests and SSL doesn't have a valid keystore
I'd have to do some digging. Hossman
Could probably write a custom SearchComponent to prepend and expand
the query for the required use case. Though if something then has to
parse that query back, it would still be an issue.
Regards,
Alex
Sign up for my Solr resources newsletter at http://www.solr-start.com/
On 13 January
Related question -
I see mention of needing to rebuild the spellcheck/suggest dictionary after
solr core reload. I see spellcheckIndexDir in both the old wiki entry and
the solr reference guide
https://cwiki.apache.org/confluence/display/solr/Spell+Checking. If this
parameter is provided, it
On 1/13/2015 12:10 AM, ig01 wrote:
Unfortunately this is the case, we do have hundreds of millions of documents
on one
Solr instance/server. All our configs and schema are with default
configurations. Our index
size is 180G, does that mean that we need at least 180G heap size?
If you have
I get this error when starting Solr using the script in bin/solr
tail: cannot open `[path]/logs/solr.log' for reading: No such file or directory
It does not happen every time, but it does happen a lot. It sometimes clears up
after a while.
I have tried creating an empty file, but solr then just
Hi Mark,
we're currently at 4.10.2; the update to 4.10.3 is scheduled for tomorrow.
T
On 12.01.15 at 17:30, Mark Miller wrote:
bq. ClusterState says we are the leader, but locally we don't think so
Generally this is due to some bug. One bug that can lead to it was recently
fixed in 4.10.3 I
Dear Markus,
Unfortunately I cannot use payloads, since I want to return this score to
each user as a simple field alongside other fields, and payloads
do not provide that. Also I don't want to change the default similarity
method of Lucene; I just want to have this field to do the
Thanks Jack for your advice. Can you please explain a little more how it
works? From the Apache Wiki it's not too clear to me. Can I write some
JavaScript code when I want to filter some data? In this case I have
<d1>bla bla bla</d1> <d2>bla bla bla</d2> <d1>bla bla bla</d1> and I want
to filter out the <d2>bla bla
Hi to all from Istanbul, Turkey,
I can say that I'm a newbie in Solr and Hadoop.
I'm trying to index XML files (ipod_other.xml from lucidworks' example
files, converted into sequence file format), using the SolrXMLIngestMapper jars.
I've modified the schema.xml file by making the necessary additions
Is it important where your leader is? If you just want to minimize
leadership changes during rolling re-start, then you could restart in the
opposite order (S3, S2, S1). That would give only 1 transition, but the
end result would be a leader on S2 instead of S1 (not sure if that
important to you
Daniel Collins wrote
Is it important where your leader is? If you just want to minimize
leadership changes during rolling re-start, then you could restart in the
opposite order (S3, S2, S1). That would give only 1 transition, but the
end result would be a leader on S2 instead of S1 (not sure
*Schema :*
<field name="tenant_pool" type="text" stored="true"/>
*Code :*
SolrQuery q = new SolrQuery().setQuery("*:*");
q.set(GroupParams.GROUP, true);
q.set(GroupParams.GROUP_FIELD, "tenant_pool");
*Data :*
tenant_pool : Baroda Farms
tenant_pool : Ketty Farms
*Output coming :*
groupValue=Farms, docs=2
Thank you for your responses.
However, according to my tests, solr 4.10.3 doesn’t use server by default
anymore due to the removal of these lines in the bin/solr script.
# TODO: see SOLR-3619, need to support server or example
# depending on the version of Solr
if [ -e $SOLR_TIP/server/start.jar
That's your job. The easiest way is to do a copyField to a string field.
-- Jack Krupansky
On Tue, Jan 13, 2015 at 7:33 AM, Naresh Yadav nyadav@gmail.com wrote:
*Schema :*
<field name="tenant_pool" type="text" stored="true"/>
*Code :*
SolrQuery q = new SolrQuery().setQuery("*:*");
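The copyField approach might look like this in schema.xml (the `tenant_pool_str` field name is an assumption for illustration):

```xml
<!-- original tokenized field, kept for display -->
<field name="tenant_pool" type="text" stored="true"/>
<!-- untokenized copy used for grouping/faceting -->
<field name="tenant_pool_str" type="string" indexed="true" stored="false"/>
<copyField source="tenant_pool" dest="tenant_pool_str"/>
```

Grouping on `tenant_pool_str` would then return whole values like "Baroda Farms" rather than individual tokens.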
A function query or an update processor to create a separate field are
still your best options.
-- Jack Krupansky
On Tue, Jan 13, 2015 at 4:18 AM, Ali Nazemian alinazem...@gmail.com wrote:
Dear Markus,
Unfortunately I can not use payload since I want to retrieve this score to
each user as a
Hi Jack,
Thanks for replying. I am new to Solr, please guide me on this. I have many
such columns in my schema,
so copyField will create a lot of duplicate fields; besides, I do not need
any search on the original field.
My use case is that I do not want any search on the tenant_pool field,
that's why I declared it
Hi all,
I am experiencing a problem with the Solr SuggestComponent.
Occasionally the Solr suggester component throws an error like:
Solr failed:
{"responseHeader":{"status":500,"QTime":1},"error":{"msg":"suggester was
not built","trace":"java.lang.IllegalStateException: suggester was not
built\n\tat
By any chance are you trying to start Solr as a different user when
this happens? I'm
wondering if there's a permissions issue here
Wild guess.
On Tue, Jan 13, 2015 at 12:37 AM, Graeme Pietersz gra...@pietersz.net wrote:
I get this error when starting Solr using the script in bin/solr
I decided to go for function query and implementing function query to read
term frequency for each document from index. Anyway I did not find any
tutorial which is matched my problem well. I really appreciate if somebody
could provide me some useful tutorial or example for this case.
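One way this is often done with Solr's built-in function queries is the termfreq() function, which exposes the raw term frequency as a pseudo-field; the field name `body` and the term `cancer` below are placeholders:

```
q=*:*&fl=id,tf:termfreq(body,'cancer')
```

It can also be used in sorting (`sort=termfreq(body,'cancer') desc`) or inside boost functions, which may cover the use case without a custom implementation.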
Thank you
Highlighting does not highlight the whole phrase; instead each word gets
highlighted.
I tried all the suggestions that were given, with no luck.
These are my special setting I tried for phrase highlighting
hl.usePhraseHighlighter=true
hl.q=query
SolrCloud is intended to work in the rolling restart case...
Index size, segment counts, segment names can (and will)
be different on different replicas of the same shard without
anything being amiss. Commits (hard) happen at different
times across the replicas in a shard. Merging logic kicks in
Still not able to get my autoComplete component to work in a distributed
environment. Works fine on a non-distributed system. Also, on the distributed
system, if I include distrib=false, it works.
I have tried shards.qt and shards parameters, but they make no difference. I
should add, I am
Looks like you have an underlying JDBC problem. The socket representing
your database connection seems to be going away. Have you tried running
this query outside of Solr and iterating through all the results? How about
in a standalone Java program? Do you have a DBA you can consult to see if
Something is very wrong here. Have you perhaps been changing your
schema without re-indexing? And I recommend you completely remove
your data directory (the one with index and tlog subdirectories) after
you change your schema.xml file.
Because you're trying to group on a field that is _not_
Would it be sufficient for your use case to simply extract all the <d1>
into one field and all the <d2> into another field? If so, the update
processor script would be very simple, simply matching all <d1>.*</d1>
and copying them to a separate field value, and the same for <d2>.
If you want examples of script
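The extraction logic such a script would need is simple; a minimal sketch in Python (the tag names d1/d2 come from the thread, everything else is illustrative):

```python
import re

def split_tagged_sections(text):
    """Collect the contents of all <d1>...</d1> sections into one list
    and all <d2>...</d2> sections into another, mirroring the idea of
    copying each kind of section into its own field."""
    d1_parts = re.findall(r"<d1>(.*?)</d1>", text, re.DOTALL)
    d2_parts = re.findall(r"<d2>(.*?)</d2>", text, re.DOTALL)
    return d1_parts, d2_parts

d1, d2 = split_tagged_sections("<d1>foo</d1> <d2>bar</d2> <d1>baz</d1>")
```

In an update processor the two lists would simply be assigned to two separate multivalued fields on the document.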
On 1/12/2015 5:34 AM, Thomas Lamy wrote:
I found no big/unusual GC pauses in the Log (at least manually; I
found no free solution to analyze them that worked out of the box on a
headless debian wheezy box). Eventually i tried with -Xmx8G (was 64G
before) on one of the nodes, after checking
Erick, my schema is the same, no change in that.
*Schema :*
<field name="tenant_pool" type="text" stored="true"/>
My guess is I had not mentioned indexed=true or false... maybe the default
is indexed=true.
My question is for an indexed=false, stored=true field. What is the
optimized way to get unique values in
TermsQueryParser I think is somewhat new. Have you tried that one?
https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-TermsQueryParser
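With the terms query parser a large ID list goes into a single filter query; a sketch (field name and values are placeholders):

```
fq={!terms f=id}12,5,7,42,1001
```

It skips query parsing and scoring for each value, so it is typically much cheaper than a giant `id:(12 OR 5 OR ...)` boolean query.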
Regards,
Alex.
Sign up for my Solr resources newsletter at http://www.solr-start.com/
On 13 January 2015 at 12:54, rashmy1
As insane as it sounds, I need to process all the results. No one document is
more or less important than another. Only a few hundred unique docs will
be sent to the client at any one time, but the users expect to page through
them all.
I don't expect sub-second performance for this task. I'm
On 1/13/2015 2:50 PM, brian4 wrote:
The problem is the jetty-util version included in the Solr build is 6.1.26,
but this particular package is from version 7+. Looks like it is a bug in
the build files for Solr.
I fixed it by downloading jetty 7 separately and manually adding
I have a complicated problem to solve, and I don't know enough about
lucene/solr to phrase the question properly. This is kind of a shot in the
dark. My requirement is to return search results always in completely
collapsed form, rolling up duplicates with a count. Duplicates are defined
by
Shawn,
I've been thinking along your lines, and continued to run tests through the
day. The results surprised me.
For my index, Solr range faceting time is most closely related to the total
number of documents in the index for the range specified. The number of
buckets in the range is a
: ...the nuts bolts of it is that the PostingFormat baseclass should take
: care of all the SPI name registration that you need based on what you
: pass to the super() construction ... although now that I think about it,
: i'm not sure how you'd go about specifying your own name for the
:
On 1/13/2015 11:44 AM, David Smith wrote:
I looked at Interval faceting. My required interval is 1 day. I cannot
change that requirement. Unless I am mis-reading the doc, that means to
facet a 10 year range, the query needs to specify over 3,600 intervals ??
I am very ignorant of how the
Do you have a sense of what your typical queries would look like? I mean,
maybe you wouldn't actually need to fetch more than a tiny fraction of
those million documents. Do you only need to determine the top 10 or 20 or
50 unique field value row sets, or do you need to determine ALL unique row
Sounds like:
https://cwiki.apache.org/confluence/display/solr/Collapse+and+Expand+Results
http://heliosearch.org/the-collapsingqparserplugin-solrs-new-high-performance-field-collapsing-postfilter/
The main issue is your multi-field criteria. So you may need to
extend/overwrite the comparison
The problem is the jetty-util version included in the Solr build is 6.1.26,
but this particular package is from version 7+. Looks like it is a bug in
the build files for Solr.
I fixed it by downloading jetty 7 separately and manually adding
jetty-util-7.6.16.v20140903.jar to the end of my
Thanks Hoss,
This is starting to sound pretty complicated. Are you saying this is not
doable with Solr 4.10?
...or at least: that's how it *should* work :) makes me a bit nervous
about trying this on my own.
Should I open a JIRA issue or am I probably the only person with a use case
for
You may also want to take a look at how AnalyticsQueries can be plugged in.
This won't show you how to do the implementation but it will show you how
you can plugin a custom collector.
http://heliosearch.org/solrs-new-analyticsquery-api/
http://heliosearch.org/solrs-mergestrategy/
Joel Bernstein
: This is starting to sound pretty complicated. Are you saying this is not
: doable with Solr 4.10?
it should be doable in 4.10, using a wrapper class like the one i
mentioned below (delegating to Lucene51PostingsFormat instead of
Lucene50PostingsFormat) ... it's just that the 4.10 APIs are