Hi - it seems the analysis page is broken on trunk and it looks like our 4.5
and 4.6 builds are unaffected. Can anyone on trunk confirm this?
Markus
@lucene.apache.org
Subject: Re: Analysis page broken on trunk?
Hey Markus
i'm not up to date with the latest changes, but if you can describe how to
reproduce it, i can try to verify that?
-Stefan
On Wednesday, January 8, 2014 at 12:44 PM, Markus Jelsma wrote:
Hi - it seems
Check the bytes property:
http://lucene.apache.org/core/4_0_0/core/org/apache/lucene/util/BytesRef.html#bytes
@Override
public float scorePayload(int doc, int start, int end, BytesRef payload) {
if (payload != null) {
return PayloadHelper.decodeFloat(payload.bytes);
}
return
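Presumably the point of checking the bytes property is that a BytesRef often points into a shared buffer, so the payload's data starts at payload.offset, not at index 0. A minimal self-contained sketch of offset-aware big-endian float decoding (decodeFloat here is a stand-in, not the actual Lucene helper):

```java
public class PayloadDecode {
    // Big-endian float decoding that honors the buffer offset.
    // (Sketch only; this mimics Lucene's PayloadHelper.decodeFloat(byte[], int),
    // which exists precisely because BytesRef.bytes may be a shared buffer.)
    public static float decodeFloat(byte[] bytes, int offset) {
        int bits = ((bytes[offset] & 0xFF) << 24)
                 | ((bytes[offset + 1] & 0xFF) << 16)
                 | ((bytes[offset + 2] & 0xFF) << 8)
                 |  (bytes[offset + 3] & 0xFF);
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        // Simulate a shared buffer where the payload starts at offset 2.
        int bits = Float.floatToIntBits(1.5f);
        byte[] buf = new byte[6];
        buf[2] = (byte) (bits >>> 24);
        buf[3] = (byte) (bits >>> 16);
        buf[4] = (byte) (bits >>> 8);
        buf[5] = (byte) bits;
        System.out.println(decodeFloat(buf, 2)); // prints 1.5
    }
}
```

Decoding from index 0 of a shared buffer would read another term's bytes, which is the kind of bug the linked javadoc warns about.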
fields w/
values from the example docs and that looks pretty okay to me, no change
noticed on that.
Can you share a screenshot or something like that? And perhaps Input,
Fields/Fieldtype which doesn't work for you?
-Stefan
On Wednesday, January 8, 2014 at 2:24 PM, Markus Jelsma wrote:
Hi - You will see
Strange, is it really floats you are inserting as payload? We use payloads too
but write them via PayloadAttribute in custom token filters as float.
-Original message-
From:michael.boom my_sky...@yahoo.com
Sent: Tuesday 14th January 2014 11:59
To: solr-user@lucene.apache.org
in the LinkDB together with the index-anchor plugin
to write the anchor field in your Solrindex.
Any help is appreciated! Thanks!
Markus Jelsma Wrote:
You need to use the invertlinks command to build a database with docs with
inlinks and anchors. Then use the index-anchor plugin when
to be created?
I really appreciate the help! Thank you very much!
-Original Message-
From: Markus Jelsma [mailto:markus.jel...@openindex.io]
Sent: Thursday, January 16, 2014 5:45 AM
To: solr-user@lucene.apache.org
Subject: RE: Indexing URLs from websites
-Original message-
not exist: file:/.../crawl/linkdb/parse_text
Along with a Java stacktrace
Those linkdb folders are not being created.
-Original Message-
From: Markus Jelsma [mailto:markus.jel...@openindex.io]
Sent: Thursday, January 16, 2014 10:44 AM
To: solr-user@lucene.apache.org
Subject: RE
of the link database, keeping only the highest quality
links.
/description
/property
So change the property, rebuild the linkdb and try reindexing once again :)
-Original Message-
From: Markus Jelsma [mailto:markus.jel...@openindex.io]
Sent: Thursday, January 16, 2014 11:08 AM
direction on this?
Thank you so much for sticking with me on this - I really appreciate your
help!
-Original Message-
From: Markus Jelsma [mailto:markus.jel...@openindex.io]
Sent: Friday, January 17, 2014 6:46 AM
To: solr-user@lucene.apache.org
Subject: RE: Indexing URLs from websites
Hi - We use Nginx to expose the index to the internet. It comes down to putting
some limitations on input parameters and on-the-fly rewrite of queries using
embedded Perl scripting. Limitations and rewrites are usually just a bunch of
regular expressions, so it is not that hard.
Cheers
Markus
, and
/documents/Article 1.pdf
How can I get these URLs?
-Original Message-
From: Markus Jelsma [mailto:markus.jel...@openindex.io]
Sent: Monday, January 20, 2014 9:08 AM
To: solr-user@lucene.apache.org
Subject: RE: Indexing URLs from websites
Well it is hard to get a specific anchor because
Hi - this likely belongs to an existing open issue. We're seeing the stuff
below on a build of the 22nd. Until just now we used builds of the 20th and
didn't have the issue. This is either a bug or did some data format in
Zookeeper change? Until now only two cores of the same shard through the
22nd January 2014 18:56
To: solr-user solr-user@lucene.apache.org
Subject: Re: AIOOBException on trunk since 21st or 22nd build
Looking at the list of changes on the 21st and 22nd, I don’t see a smoking
gun.
- Mark
On Jan 22, 2014, 11:13:26 AM, Markus Jelsma markus.jel...@openindex.io
wrote: Hi - this likely belongs to an existing open issue. We're seeing the
stuff below on a build of the 22nd. Until just now we used builds of the
20th and didn't have the issue. This is either a bug or did some data
Query Recommendations using Query Logs in Search Engines
http://personales.dcc.uchile.cl/~churtado/clustwebLNCS.pdf
Very interesting paper and section 2.1 covers related work plus references.
In our first attempt we did it even simpler, by finding for each query other
top queries by inspecting
Short answer: you can't. rashmi maheshwari maheshwari.ras...@gmail.com
wrote: Thanks all for the quick response.
Today I crawled a webpage using Nutch. This page has many links, but all
anchor tags have href=# and JavaScript is attached to the onClick event of
each anchor tag to open a new page.
So
Hi,
We have a development environment running trunk but have custom analyzers and
token filters built on 4.6.1. Now the constructors have changed somewhat and
stuff breaks. Here's a consumer trying to get a TokenStream from an Analyzer
object doing TokenStream stream =
Boundary scanner using Java's break iterator:
http://wiki.apache.org/solr/HighlightingParameters#hl.boundaryScanner
-Original message-
From:Furkan KAMACI furkankam...@gmail.com
Sent: Tuesday 4th February 2014 12:03
To: solr-user@lucene.apache.org
Subject: Sentence Detection for
Yes, that issue is fixed. We are on trunk and seeing it happen again. Kill some
nodes when indexing, trigger OOM or reload the collection and you are in
trouble again.
-Original message-
From:Yago Riveiro yago.rive...@gmail.com
Sent: Monday 24th February 2014 14:54
To:
Something must be eating your memory in your solrcloud indexer in Nutch. We
have our own SolrCloud indexer in Nutch and it uses extremely little memory.
You either have a leak or your batch size is too large.
-Original message-
From:Furkan KAMACI furkankam...@gmail.com
Sent:
You are not escaping the Lucene query parser special characters:
+ - && || ! ( ) { } [ ] ^ " ~ * ? : \ /
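A minimal sketch of such escaping (an assumption modeled after Lucene's QueryParserBase.escape, not a drop-in for it):

```java
public class QueryEscape {
    // Characters the Lucene classic query parser treats as special.
    // (Assumption: modeled after QueryParserBase.escape; verify the exact
    // set against your Lucene version. Escaping | and & individually also
    // covers the two-character || and && operators.)
    static final String SPECIALS = "\\+-!():^[]\"{}~*?|&/";

    public static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (SPECIALS.indexOf(c) >= 0) {
                sb.append('\\'); // prefix every special character with a backslash
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // An id that is a URL, as in the thread's subject line.
        System.out.println(escape("http://example.org/a?b"));
        // prints http\:\/\/example.org\/a\?b
    }
}
```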
-Original message-
From:Furkan KAMACI furkankam...@gmail.com
Sent: Tuesday 4th March 2014 16:57
To: solr-user@lucene.apache.org
Subject: Id As URL for Solrj
Hi;
This maybe
Hi Steve - it seems most similarities use CollectionStatistics.maxDoc() in
idfExplain but there's also a docCount(). We use docCount in all our custom
similarities, also because it allows you to have multiple languages in one
index where one is much larger than the other. The small language
for the number of docs as
this won't change dramatically often..
steve
On Wed, Mar 12, 2014 at 11:18 AM, Markus Jelsma
markus.jel...@openindex.iowrote:
Hi Steve - it seems most similarities use CollectionStatistics.maxDoc() in
idfExplain but there's also a docCount(). We use docCount
Hi - as far as i know it has never been a good idea to run Lucene on OpenJDK 6
at all. Use either Oracle Java 6 or higher, or OpenJDK 7.
On Wednesday, March 26, 2014 06:54:41 PM Nigel Sheridan-Smith wrote:
Hi all,
This is a bit of a 'heads up'. We have recently come across this bug on
Yes, override tfidfsimilarity and emit 1f in tf(). You can also use bm25 with
k1 set to zero in your schema.
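The BM25 route could be sketched like this in schema.xml (Solr 4.x syntax; the b value shown is just the default):

```xml
<similarity class="solr.BM25SimilarityFactory">
  <!-- k1 = 0 removes term-frequency influence from the score entirely -->
  <float name="k1">0.0</float>
  <float name="b">0.75</float>
</similarity>
```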
Walter Underwood wun...@wunderwood.org wrote: And here is another
peculiarity of short text fields.
The movie New York, New York should not be twice as relevant for the query
new
Yes, that will work. And combined with your other question scores will always
be equal even if cinderella or chuck occur more than once in one document.
Walter Underwood wun...@wunderwood.org wrote: Just double-checking my
understanding of omitNorms.
For very short text fields like personal
Also, if i remember correctly, k1 set to zero for bm25 automatically omits
norms in the calculation. So that's easy to play with without reindexing.
Markus Jelsma markus.jel...@openindex.io wrote: Yes, override
tfidfsimilarity and emit 1f in tf(). You can also use bm25 with k1 set to zero
You may want to increase reclaimdeletesweight for tieredmergepolicy from 2 to 3
or 4. By default it may keep too much deleted or updated docs in the index.
This can increase index size by 50%!! Dmitry Kan solrexp...@gmail.com
wrote: Elisabeth,
Yes, I believe you are right in that the deletes
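The reclaimDeletesWeight tweak can be sketched in solrconfig.xml (assuming Solr 4.x mergePolicy syntax; 3.0 is illustrative):

```xml
<indexConfig>
  <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
    <!-- default is 2.0; higher values merge away deleted/updated docs sooner -->
    <double name="reclaimDeletesWeight">3.0</double>
  </mergePolicy>
</indexConfig>
```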
On Apr 1, 2014, at 12:30 PM, Markus Jelsma markus.jel...@openindex.io
wrote:
Also, if i remember correctly, k1 set to zero for bm25 automatically
omits norms in the calculation. So that's easy to play with without
reindexing.
Markus Jelsma markus.jel...@openindex.io wrote: Yes
Hi - the thing you describe is possible when your set up uses SpanFirstQuery.
But to be sure what's going on you should post the debug output.
-Original message-
From:John Nielsen j...@mcb.dk
Sent: Tuesday 8th April 2014 11:03
To: solr-user@lucene.apache.org
Subject: Strange
Well, this is somewhat of a problem if you have URLs as uniqueKey that
contain exclamation marks. Isn't it an idea to allow those to be escaped and
thus ignored by CompositeIdRouter?
On Friday, April 11, 2014 11:43:31 AM Cool Techi wrote:
Thanks, that was helpful.
Regards,Rohit
This may help a bit:
https://wiki.apache.org/solr/PublicServers
-Original message-
From:Olivier Austina olivier.aust...@gmail.com
Sent:Thu 17-04-2014 18:16
Subject:Topology of Solr use
To:solr-user@lucene.apache.org;
Hi All,
I would like to get an idea about Solr usage: number of users,
Hi, replicating full-featured search engine behaviour is not going to work with
Nutch and Solr out of the box. You are missing a thousand features such as
proper main content extraction, deduplication, classification of content and
hub or link pages, and much more. These things are possible to
Hello michael, you are not on lucene 4.8?
https://issues.apache.org/jira/plugins/servlet/mobile#issue/LUCENE-5111
Michael Sokolov msoko...@safaribooksonline.com wrote: For posterity, in case
anybody follows this thread, I tracked the
problem down to WordDelimiterFilter; apparently it creates
Elisabeth, i think you are looking for SOLR-3211 that introduced
spellcheck.collateParam.* to override e.g. dismax settings.
Markus
-Original message-
From:elisabeth benoit elisaelisael...@gmail.com
Sent:Wed 14-05-2014 14:01
Subject:permissive mm value and efficient spellchecking
Hi Harsh,
Does SPDY provide lower latency than HTTP/1.1 with KeepAlive or is it
encryption that you're after?
Markus
-Original message-
From:harspras prasadta...@outlook.com
Sent:Tue 13-05-2014 05:38
Subject:Re: Solr + SPDY
To:solr-user@lucene.apache.org;
Hi Vinay,
I have been
http://wiki.apache.org/solr/ExtendedDisMax#Query_Syntax
-Original message-
From:michael.boom my_sky...@yahoo.com
Sent:Tue 10-06-2014 13:15
Subject:Edismax should, should not, exact match operators
To:solr-user@lucene.apache.org;
On google a user can query using operators like + or - and
Yes, always use three or a higher odd number of machines. It is best to have
them on dedicated machines and unless the cluster is very large three small VPS
machines with 512 MB RAM suffice.
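An odd-sized ensemble of three keeps quorum if one node fails. A minimal zoo.cfg ensemble sketch (hostnames are assumptions):

```
# zoo.cfg — three-node ZooKeeper ensemble (hostnames hypothetical)
server.1=zk1.example.com:2888:3888
server.2=zk2.example.com:2888:3888
server.3=zk3.example.com:2888:3888
```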
-Original message-
From:Gili Nachum gilinac...@gmail.com
Sent:Tue 10-06-2014 08:58
Hi - did you perhaps update one of those documents?
-Original message-
From:Apoorva Gaurav apoorva.gau...@myntra.com
Sent: Tuesday 17th June 2014 16:58
To: solr-user@lucene.apache.org
Subject: docFreq coming to be more than 1 for unique id field
Hello All,
We are using solr
Yes, it is unique but they are not immediately purged, only when `optimized` or
forceMerge or during regular segment merges. The problem is that they keep
messing with the statistics.
-Original message-
From:Apoorva Gaurav apoorva.gau...@myntra.com
Sent: Tuesday 17th June 2014 17:16
Hi - remove the lock file in your solr/collection_name/data/index.*/
directory.
Markus
On Thursday, June 19, 2014 04:10:51 AM atp wrote:
Hi experts,
i have configured SolrCloud on three machines; ZooKeeper started with no
errors, the Tomcat log shows no errors, and the Solr log also shows no errors
-Original message-
From:johnmu...@aol.com johnmu...@aol.com
Sent: Wednesday 25th June 2014 20:13
To: solr-user@lucene.apache.org
Subject: How much free disk space will I need to optimize my index
Hi,
I need to de-fragment my index. My question is, how much free disk
(Too many open files)
Try raising the limit from probably 1024 to 4k-16k or so.
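One way to raise the limit persistently is via limits.conf (the user name "tomcat6" and the values are assumptions; match your install):

```
# /etc/security/limits.conf — raise the open-file limit for the service user
tomcat6  soft  nofile  16384
tomcat6  hard  nofile  16384
```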
-Original message-
From:Niklas Langvig niklas.lang...@globesoft.com
Sent: Monday 30th June 2014 17:09
To: solr-user@lucene.apache.org
Subject: unable to start solr instance
Hello,
We have two Solr
Hi, i don't think this is ever going to work with the MLT Handler, you should
use the regular SearchHandler instead.
-Original message-
From:SafeJava T t...@safejava.com
Sent: Monday 30th June 2014 17:52
To: solr-user@lucene.apache.org
Subject: NPE when using facets with the MLT
Hi, you can safely ignore this, it is shutting down anyway. Just don't reload
the app a lot of times without actually restarting Tomcat.
-Original message-
From:Aman Tandon amantandon...@gmail.com
Sent: Wednesday 2nd July 2014 7:22
To: solr-user@lucene.apache.org
Subject: Memory
Hi, you can escape the surrounding slashes in your front-end.
Markus
-Original message-
From:Markus Schuch markus_sch...@web.de
Sent: Thursday 3rd July 2014 20:53
To: solr-user@lucene.apache.org
Subject: Disable Regular Expression Support
Hi Solr Community,
we migrate from
Hahaha thanks wunder, made me laugh!
-Original message-
From:Walter Underwood wun...@wunderwood.org
Sent: Thursday 24th July 2014 2:07
To: solr-user@lucene.apache.org
Subject: Re: Any Solr consultants available??
When I see job postings like this, I have to assume they were
Hi - use the domain URL filter plugin and list the domains, hosts or TLD's you
want to restrict the crawl to.
-Original message-
From:Vivekanand Ittigi vi...@biginfolabs.com
Sent: Tuesday 29th July 2014 7:17
To: solr-user@lucene.apache.org
Subject: crawling all links of same
Don't use N-grams at query time.
-Original message-
From:prem1980 prem1...@gmail.com
Sent: Monday 4th August 2014 17:47
To: solr-user@lucene.apache.org
Subject: Solr substring search yields all indexed results
To do a substring search, I have added a new fieldType - Text with
All tokens produced still have the same position as their initial
position, so no.
-Original message-
From:Johannes Siegert johannes.sieg...@marktjagd.de
Sent: Friday 8th August 2014 11:11
To: solr-user@lucene.apache.org
Subject: NGramTokenizer influence to length
Hi - You are running mapred jobs on the same nodes as Solr runs right? The
first thing i would think of is that your OS file buffer cache is abused. The
mappers read all data, presumably residing on the same node. The mapper output
and shuffling part would take place on the same node, only the
Yeah, very cool. Since this is all just client side, how about integrating it
in Solr's UI?
Also, it seems to assume `id` is the ID field, which is not always true.
-Original message-
From:david.w.smi...@gmail.com david.w.smi...@gmail.com
Sent: Friday 22nd August 2014 19:42
To:
Hi - You can already achieve this by boosting on the document's recency. The
result set won't be exactly ordered by date but you will get the most relevant
and recent documents on top.
Markus
-Original message-
From:Ravi Solr ravis...@gmail.com mailto:ravis...@gmail.com
Sent:
Yes, this is a nasty error. You have not set up logging libraries properly:
https://cwiki.apache.org/confluence/display/solr/Configuring+Logging
-Original message-
From:phi...@free.fr phi...@free.fr
Sent: Wednesday 17th September 2014 11:51
To: solr-user@lucene.apache.org
Subject:
Hi - but this makes no sense, they are scored as equals, except for tiny
differences in TF and IDF. What you would need is something like a stemmer that
preserves the original token and gives the stemmed token a payload of 1. The
same goes for filters like decompounders and accent folders that
Hi - most filters should be used both sides, especially stemmers, accent
foldings and obviously lowercasing. Synonyms only on one side, depending on how
you want to utilize them.
Markus
-Original message-
From:eShard zim...@yahoo.com
Sent: Thursday 25th September 2014 22:23
To:
Yes, it appeared in 4.8 but you could use PatternReplaceFilterFactory to
simulate the same behavior.
Markus
-Original message-
From:PeterKerk petervdk...@hotmail.com
Sent: Monday 29th September 2014 21:08
To: solr-user@lucene.apache.org
Subject: Re: Flexible search field
Hi - you need to use function queries via the bf parameter. The function
exists() and in some cases query() will do the conditional work, depending on
your use case.
Markus
-Original message-
From:Shamik Bandopadhyay sham...@gmail.com
Sent: Monday 29th September 2014 21:30
To:
Hi - check the def() and if() functions, they can have embedded functions such
as exists() and query(). You can use those to apply the main query to the
productline field if author has some value. I cannot give a concrete example
because i don't have an environment to fiddle around with. If
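Since no concrete example is given above, here is an untested sketch of the idea; the field names and the aq parameter are hypothetical:

```
bf=if(exists(author),query($aq),0)
aq=productline:widgets
```

The bf contribution is the score of the aq query, but only for documents where the author field has a value.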
Hi - you don't need to erase the data directory, you can just reindex, but make
sure you overwrite all documents.
-Original message-
From:Wayne W waynemailingli...@gmail.com
Sent: Friday 3rd October 2014 11:55
To: solr-user@lucene.apache.org
Subject: If I can a field from text_ws
Hi - you are probably using the WhitespaceTokenizer without a
WordDelimiterFilter. Consider using the StandardTokenizer or add the
WordDelimiterFilter.
Markus
-Original message-
From:EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions)
external.ravi.tamin...@us.bosch.com
Hi - you should not use wildcards for autocompletion; Lucene has far better
tools for making very good autocompletion. Also, since a wildcard is a
multi-term query, it is not passed through your configured query-time analyzer.
Some other comments:
- you use a porter stemmer but you should
Hi - yes it is worth a ticket as the javadoc says it is ok:
http://lucene.apache.org/solr/4_10_1/solr-core/org/apache/solr/schema/ExternalFileField.html
-Original message-
From:Matthew Nigl matthew.n...@gmail.com
Sent: Wednesday 8th October 2014 14:48
To: solr-user@lucene.apache.org
Hi,
For some crazy reason, some users somehow manage to substitute a perfectly
normal space with a badly encoded non-breaking space. Properly URL-encoded this
then becomes %C2%A0, and depending on the encoding used to view it you probably
see Â followed by a space. For example:
Because c2a0 is
/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853
On 8 October 2014 09:59, Markus Jelsma markus.jel...@openindex.io wrote:
Hi,
For some crazy reason, some users somehow manage
Hi - no you don't have to, although maybe if you changed how norms are
encoded.
Markus
-Original message-
From:elisabeth benoit elisaelisael...@gmail.com
Sent: Thursday 9th October 2014 12:26
To: solr-user@lucene.apache.org
Subject: does one need to reindex when changing
Hi - it should work; not seeing your implementation in the debug output is a
known issue.
-Original message-
From:elisabeth benoit elisaelisael...@gmail.com
Sent: Thursday 9th October 2014 12:22
To: solr-user@lucene.apache.org
Subject: per field similarity not working with solr
with solr 4.2.1
Thanks for the information!
I've been struggling with that debug output. Any other way to know for sure
my similarity class is being used?
Thanks again,
Elisabeth
2014-10-09 13:03 GMT+02:00 Markus Jelsma markus.jel...@openindex.io:
Hi - it should work, not seeing your
participate in indexing.
-- Jack Krupansky
-Original Message-
From: Markus Jelsma
Sent: Thursday, October 9, 2014 6:59 AM
To: solr-user@lucene.apache.org
Subject: RE: does one need to reindex when changing similarity class
Hi - no you don't have to, although maybe if you
And don't forget to set the proper permissions on the script for the tomcat or
jetty user.
Markus
On Tuesday 14 October 2014 13:47:47 Boogie Shafer wrote:
a really simple approach is to have the OOM generate an email
e.g.
1) create a simple script (call it java_oom.sh) and drop it in your
This will do:
kill -9 `ps aux | grep -v grep | grep tomcat6 | awk '{print $2}'`
pkill should also work
On Tuesday 14 October 2014 07:02:03 Yago Riveiro wrote:
Boogie,
Any example for java_error.sh script?
—
/Yago Riveiro
On Tue, Oct 14, 2014 at 2:48 PM, Boogie Shafer
You either need to upload them and issue the reload command, or download them
from the machine, and then issue the reload command. There is no REST support
for it (yet) like the synonym filter, or was it stop filter?
Markus
-Original message-
From:Michael Sokolov
You do not want stopwords in your shingles? Then put the stopword filter on top
of the shingle filter.
Markus
-Original message-
From:O. Klein kl...@octoweb.nl
Sent: Monday 27th October 2014 13:56
To: solr-user@lucene.apache.org
Subject: Stopwords in shingles suggester
Is there a
It is an ancient issue. One of the major contributors to the issue was resolved
some versions ago but we are still seeing it sometimes too, there is nothing to
see in the logs. We ignore it and just reindex.
-Original message-
From:S.L simpleliving...@gmail.com
Sent: Monday 27th
-threaded indexing and SolrCloud 4.10.1 replicas out
of synch.
I'm curious, could you elaborate on the issue and the partial fix?
Thanks!
On 10/27/14 11:31, Markus Jelsma wrote:
It is an ancient issue. One of the major contributors to the issue was
resolved some versions ago but we
goes to, because of huge amount of discrepancy between
the replicas.
Thank you for confirming that it is a known issue; I was thinking I was the
only one facing this due to my set up.
On Mon, Oct 27, 2014 at 11:31 AM, Markus Jelsma markus.jel...@openindex.io
wrote:
It is an ancient
On Tuesday 28 October 2014 10:42:11 Bernd Fehling wrote:
Thanks for the explanations.
My idea about 4 zookeepers is a result of having the same software
(java, zookeeper, solr, ...) installed on all 4 servers.
But yes, I don't need to start a zookeeper on the 4th server.
3 other machines
Hi - sure you can, using the frange parser as a filter:
https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-FunctionRangeQueryParser
http://lucene.apache.org/solr/4_10_3/solr-core/org/apache/solr/search/FunctionRangeQParserPlugin.html
But this is very much not
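A hypothetical request sketch of such a score-threshold filter (the 0.85 bound and the query are made up):

```
q=text:solr
fq={!frange l=0.85}query($q)
```

The frange filter keeps only documents whose score for the embedded query exceeds the lower bound l.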
Either use the MaxScoreQueryParser [1] or set tie to zero when using a DisMax
parser.
[1]:
https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-MaxScoreQueryParser
-Original message-
From:Burgmans, Tom tom.burgm...@wolterskluwer.com
Sent: Tuesday 3rd
We recently upgraded our cloud from 4.8 to 4.10.3, the only config we updated
was the luceneMatchVersion. Response times were very stable prior to the
upgrade, but are quite erratic since the upgrade, and rising. I still have to
check all the resolved issues but something went very wrong
Hi - MoreLikeThis is not based on cosine similarity. The idea is that rare
terms - high IDF - are extracted from the source document, and then used to
build a regular Query(). That query follows the same rules as regular queries,
the rules of your similarity implementation, which is TFIDF by
Hi - you can use the MLT query parser in Solr 5.0 or patch 4.10.x
https://issues.apache.org/jira/browse/SOLR-6248
-Original message-
From:Tim Hearn timseman...@gmail.com
Sent: Saturday 31st January 2015 0:31
To: solr-user@lucene.apache.org
Subject: Hit Highlighting and More Like
From memory: there are different methods in SolrIndexSearcher for a reason. It
has to do with paging and sorting. Whenever you sort on a simple field, you
can easily start at a specific offset. The problem with sorting on score is
that the score has to be calculated for all documents matching the query.
Well, maxqt is easy: it is just the maximum number of terms that compose your
query. MinTF is a strange parameter; rare terms have a low DF and usually not a
high TF, so i would keep it at 1. MinDF is more useful; it depends entirely on
the size of your corpus. If you have a lot of user-generated
. In this case I think it is reasonable to filter
similar documents by score threshold. Please correct me if I am wrong.
Thank you very much.
Regards.
On Feb 3, 2015 7:00 PM, Markus Jelsma markus.jel...@openindex.io wrote:
Hi - sure you can, using the frange parser as a filter:
https
Tika 1.6 has PDFBox 1.8.4, which has memory issues, eating excessive RAM!
Either upgrade to Tika 1.7 (out now) or manually use the PDFBox 1.8.8
dependency.
M.
On Friday 16 January 2015 15:21:55 Charlie Hull wrote:
On 16/01/2015 04:02, Dan Davis wrote:
Why re-write all the document
There are no dictionaries that sum up all possible conjugations; a
heuristics-based normalizer would be more appropriate. There are nevertheless
some good sources to start:
Contains lots of useful spelling issues, incl.
british/american/canadian/australian
http://grammarist.com/spelling
We have seen an increase between 4.8.1 and 4.10.
-Original message-
From:Dmitry Kan solrexp...@gmail.com
Sent: Tuesday 17th February 2015 11:06
To: solr-user@lucene.apache.org
Subject: unusually high 4.10.2 vs 4.3.1 RAM consumption
Hi,
We are currently comparing the RAM
Hi - in a small Maven project depending on Solr 4.10.3, running unit tests that
extend BaseDistributedSearchTestCase randomly fail with "SSL doesn't have a
valid keystore" and leave a lot of zombie threads. We have a solrtest.keystore file
laying around, but where to put it?
Thanks,
Markus
Hi - You mention having a list with important terms, then using payloads would
be the most straightforward i suppose. You still need a custom similarity and
custom query parser. Payloads work for us very well.
M
-Original message-
From:Ahmet Arslan iori...@yahoo.com.INVALID
Sent:
might know offhand. You might just
want to use @SuppressSSL on the tests :)
- Mark
On Mon Jan 12 2015 at 8:45:11 AM Markus Jelsma markus.jel...@openindex.io
wrote:
Hi - in a small Maven project depending on Solr 4.10.3, running unit tests
that extend BaseDistributedSearchTestCase randomly
You can split into all groups by specifying group=-1.
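Multiple patterns can be combined into one via alternation; a sketch (the pattern itself is illustrative), with group="-1" meaning "split on the pattern":

```xml
<tokenizer class="solr.PatternTokenizerFactory" pattern=";|," group="-1"/>
```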
-Original message-
From:Nivedita nivedita.pa...@tcs.com
Sent: Monday 9th February 2015 12:08
To: solr-user@lucene.apache.org
Subject: multiple patterns in solr.PatternTokenizerFactory
Can I give multiple patterns in
Well, the CHANGES.txt is filled with just the right information you need :)
-Original message-
From:Elan Palani elan.pal...@kaybus.com
Sent: Tuesday 10th February 2015 22:30
To: solr-user@lucene.apache.org
Subject: Upgrading Solr 4.7.2 to 4.10.3
Team..
Planning to Upgrade
Hello - setting (e)dismax's tie breaker to 0, or much lower than the default,
would `solve` this for now.
Markus
-Original message-
From:Mihran Shahinian slowmih...@gmail.com
Sent: Monday 16th March 2015 16:29
To: solr-user@lucene.apache.org
Subject: Relevancy : Keyword stuffing
Hi all,
Hello - Chris' suggestion is indeed a good one but it can be tricky to properly
configure the parameters. Regarding position information, you can override
dismax to have it use SpanFirstQuery. It allows for setting strict boundaries
from the front of the document to a given position. You can
Anshum, Jack - don't any of you have a cluster at hand to get some real results
on this? After testing the actual functionality for a quite some time while the
final patch was in development, we have not had the chance to work on
performance tests. We are still on Solr 4.10 and have to port