I should beef up that spans extractor; it can actually work on the
constant-score multi-term queries (the base ones that now have a constant
score mode in 2.9), just like the Highlighter does. That class probably
belongs in contrib.
You can use the Filter and the SpanQuery to get the result.
Friendly Reminder! One week to go.
On Mon, Sep 14, 2009 at 11:35 AM, Bradford Stephens <
bradfordsteph...@gmail.com> wrote:
> Greetings,
>
> It's time for another Hadoop/Lucene/Apache "Cloud" Stack meetup!
> This month it'll be on Wednesday, the 30th, at 6:45 pm.
>
> We should have a few interest
Has anyone received a link with the slides from the presentation yet?
-Mike
On Fri, Sep 18, 2009 at 3:56 PM, Erik Hatcher wrote:
> Free Webinar: Apache Lucene 2.9: Discover the Powerful New Features
> ---
>
> Join us for a free an
Hi Joel,
Couple of quick points.
1. The metric is for indexing only.
2. It is 1000 docs/minute (sorry for the earlier 1000/sec goof-up).
3. Regarding search/query, it depends on many parameters (similarity,
proximity, synonym lookup, etc.).
Sincerely,
Sithu D Sudarsan
-Original Message-
thanks for the tip.
I don't see a way to integrate the QueryWrapperFilter (or any Filter) into
SpanTermQuery.getSpans(indexReader), however.
I can use a SpanQuery with an IndexSearcher as per usual, but that leaves me
back where I started. Any thoughts?
Also, I will need to sort these results by
Hello List members,
Please help me to fix a problem in my DictionaryFilter class. It is
used to map acronyms, abbreviations, synonyms, etc. to one common root
word/phrase for easy searching. For example, "temp" is an abbreviation
for "temperature". One-to-one substitutions work without probl
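For the one-to-one case you describe ("temp" to "temperature"), the core idea can be illustrated with a plain map lookup over a token stream. This is a minimal self-contained sketch, not Lucene's actual TokenFilter API, and the dictionary entries here are hypothetical examples:

```java
import java.util.HashMap;
import java.util.Map;

public class DictionaryDemo {
    public static void main(String[] args) {
        // Hypothetical mapping table; a real DictionaryFilter would load
        // this from a dictionary file or resource.
        Map<String, String> dictionary = new HashMap<String, String>();
        dictionary.put("temp", "temperature");
        dictionary.put("approx", "approximately");

        String[] tokens = {"the", "temp", "is", "approx", "ok"};
        StringBuilder mapped = new StringBuilder();
        for (String token : tokens) {
            // Substitute the common root form when the token is known.
            String root = dictionary.get(token);
            if (mapped.length() > 0) mapped.append(' ');
            mapped.append(root != null ? root : token);
        }
        System.out.println(mapped); // the temperature is approximately ok
    }
}
```

The hard part, as the thread suggests, is the one-to-many and phrase cases, which a simple per-token lookup like this cannot handle.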
: But alas, I cannot seem to get access to any TermPositions from my above
: BooleanQuery.
I would suggest refactoring your "date" restriction into a Filter (there's a
fairly easy-to-use Filter that wraps a Query) and then executing a
SpanTermQuery just as you describe.
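A rough sketch of this approach, assuming Lucene 2.9-era classes (QueryWrapperFilter, TermRangeQuery), hypothetical field names and date values, and an existing IndexSearcher named `searcher`:

```java
// Hypothetical field names; dates assumed indexed as yyyyMMdd strings.
Query dateRestriction =
    new TermRangeQuery("date", "20090901", "20090930", true, true);
// QueryWrapperFilter turns any Query into a Filter.
Filter dateFilter = new QueryWrapperFilter(dateRestriction);

SpanTermQuery spanQuery = new SpanTermQuery(new Term("contents", "lucene"));
// The filter constrains the matching documents at search time;
// getSpans() itself takes no Filter argument.
TopDocs hits = searcher.search(spanQuery, dateFilter, 10);
```

The point is that the filter is applied by the searcher, not by the SpanQuery, which sidesteps the getSpans() integration problem raised earlier in the thread.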
-Hoss
--
Hi
Sorry for not getting back to you. Been swamped with stuff and work and
home. Just managed to check my lucene emails!
You are right, I made some silly mistakes with the test case and have
updated it accordingly. The test is still failing, but the properties are
set correctly:
public class Underwr
First, really think about getting a copy of Luke to help you investigate
what's actually in your index; it's invaluable. It'll also let you try
running queries through different analyzers and see the results.
But I think you're a bit fuzzy on what analyzers do. Their primary purpose
is to break
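As a rough illustration of what a simple analysis chain does (this is plain Java, not Lucene's actual Analyzer classes): break the text into tokens on non-alphanumeric characters and lowercase them, so differently-cased forms of a word end up as the same indexed term.

```java
import java.util.ArrayList;
import java.util.List;

public class AnalyzerDemo {
    // Roughly what a letter/digit tokenizer plus a lowercase filter produces.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String t : text.split("[^A-Za-z0-9]+")) {
            if (t.length() > 0) tokens.add(t.toLowerCase());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(analyze("The QUICK Brown-Fox!")); // [the, quick, brown, fox]
    }
}
```

Real analyzers differ in exactly where they split and what they drop, which is why running your queries through different analyzers in Luke is so useful.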
Hello,
I've been searching the forum and found several more or less relevant
topics, listed below.
http://www.nabble.com/Parsing-text-containing-forward-slash-and-wildcard-td13541503.html#a13541503
Hi Joel,
With approx. 100K doc size, on a dual quad-core machine (3.0GHz),
Windows platform, we average 1000 docs/sec. This includes text
extraction from PDF docs.
Hope this helps.
Sincerely,
Sithu D Sudarsan
-Original Message-
From: Joel Halbert [mailto:j...@su3analytics.co
Hello,
I have indexed documents with two fields, "ARTICLE" for an article of text
and "PUB_DATE" for the article's publication date.
Given a specific single word, I want to search my index for all documents
that contain this word within the last two weeks, and have them sorted by
date:
TermQuery
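A hedged sketch of one way to express this with the Lucene 2.9 API, assuming the PUB_DATE values are indexed as lexicographically sortable strings (e.g. yyyyMMdd), `searcher` is an existing IndexSearcher, and `twoWeeksAgo`/`today` are placeholder strings computed elsewhere:

```java
// Match the word in the article body.
Query wordQuery = new TermQuery(new Term("ARTICLE", "someword"));
// Restrict to the last two weeks by turning a range query into a Filter.
Filter lastTwoWeeks = new QueryWrapperFilter(
    new TermRangeQuery("PUB_DATE", twoWeeksAgo, today, true, true));
// Sort by publication date, newest first.
Sort byDate = new Sort(new SortField("PUB_DATE", SortField.STRING, true));
TopDocs hits = searcher.search(wordQuery, lastTwoWeeks, 20, byDate);
```

The design choice here is to keep the date restriction out of the query proper: a Filter caches well when the same range is reused, and it doesn't distort scoring.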
I found this thread pretty useful:
http://markmail.org/search/?q=Re%3A+Scaling+out%2Fup+or+a+mix#query:Re%3A%20Scaling%20out%2Fup%20or%20a%20mix+page:1+mid:x4ymuplegomuth7n+state:results
-Original Message-
From: Erick Erickson
Reply-To: java-user@lucene.apache.org
To: java-user@lucene.
It's really hard to say anything meaningful here. How many fields? What
kind of sorting do you intend to do? How complex are the queries you
expect?
And even if you have meaningful answers to the above,
then "it depends" (tm).
Then you could go to Solr (which is built on Lucene) to handle
distribu
Hi,
Does anyone know of any recent metrics & stats on building out an index
of ~100mm documents (each doc approx 5k)? I'm looking for approx stats
on time to build, time to query, and infrastructure requirements (number
of machines & specs) to reasonably support an index of this size.
Thanks,
J
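As a back-of-envelope estimate only: taking the 1000 docs/sec figure quoted earlier in the thread for a single dual quad-core machine (an assumption — another reply in the thread corrects a different metric to 1000 docs/minute, so treat the rate as highly setup-dependent), a 100mm-document build works out to roughly a day on one box:

```java
public class IndexTimeEstimate {
    public static void main(String[] args) {
        long docs = 100000000L;      // ~100mm documents, per the question
        long docsPerSecond = 1000L;  // assumed rate from earlier in the thread
        long seconds = docs / docsPerSecond;          // 100,000 seconds
        System.out.println(seconds / 3600.0 + " hours"); // roughly 27.8 hours
    }
}
```

At 1000 docs/minute instead, the same arithmetic gives about 69 days, which is why pinning down the real per-machine rate matters before sizing the cluster.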