Hi Russell,
It seems that the error message says that the implementing class for
OffsetAttribute
cannot be found in your classpath on the (Pig?) environment.
There seem to be implementing classes OffsetAttributeImpl and Token, according
to Javadoc:
http://lucene.apache.org/core/4_6_0/core/org/a
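A quick way to check whether the implementation class is actually visible in a given environment is to try loading it by name. The sketch below is only an illustration: the class ClasspathCheck and the helper isLoadable are hypothetical names, and it assumes the implementation lives at org.apache.lucene.analysis.tokenattributes.OffsetAttributeImpl, as in the Lucene 4.x core jar.

```java
public class ClasspathCheck {
    // Hypothetical helper: returns true if the named class can be found
    // by the class loader that loaded this class, without initializing it.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed fully qualified name of the implementing class (Lucene 4.x core).
        String impl = "org.apache.lucene.analysis.tokenattributes.OffsetAttributeImpl";
        System.out.println(impl + " loadable: " + isLoadable(impl));
    }
}
```

Running this inside the same environment that reports the error (rather than on a development machine) should show whether the Lucene core jar is really on that classpath.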
Oh that's good to hear. Lucene's unit tests are quite stressful on a
new Directory impl...
Mike McCandless
http://blog.mikemccandless.com
On Thu, Jan 23, 2014 at 8:40 PM, Scott Schneider
wrote:
> Thanks! I ran this Directory subclass through the Lucene unit tests (and
> found 3 race conditi
Hi Scott,
the unit tests are also a good performance test. But to compare your directory
with another one, be sure to:
- use a defined directory instance to compare. The most performant Lucene one
is: -Dtests.directory=MMapDirectory - so compare your results with that one. If
you don't define a
Hi all,
We have over 6 million documents in our index, and would like to construct a
term frequency matrix over all 6 million documents as quickly as possible.
Each document has a numeric date field, so we would like to build a time series
which contains values which are the sum of all frequen
Hi!
I believe the approach below can help you.
http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/misc/src/java/org/apache/lucene/misc/HighFreqTerms.java
Marcio
http://numere.stela.org.br
Go beyond Lucene™ features with Numere®
2014/1/24 Witdouck, Xavier
> Hi all,
>
> We have over 6 m
Hello
While running a query, I guess that Lucene traverses a
Field->Term->DocId structure, filters the docIds that satisfy the query,
scores them and then sorts them.
Given a resulting docId, I would like a way to find at least a valid
path (or the first valid path or all valid paths) that ma
Hello.
I would like to serialize a query into a string (A) and then to
deserialize it back into a query (B).
I guess that a solution is
A) query.toString()
B) StandardQueryParser().parse(query,"")
It is suboptimal for me though, because my app already has a custom
query parser (with leadingWi
Hey Vishnu, I'm trying to understand what you're trying to accomplish
(cc'ing Lucene user group to solicit additional advice)
Are you trying to extract all the terms for a given document? If so, you
might just want to enable term vectors to analyze the index terms for the
document.
-Doug
On Fri
I see that SnapshotDeletionPolicy no longer supports snapshotting by an
app-supplied string id, as of Lucene 4.4. However, my use case relies on
the policy's ability to maintain multiple snapshots simultaneously to
provide index versioning semantics, of sorts. What is the new recommended
way of doi
First of all, query.toString() does not round-trip. You cannot count on feeding
the result of query.toString() back into a query parser and getting the same
query, so that's out.
Not quite sure what the right solution is, though.
Best,
Erick
On Fri, Jan 24, 2014 at 11:29 AM, Olivier Binda
wrote:
> Hello.
>
It added complexity for Lucene to track the app-provided ID. And,
it's something you can easily add back on top of the new API, if
necessary.
But, maintaining multiple snapshots is certainly still allowed:
multiple snapshots referencing the same IndexCommit is fine. There is
a ref count increme
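
To illustrate how an app-supplied ID could be layered on top of the ID-less API, here is a minimal sketch of a registry that maps string IDs to snapshots. It is not Lucene code: NamedSnapshots is a hypothetical class, and the "take" and "release" callbacks are assumptions standing in for SnapshotDeletionPolicy.snapshot() and SnapshotDeletionPolicy.release(IndexCommit). Those Lucene calls throw IOException, so real wiring would need to wrap the checked exception.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical helper, generic over the commit type C so it stays
// self-contained; with Lucene, C would be IndexCommit.
public class NamedSnapshots<C> {
    private final Supplier<C> take;      // assumed to wrap policy.snapshot()
    private final Consumer<C> release;   // assumed to wrap policy.release(commit)
    private final Map<String, C> byId = new HashMap<>();

    public NamedSnapshots(Supplier<C> take, Consumer<C> release) {
        this.take = take;
        this.release = release;
    }

    // Take a new snapshot and remember it under the app-supplied id.
    public synchronized C snapshot(String id) {
        if (byId.containsKey(id)) {
            throw new IllegalStateException("snapshot id already in use: " + id);
        }
        C commit = take.get();
        byId.put(id, commit);
        return commit;
    }

    // Release the snapshot previously registered under id.
    public synchronized void release(String id) {
        C commit = byId.remove(id);
        if (commit == null) {
            throw new IllegalStateException("no snapshot with id: " + id);
        }
        release.accept(commit);
    }

    // Look up a snapshot by id; returns null if the id is unknown.
    public synchronized C get(String id) {
        return byId.get(id);
    }
}
```

Note that this map lives only in memory; if the IDs must survive a restart, the application would have to persist the mapping itself.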
Hello,
I searched a lot about Lucene's limits and its performance, but I still don't
know how much I can count on it. I'm storing logs and indexing them with
Lucene. The event rate is 2000 per second. The format of each log is generally
'fieldname' : 'fieldvalue'.
What search performance should I expect