Hey, Anshum
You mean the IndexWriter based on RAMDirectory must be a singleton/static? Yeah,
that works, finally success,
thanks a lot!
Regards
Hilly
2010/8/17 Anshum
> Hi Hilly,
> Seems like you are trying to use an already closed writer. Could you keep
> the writer open and continu
Thanks for the info Glen.
~Ramneek
On Sun, Aug 15, 2010 at 9:18 AM, Glen Newton wrote:
> Lucene has been used - usually as a starting base that has been
> modified for specific tasks - by a number of IR researchers for
> various TREC challenges. Here are some (there are many more):
>
> IBM Haif
Hi Nik,
Inline below.
On Aug 15, 2010, at 5:01 PM, Nik Kolev wrote:
> Hi,
>
> I am researching the possibility of using Lucene for discovering
> clusters of documents and since I am new to Lucene I decided to
> ask the community for advice before I poke the APIs and the internals.
> Your input w
Hi Otis,
Thank you for the notice. I'll do so.
"What happened with Lucandra?" is really a hard question to answer.
After testing the CassandraDirectory and Lucandra against a real-time stream
of "large" data, I've concluded that the approach to make this data
searchable in Lucene over Cassandra
Thanks Grant. I'll take a look at Solr's faceting.
A colleague of mine also discovered Solr's clustering component -
http://wiki.apache.org/solr/ClusteringComponent. It's still labeled as
experimental - does anybody have experience with it?
Another option (pointed out by your post:
http://www.luc
Hi Erick,
Here are some more details about our structure. First, here's an example of
a document in our index:
PrimaryKey= SJAsfsf353JHGada66GH6 (it's a hash)
DocType = X
Data = This is the data
SearchableContent = This is the data
DateCreated
Hello,
the evening after FrOSCon - that is on August 22nd 2010 at 7:30 p.m. CEST - a
combined "FSFE Fellowship meetup/ Apache dinner*" takes place in Tigges in
Düsseldorf (Brunnenstraße 1, at Bilker S-Bahnhof). Provided it doesn't rain, we'll
be sitting outside.
Would be great to meet you there f
I think it's similar to datagrid directory plugins which are likely
higher performance than Cassandra but still have performance issues
with large indexes.
Sent from my iPhone
On Aug 17, 2010, at 12:20 PM, Utku Can Topçu wrote:
> Hi Otis,
>
> Thank you for the notice. I'll do so.
>
> "What happ
If you have tens of millions of documents, almost all with unique fields
that you're sorting on, you'll chew through memory like there's no
tomorrow.
Have you looked at trie fields? See:
http://www.lucidimagination.com/blog/2009/05/13/exploring-lucene-and-solrs-trierange-capabilities/
I'm a littl
Would our approach of limiting the search to the top 250 documents (and then
sorting these 250 documents) work fine? Or are even 250 unique terms with a lot
of users bad on memory when sorting?
We didn't look at trie fields - I will though, thanks for the tip!
We do store the original 'Data' field (only
Hmmm, I glossed over your comment about sorting the top 250. There's
no reason that wouldn't work.
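A minimal sketch of that "sort the top 250 yourself" idea, in plain Java with no Lucene calls. The Hit class, its fields, and sortTop are hypothetical names used only for illustration; the sketch assumes the hits arrive already ranked by relevance and that the sort value (here a numeric date) is available for each of them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class TopHitsSort {
    // Hypothetical stand-in for a search hit: a document key, its
    // relevance score, and the value we want to sort on afterwards.
    static final class Hit {
        final String key;
        final float score;
        final long dateCreated;

        Hit(String key, float score, long dateCreated) {
            this.key = key;
            this.score = score;
            this.dateCreated = dateCreated;
        }
    }

    // Keep only the first `limit` hits (assumed already ranked by
    // relevance), then sort that small list by date, newest first.
    static List<Hit> sortTop(List<Hit> rankedHits, int limit) {
        int n = Math.min(limit, rankedHits.size());
        List<Hit> top = new ArrayList<>(rankedHits.subList(0, n));
        top.sort(Comparator.comparingLong((Hit h) -> h.dateCreated).reversed());
        return top;
    }

    public static void main(String[] args) {
        List<Hit> ranked = Arrays.asList(
                new Hit("doc-a", 3.0f, 20100101L),
                new Hit("doc-b", 2.0f, 20100817L),
                new Hit("doc-c", 1.0f, 20100901L));
        for (Hit h : sortTop(ranked, 2)) {
            System.out.println(h.key + " " + h.dateCreated);
        }
    }
}
```

Only the retained hits are ever compared, so the cost of the sort depends on the limit (250 here) rather than on the number of unique terms in the index.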
Well, one way for, say, dates is to store separate fields: YYYY, MM, DD,
HH, MM, SS, MS. That gives you say, 100 year terms, + 12 month
+31 days + for a very small total. You pay the price thoug
I could at least drop hours/mins/sec, we don't need them, so my timestamp
could become 'YYYYMMDD', that would cut the number of unique terms at least
for dates.
What about my other question about numbers : *" We do pad our numbers with
zeros though (for example: 10 becomes 0010, etc.) because
Using NumericField for dates and other numbers is likely to help a
lot, and removes padding problems. I'd try that first, or just sort
the top n hits yourself.
--
Ian.
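To illustrate the padding problem mentioned above, here is a small plain-Java sketch (it does not use Lucene or NumericField). It shows that numbers sorted as plain strings come out in the wrong order, and that zero-padding to a fixed width restores numeric order. The class and method names are hypothetical, and the sketch assumes non-negative values that fit in the chosen width.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class PaddedSortDemo {
    // Left-pad a non-negative number with zeros to a fixed width,
    // e.g. pad(10, 4) gives "0010".
    static String pad(int value, int width) {
        return String.format("%0" + width + "d", value);
    }

    // Sort a copy of the terms lexicographically, the way plain
    // string terms are ordered.
    static List<String> sortAsStrings(List<String> terms) {
        List<String> copy = new ArrayList<>(terms);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(10, 9, 100);

        List<String> raw = new ArrayList<>();
        List<String> padded = new ArrayList<>();
        for (int v : values) {
            raw.add(Integer.toString(v));
            padded.add(pad(v, 4));
        }

        // Unpadded strings sort character by character, not by value.
        System.out.println(sortAsStrings(raw));
        // Fixed-width padded strings sort in numeric order.
        System.out.println(sortAsStrings(padded));
    }
}
```

The padding width has to be chosen up front and every value must fit in it, which is the kind of bookkeeping a dedicated numeric field type avoids.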
On Tue, Aug 17, 2010 at 8:46 PM, Michel Nadeau wrote:
> I could at least drop hours/mins/sec, we don't need them, so my times
I am trying to have multi-word synonyms work in Lucene using Solr's
*SynonymFilter*.
I need to match synonyms at index time, since many of the synonym lists are
huge. Actually they are really not synonyms, but are words that belong to a
concept. For example, I would like to map {"New York", "Los