Re: Searching doubt

2009-08-04 Thread darren
eaningful token. Input text: "The President of the United States lives in the White House" Tokens: "The" "President of the United States" "lives" "in" "the" "White House" Term: "President" Result: "President of a Com

Re: Searching doubt

2009-08-04 Thread darren
Ah, ok. Interesting problem there as well. I'll think on that one some too! cheers. > Hi Darren, > > The question was, how given a string "aboutus" in a document, you can > return > that document as a result to the query "about us" (note the space

Which will be faster?

2008-04-15 Thread darren
Hi, Pardon the noob question, but which approach is going to be faster over extremely large document sets, A or B? A) Multiple field values, Stored.NO, TOKENIZED. word: one word: two word: three B) Single field value, Stored.NO, TOKENIZED. word: one two three Thanks for the tip. Darren

Re: multi-term synonym expansion

2010-07-06 Thread darren
How does the synonym filter work internally? I configured it with a very large synonym file (90,000 lines) running Solr in glassfish and it started fine, but when I queried, it hung and ran out of memory. The file wasn't big enough to exhaust the heap. I never was able to get it to run smoothly.

Re: Simple search question

2010-11-02 Thread darren
Couldn't one write a custom filter that modified the inbound term semantics before doing the search? Then, wildcard behavior can be added to terms without doing query string splicing. > You might take a look at Ngrams. These can be used to find partial > matches without resorting to wildcards, alt
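A minimal sketch of the n-gram idea quoted above, written in plain Java with no Lucene classes. The class name CharNGrams, the method ngrams, and the minGram/maxGram parameters are hypothetical names chosen for illustration, not Lucene's API. If each term is indexed as its character n-grams, a partial input can match by plain term lookup instead of a wildcard.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a Lucene class): produces the character n-grams of a term.
final class CharNGrams {
    // Returns all substrings of length minGram..maxGram, grouped by length,
    // each group in left-to-right order.
    static List<String> ngrams(String term, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        for (int n = minGram; n <= maxGram; n++) {
            for (int start = 0; start + n <= term.length(); start++) {
                grams.add(term.substring(start, start + n));
            }
        }
        return grams;
    }
}
```

For example, CharNGrams.ngrams("lucene", 2, 3) yields lu, uc, ce, en, ne, luc, uce, cen, ene, so a query for "cen" would match the indexed gram directly. The trade-off is a larger index.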

Re: Search within a sentence (revisited)

2011-07-20 Thread darren
I just parse the text into sentences and put those in a multi-valued field and then search that. On Wed, 20 Jul 2011 11:27:38 -0400, Peter Keegan wrote: > I have browsed many suggestions on how to implement 'search within a > sentence', but all seem to have drawbacks. For example, from > http:/
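The sentence-per-value approach described above could be sketched as follows. This is only an illustration under assumptions: it uses the JDK's java.text.BreakIterator to find sentence boundaries (the original message does not say which splitter was used), and the class name SentenceSplitter is hypothetical. Each returned string would then be added as one value of the multi-valued field.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: splits text into sentences using the JDK's BreakIterator.
final class SentenceSplitter {
    static List<String> sentences(String text) {
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.US);
        it.setText(text);
        List<String> out = new ArrayList<>();
        int start = it.first();
        // Walk boundary to boundary; each [start, end) span is one sentence.
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                out.add(sentence);
            }
        }
        return out;
    }
}
```

A query restricted to a single field value then cannot match across a sentence boundary, which is the point of the approach. Abbreviations and unusual punctuation may need a more careful splitter.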

java.nio.channels.ClosedChannelException and java.nio.channels.ClosedByInterruptException in Lucene 3.6.2

2016-06-13 Thread Darren Kennedy
Hi, We switched from MMAP to NIOFS due to high memory usage. Now seeing java.nio.channels.ClosedChannelException and java.nio.channels.ClosedByInterruptException during search. Stack traces: Exception details: IQQG0020E java.io.IOException: null: NIOFSIndexInput (path="/opt/css-store/Collections
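One known way to get this pair of exceptions from NIO file access is a thread interrupt: a java.nio FileChannel closes itself when the reading thread is interrupted, and later reads on that channel then fail. Whether that is what happened in the report above is not established by the message. The sketch below only demonstrates the JDK behaviour in isolation; the class name InterruptedChannelDemo and the temp-file setup are assumptions made for the demo.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class: shows how an interrupt closes a FileChannel.
final class InterruptedChannelDemo {
    // Returns the simple class names of the exceptions thrown by two consecutive reads.
    static String[] readWhileInterrupted() throws IOException {
        Path file = Files.createTempFile("channel-demo", ".bin");
        Files.write(file, new byte[] {1, 2, 3, 4});
        String first = "none";
        String second = "none";
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // Simulate another thread interrupting this one just before a read.
            Thread.currentThread().interrupt();
            try {
                channel.read(ByteBuffer.allocate(4));
            } catch (IOException e) {
                first = e.getClass().getSimpleName();
            } finally {
                Thread.interrupted(); // clear the interrupt flag again
            }
            // The channel is now closed, so any later read fails as well.
            try {
                channel.read(ByteBuffer.allocate(4));
            } catch (IOException e) {
                second = e.getClass().getSimpleName();
            }
        } finally {
            Files.deleteIfExists(file);
        }
        return new String[] {first, second};
    }
}
```

If interrupts turn out to be involved, the usual remedies are to avoid interrupting threads that are searching (for example, cancelling futures without interruption) or to use a directory implementation that does not rely on interruptible channels.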

end of line in queries

2005-05-11 Thread Govoni, Darren
Hi, I'm trying to perform a query and need to specify a string pattern occurring at the end of a line. Is this possible? Thanks. Darren

RE: indexing relational table(s)

2005-05-11 Thread Govoni, Darren
You can also leverage the 'fields' capability in lucene and perhaps match them against columns to do field-based searching. -Original Message- From: Andrzej Bialecki [mailto:[EMAIL PROTECTED] Sent: Wed 5/11/2005 12:50 PM To: java-user@lucene.apache.org Subject: Re: indexing relational ta

InstantiatedIndex help

2008-11-16 Thread Darren Govoni
query it like before. What's the proper order to do this? Also, if anyone has any empirical data on the performance or reliability of InstantiatedIndex, I'd be curious. Thanks for the tips! Darren - To unsubscri

Re: InstantiatedIndex help

2008-11-16 Thread Darren Govoni
er) ireader = iindex.indexReaderFactory() isearcher = IndexSearcher(ireader) Kind of roundabout way to get an InstantiatedIndex I guess, but maybe there's a briefer way? Thank you. Darren On Sun, 2008-11-16 at 10:50 -0500, Mark Miller wrote: > Check out the docs at: > http://lucene.apache.

Re: InstantiatedIndex help

2008-11-16 Thread Darren Govoni
Yeah. That makes sense. It's not too hard to wrap those extra steps so I can end up with something simpler too. Like: iindex = InstantiatedIndex("path/to/my/index") I'm lazy so the intermediate hoops to jump through clutter my code. Hehe. :) Darren On Sun, 2008-11-16 at 11

Re: InstantiatedIndex help + first impression

2008-11-16 Thread Darren Govoni
t its graph and getting the expected speed? thanks to anyone who can verify this. On Sun, 2008-11-16 at 12:37 -0500, Darren Govoni wrote: > Yeah. That makes sense. It's not too hard to wrap those extra steps so I > can end up with something simpler too. Like: > > iindex = Instanti

# of fields, performance

2008-12-02 Thread Darren Govoni
mance characteristics with a high number of fields and is anyone using indexes this way? thank you for any thoughts. Darren - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]

How to search for "-2" in field?

2008-12-11 Thread Darren Govoni
"\-2 Word" and it still doesn't work. I've used all the analyzers. What's the trick here? Thanks, Darren
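For reference, the backslash escaping attempted above can be sketched as a small helper. This is an illustrative, hand-written function, not Lucene's own API; the class name QueryEscaper and the exact character set are assumptions based on the operators of the classic query syntax. Note that escaping only affects how the query string is parsed. Whether the indexed token still contains the minus sign depends on the analyzer used at index time, which may be the real issue in this thread.

```java
// Hypothetical helper (not Lucene's API): backslash-escapes query-syntax characters.
final class QueryEscaper {
    // Single characters treated as operators by the classic query syntax (assumed set).
    private static final String SPECIAL = "+-&|!(){}[]^\"~*?:\\";

    static String escape(String raw) {
        StringBuilder sb = new StringBuilder(raw.length() + 8);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\'); // prefix each special character with a backslash
            }
            sb.append(c);
        }
        return sb.toString();
    }
}
```

With this helper, the input -2 Word becomes \-2 Word, so the parser no longer reads the leading minus as the prohibit operator.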

Re: How to search for "-2" in field?

2008-12-11 Thread Darren Govoni
er > you're looking for. > > Cheers > Rob > > On Thu, Dec 11, 2008 at 3:59 PM, Darren Govoni <[EMAIL PROTECTED]> wrote: > > > Hi, > > This might be a dumb question, but I have a simple field like this > > > > field: 0 -2 Word > > >

Re: How to search for "-2" in field?

2008-12-11 Thread Darren Govoni
ionFile" = NO (thought this one would work). Same results for the other analyzers more or less. Weird. Darren On Thu, 2008-12-11 at 23:02 +0530, prabin meitei wrote: > Hi, While constructing the query give the query string in quotes. > eg: query = queryparser.parse("\"-2 wo

Re: How to search for "-2" in field?

2008-12-12 Thread Darren Govoni
> toostep.com > > On Thu, Dec 11, 2008 at 11:28 PM, Darren Govoni wrote: > > > I'm using Luke to find the right combination of quotes,\'s and > > analyzers. > > > > No combination can produce a positive result for "-2 String" for the >

Re: How to search for "-2" in field?

2008-12-12 Thread Darren Govoni
Hi Matt, Thanks for the thought. Yeah, I see it there in Luke, but the other gentleman's idea that maybe Luke is producing different results than code might be a clue. It would be odd, if true, but nothing else works so I will see if that is it. Darren On Fri, 2008-12-12 at 08:03 -0500, Matthew

RE: Why indexing database is necessary? (RE: indexing database)

2008-03-04 Thread Darren Hartford
Indexing with lucene/nutch on top of/instead of DB indexing for: 1) relativity scoring 2) alias searching (i.e. a large amount of aliases, like first names) 3) highlighting 4) cross-datasource searching (multi DB, DB + XML files, etc). As for best approach to externally index, I do not have any d

word position operator?

2008-03-16 Thread Darren Govoni
Hi, I want to do a query such as word: first* where I want 'first' to be the start of the string value contained in the word field and not somewhere inside it. What's the best way to do this? thanks for any tips, Darren

Re: PhraseQuery little bug?

2008-04-03 Thread Darren Govoni
One interpretation of the query with ~5 is that your text has 5 words and ~5 would imply a word in any position can match. Could it be this? - Original Message - From: "Ivan Vasilev" <[EMAIL PROTECTED]> To: "LUCENE MAIL LIST" Sent: Thursday, April 03, 2008 6:03 AM Subject: PhraseQuery

Re: Which will be faster?

2008-04-15 Thread Darren Govoni
I guess I meant searching the index, size of index etc. So they would search essentially the same? Sorry that wasn't clear from my original email. Darren - Original Message - From: "Erick Erickson" <[EMAIL PROTECTED]> To: Sent: Tuesday, April 15, 2008 1:15

possible to read index into memory?

2008-06-26 Thread Darren Govoni
Hi, Is there a lucene index reader that will load a disk-based index into memory and perform searches on it from RAM? Sorry if I missed this in the docs somewhere. Darren

Read index into RAM?

2008-06-27 Thread Darren Govoni
Hi, Is it possible to read a disk-based index into RAM (entirely) and have all searches operate on it there? I saw some RAMDirectory examples, but it didn't look like it will transfer a disk index into RAM. thanks D

Boost token when storing document?

2008-07-13 Thread Darren Govoni
one" was present 3 times. This way I can manipulate the presence of tokens in a document without having to waste space for them? Thank you for any thought on this. Darren

Strict Ordering of Boosted results?

2008-07-26 Thread Darren Govoni
rmB^2.0 I want ALL termA results (ordered by score) to come before ANY termB results (also ordered by score). Is there a way to do this in the query syntax? Or is this simply multiple queries? thank you, Darren
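The "multiple queries" option mentioned above can be sketched independently of Lucene: run one query per term, sort each result list by score, and concatenate the lists so that every termA hit precedes every termB hit. The class TieredResults and its Hit type below are hypothetical stand-ins for whatever the search API returns. Documents matching both terms are kept in the termA tier only, which is an assumption about the desired behaviour.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: merges two result lists into strict tiers.
final class TieredResults {
    static final class Hit {
        final String docId;
        final float score;

        Hit(String docId, float score) {
            this.docId = docId;
            this.score = score;
        }
    }

    // All termA hits (by descending score), then termB hits not already listed.
    static List<String> merge(List<Hit> termAHits, List<Hit> termBHits) {
        Comparator<Hit> byScoreDesc = (x, y) -> Float.compare(y.score, x.score);
        List<Hit> a = new ArrayList<>(termAHits);
        List<Hit> b = new ArrayList<>(termBHits);
        a.sort(byScoreDesc);
        b.sort(byScoreDesc);
        Set<String> ordered = new LinkedHashSet<>(); // keeps insertion order, drops duplicates
        for (Hit h : a) {
            ordered.add(h.docId);
        }
        for (Hit h : b) {
            ordered.add(h.docId);
        }
        return new ArrayList<>(ordered);
    }
}
```

Unlike a boost, this gives a hard guarantee: a termB hit can never outrank a termA hit, however high its score.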

Re: possible to read index into memory?

2008-08-12 Thread Darren Govoni
new RAMDirectory instance from a different > Directoryimplementation. This can be used to load a disk-based index > into memory. > > Seems like exactly what you're asking for... > > Best > Erick > > On Thu, Jun 26, 2008 at 3:40 PM, Darren Govoni <[EMAIL PROTE

Re: possible to read index into memory?

2008-08-13 Thread Darren Govoni
's a very, very long time all things considered. I understand about the OS paging and such but in doing some variations of this to "throw the OS off", I still saw no difference between on-disk and RAM times. But despite that, the times are really slow. Any ideas? thanks again, Darren On

Re: possible to read index into memory?

2008-08-13 Thread Darren Govoni
too long for a simple query as this. Do those figures sound right for Lucene doing this kind of single field match? Darren On Wed, 2008-08-13 at 10:24 -0400, Erick Erickson wrote: > How are you measuring? There is a bunch of setup work for the first > few queries that go through the system.

Get id of Document just added?

2008-08-16 Thread Darren Govoni
Hi, I combed through the API and some of the mailing list. I need to get the id of a Document just added. How should this be done? I'm using Lucene 2.3.2. thank you, Darren

Re: Get id of Document just added?

2008-08-16 Thread Darren Govoni
Yeah, you are right. Was looking for a lazy way to avoid writing 5 lines of code. Hehe. Thanks, Darren On Sat, 2008-08-16 at 10:44 -0400, Mark Miller wrote: > Darren Govoni wrote: > > Hi, > > I combed through the API and some of the mailing list. I need > > to get the

lucene 3.0 feature list?

2008-08-26 Thread Darren Govoni
Hi, Sorry if I missed this somewhere or maybe it's not released yet, but I was anxiously curious about lucene 3.0's expected features/improvements. Is there a list yet? thanks! Darren

Re: lucene 3.0 feature list?

2008-08-27 Thread Darren Govoni
:59 PM, Karl Wettin <[EMAIL PROTECTED]> wrote: > > > > > 27 aug 2008 kl. 00.52 skrev Darren Govoni: > > > > Hi, > >> Sorry if I missed this somewhere or maybe it's not released yet, but I > >> was anxiously curious about lucene 3.0's expected fe

Indexing Scalability, Multiwriter?

2008-10-10 Thread Darren Govoni
threads? thanks for any tips! You guys rock. Darren

Re: Indexing Scalability, Multiwriter?

2008-10-11 Thread Darren Govoni
Glen, Thank you for the details there. It's really great what you've done and I will study it some more! I too thought about using multiple writers into separate indexes and then combining them into one and optimizing, but haven't tried it yet. Darren On Fri, 2008-10-10 at 22:17 -

Re: Lucene 2.4.0 release

2008-10-11 Thread Darren Govoni
Congratulations! A truly stellar achievement. Can't wait to dive in! On Sat, 2008-10-11 at 11:50 -0400, Michael McCandless wrote: > Release 2.4.0 of Lucene is now available! > > With 2.4.0 we have relaxed the backwards compatibility policy of the > Fieldable interface: we now allow changes on

Link map over results? or term freq

2008-10-16 Thread Darren Govoni
? thank you for any help. I will keep reading/looking. Darren

Re: Link map over results? or term freq

2008-10-16 Thread Darren Govoni
in the results they got back. Sort of like latent relationships. Does that help? I thought this could be done using term frequency vectors in Lucene, but I've never used TFV's before. And can then be limited to just a set of results. HTH, Darren On Thu, 2008-10-16 at 14:09 -0400, G

Re: Link map over results? or term freq

2008-10-16 Thread Darren Govoni
does, but even with it, the clusters are more discrete than a tag cloud which has "shades of gray". Darren On Thu, 2008-10-16 at 17:39 -0400, Glen Newton wrote: > See also: > http://zzzoot.blogspot.com/2007/10/drill-clouds-for-search-refinement-id.html > and > http://zzzoo

Re: Link map over results? or term freq

2008-10-16 Thread Darren Govoni
rious if Lucene made this easier with information built into the Document objects (which would be logical to me). Darren On Thu, 2008-10-16 at 17:37 -0400, Glen Newton wrote: > Yes, tag clouds. > > I've implemented them using Lucene here for NRC Research Press articles: > http:

Re: instantiated index in 2.4

2008-10-27 Thread Darren Govoni
Has anyone gotten some initial performance observations about instantiated index? I replaced my RAMDirectory searcher with one and it was slower or about the same. The note about it claims 100x possible performance improvement. Maybe there is a data size beyond which its performance excels. thank

RE: Lucene as a primary datastore

2010-01-20 Thread Darren Hartford
My two cents is no, not to use lucene as a primary datastore. Although there are some datastores that look similar to lucene that define themselves as primary datastores (the 'nosql' style datastores), I would put lucene beside the likes of RRD and other specifically purposed information stores th

RE: Lucene Challenge - sum, count, avg, etc.

2010-04-01 Thread Darren Hartford
If you are going to end up either copying or moving all the data to lucene (which, when you hook up lucene even to the existing mysql data, it will still create its own copy of the data), you might really want to look at other options: *column oriented databases (analytical databases). If ope

Geneology, nicknames, levenstein, soundex/metaphone, etc

2007-06-29 Thread Darren Hartford
Hey all, As you can tell by the subject, interested in 'name searching' and 'nearby name' searching. Scenarios include Genealogy and Similar-Person-from-Different-Datasources matchings. Assuming java-based lucene, and more than likely the Solr project. *nickname: would it be feasible to create

RE: Geneology, nicknames, levenstein, soundex/metaphone, etc

2007-07-02 Thread Darren Hartford
Thank you for the link to the previous thread, lot of information there! *Synonym use of nicknames - that sounds quite feasible. Do you specifically mean the WordNet module in the Sandbox, or something different? > -Original Message- > From: Grant Ingersoll [mailto:[EMAIL PROTECTED] >

RE: Solr newbe

2007-07-26 Thread Darren Hartford
One side-note is various content management tools already handle a lot of data extraction (POI/PDFBox/etc). In the case of Jakarta Slide and Apache Jackrabbit, both use Lucene under the covers to index this data. Not sure if you want to take the approach of putting your documents as 'managed' und

Highlighting search words in full document

2013-04-06 Thread Darren Hoffman
best approach to accomplish this. I am also currently with Lucene 3.6 but am looking to upgrade to 4.2. Thanks in advance. Darren Hoffman

Re: Highlighting search words in full document

2013-04-07 Thread Darren Hoffman
the highlighted version. > > Best > Erick > > On Sat, Apr 6, 2013 at 11:57 PM, Darren Hoffman wrote: >> I am creating a Bible search app that indexes each verse of the bible as a >> separate document. When a user selects a verse from search results, I am

Re: Highlighting search words in full document

2013-04-08 Thread Darren Hoffman
Thanks, Erick. I'll try that. Darren On 2013-04-07 3:25 PM, "Erick Erickson" wrote: >Well, at that point you have a doc ID presumably. When you format your >responses to the initial query, the link you provide for each verse is >something like > >yourse

Re: How to get hits coordinates in Lucene 4.4.0

2013-09-06 Thread Darren Hoffman
using IntelliJ to build the APK file using the discrete lucene library jars. Thanks, Darren On 8/12/13 1:02 AM, "Lingviston" wrote: >Hi, I'm trying to use Lucene in my Android project. To start with I've >created a small demo app. It works with .txt files but I need to w

Smart Chinese Analyzer Performance

2013-09-06 Thread Darren Hoffman
trying to upgrade to 4.4 but IntelliJ does not currently support SPI services. Does 4.4 offer substantial performance improvements that I should take the time to upgrade and work around the IntelliJ shortfall? Thanks, Darren

Re: Smart Chinese Analyzer Performance

2013-09-06 Thread Darren Hoffman
'd say so. The CHANGES.txt is where >I'd >look to see if anything mentioned is worth your time. > >Not to mention SolrCloud... > >Erick > > >On Fri, Sep 6, 2013 at 3:41 PM, Darren Hoffman wrote: > >> I am using the SmartChineseAnalyzer in v3.6 but accessing o

Re: 答复: Smart Chinese Analyzer Performance

2013-09-06 Thread Darren Hoffman
in >memory used for identical data, so I'd say so. The CHANGES.txt is where >I'd >look to see if anything mentioned is worth your time. > >Not to mention SolrCloud... > >Erick > > >On Fri, Sep 6, 2013 at 3:41 PM, Darren Hoffman wrote: > >> I am using t

Natural Sort Order

2013-10-14 Thread Darren Hoffman
that does not return results in "natural order" has much larger documents even though the number of documents is about the same magnitude. I am currently using version 3.6. Thanks in advance, Darren

Re: Lucene 4.0 Index Format Finalization Timetable

2011-12-06 Thread Darren Govoni
I asked here[1] and it said "Ask again later." [1] http://8ball.tridelphia.net/ On 12/06/2011 08:46 PM, Jamie Johnson wrote: Thanks Robert. Is there a timetable for that? I'm trying to gauge whether it is appropriate to push for my organization to move to the current lucene 4.0 implementation

LUCENE-4713

2013-03-12 Thread Darren Hoffman
structor that accepts a codec. The exception is being thrown when I try to instantiate IndexWriterConfig. Thank you, Darren Hoffman