date:20050421

Re: sorting on "dates" a little fuzzy...

2005-04-21 Thread Che Dong

James wrote: Hi Che- The presort method was our first approach but this doesn't work in practice because we update the index incrementally and insertion order doesn't match date ordering as we add updates. I don't think sorting top hits only will deliver what the user is expecting -- that is, results

Re: sorting on "dates" a little fuzzy...

2005-04-21 Thread James

Hi Che- The presort method was our first approach but this doesn't work in practice because we update the index incrementally and insertion order doesn't match date ordering as we add updates. I don't think sorting top hits only will deliver what the user is expecting -- that is, results listed

Re: sorting on "dates" a little fuzzy...

2005-04-21 Thread James

Hi Erik, Thanks for the reply. All dateTime fields are zero-padded and the same length, and each indexed document has a valid dateTime value. Regarding the sort type, INT generates a ParseException, I assume because the string has too many digits to fit in an int. I looked for a LONG type but
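
A note on the INT failure described above: the sketch below is plain Java, not Lucene's own sort code, and the class and method names are made up for illustration. It only shows that an 11-digit key such as 20050415142 is larger than Integer.MAX_VALUE (2147483647), so it cannot be parsed as an int, while zero-padded keys of equal length sort the same way as strings as they do as longs. Whether this is the actual cause of the ParseException in the thread is James's assumption, not something confirmed here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class, for illustration only (not part of Lucene).
public class DateKeySort {

    // Returns true when the digit string can be parsed as a 32-bit int.
    public static boolean fitsInInt(String key) {
        try {
            Integer.parseInt(key);
            return true;
        } catch (NumberFormatException e) {
            // Value is outside the int range (or not a number at all).
            return false;
        }
    }

    // Sorts a copy of the keys lexicographically, as plain strings.
    public static List<String> sortAsStrings(List<String> keys) {
        List<String> copy = new ArrayList<>(keys);
        Collections.sort(copy);
        return copy;
    }

    // Sorts a copy of the keys by their numeric value, parsed as longs.
    public static List<String> sortAsLongs(List<String> keys) {
        List<String> copy = new ArrayList<>(keys);
        copy.sort((a, b) -> Long.compare(Long.parseLong(a), Long.parseLong(b)));
        return copy;
    }

    public static void main(String[] args) {
        // Sample keys in the thread's yyyyMMddHHm form (10-minute precision).
        List<String> keys = Arrays.asList("20050421093", "20050415142", "20041231235");

        System.out.println(fitsInInt("20050415142"));                       // false
        System.out.println(sortAsStrings(keys).equals(sortAsLongs(keys)));  // true
    }
}
```

Because every key has the same length and is zero-padded, a string comparison gives the same order as a numeric one, so a string-typed sort would be one way to sidestep the int range limit.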

Re: sorting on "dates" a little fuzzy...

2005-04-21 Thread Che Dong

As Google has said: a full-text search service is not a traditional database application. Lucene is not a database either: if you want to sort on some fields, you had better pre-sort the data before it is indexed, for example by date, then get results by doc id. For Lucene you can only sort results in top hits. if you so

Increase IndexWriter.mergeFactor if you have enough memory Re: Lucene bulk indexing

2005-04-21 Thread Che Dong

Hi all: have you tried increasing IndexWriter.mergeFactor? I increased it to 1000 and indexing is about 10 times faster than with the default of 10. Regards Che Dong http://www.chedong.com/ Aalap Parikh wrote: My machine is pretty good and fairly new. The disk for sure is not slow and also I am not

Re: token type question

2005-04-21 Thread ethandev

Thanks Pierrick. Are you saying that I should construct the Token in the analyzer like new Token ("chem_H2O", 100, 103, "chem"); note that chem_ is a prefix added to H2O, and 100 to 103 is the length of H2O rather than chem_H2O? I also have a further problem and am not sure if it can be solved by this approach. I

Fwd: [jira] Closed: (INFRA-272) 3 new Lucene mailing lists

2005-04-21 Thread Erik Hatcher

Sorry for the delay in sending this out. There are now new lists for Lucene commit messages, one for the Ruby port work that is beginning, and also a general one set up to span all of the Lucene community for use for general discussion across all subprojects. Here are quick links for subscrib

Re: Lucene bulk indexing

2005-04-21 Thread Chris Hostetter

: the app using JProfiler and found out that 90% of time : is spent in the IndexWriter.addDocument call. As what analyzer are you using? : My machine: Pentium 4 CPU 2.40 GHz : RAM 1 GB what JVM args are you using? (in particular: how much ram are you telling the JVM to use) ... what

Re: sorting on "dates" a little fuzzy...

2005-04-21 Thread Erik Hatcher

On Apr 21, 2005, at 5:22 PM, James Levine wrote: I have an index of around 3 million records, and typical queries can result in result sets of between 1 and 400,000 results. We have indexed "dateTime" fields in the form 20050415142, that is, to 10-minute precision. When I try to sort queries I get

sorting on "dates" a little fuzzy...

2005-04-21 Thread James Levine

I have an index of around 3 million records, and typical queries can result in result sets of between 1 and 400,000 results. We have indexed "dateTime" fields in the form 20050415142, that is, to 10-minute precision. When I try to sort queries I get something back that is roughly sorted on index

help with date sort, please

2005-04-21 Thread James

Apologies if the post is a duplicate, but my original post didn't come back over the mailing list... I have an index of around 3 million records, and typical queries can result in result sets of between 1 and 400,000 results. We have indexed "dateTime" fields in the form 20050415142, that is,

Re: Lucene bulk indexing

2005-04-21 Thread Aalap Parikh

Hi, Thanks for your suggestion. I haven't yet tried your technique but I did try something similar by tweaking some IndexWriter properties like mergeFactor and minMergeDocs and it did certainly speed up the process a lot. I am sure the same can be achieved with what you suggest because it is essen

Re: Lucene bulk indexing

2005-04-21 Thread Aalap Parikh

My machine is pretty good and fairly new. The disk for sure is not slow and also I am not indexing large Documents; 27 fields, each field value being a string no more than 15-20 characters long. I tried setting the maxFieldLength value of the IndexWriter to a low value but that didn't hel

Re: WildCard search replacement

2005-04-21 Thread Aalap Parikh

Hi, Thanks for your reply. One more question. You mentioned that your technique can be used for wildcard search like ex. *123* . But say I only need something like 123* i.e. wildcard only at the end and NOT on both sides, then how can one use your technique to avoid TooManyClauseException? Thanks

Re: extract data from mpg/avi etc

2005-04-21 Thread Hasan Diwan

On 21/04/05, Peter Veentjer - Anchor Men <[EMAIL PROTECTED]> wrote: > Does anyone know of a library that can extract metadata from movie > formats? http://computing.ee.ethz.ch/sepp/jmf-1.0-to.html That's advertised to be able to. -- Cheers, Hasan Diwan <[EMAIL PROTECTED]> -

Re: Lucene and J2EE transactions

2005-04-21 Thread Joseph B. Ottinger

Well, LuceneRAR isn't transactional - yet. As soon as I figure out how to queue deletes, though... :) On Thu, 21 Apr 2005, Erik Hatcher wrote: On Apr 21, 2005, at 9:43 AM, Peter Gelderbloem wrote: Hi, I am looking to get Lucene to participate in a JTA transaction. What would be the best way to do

Re: Lucene and J2EE transactions

2005-04-21 Thread Erik Hatcher

On Apr 21, 2005, at 9:43 AM, Peter Gelderbloem wrote: Hi, I am looking to get Lucene to participate in a JTA transaction. What would be the best way to do this? Have a look at LuceneRAR: https://lucenerar.dev.java.net/ I have no experience with it, but it fits what you're looking for. I am thinkin

Lucene and J2EE transactions

2005-04-21 Thread Peter Gelderbloem

Hi, I am looking to get Lucene to participate in a JTA transaction. What would be the best way to do this? I am thinking maybe use a message queue that feeds an indexing thread/message driven bean with add update and delete information. Or maybe using a subclass of Directory that uses a relational

Can not create searcher: java.io.IOException: Invalid argument

2005-04-21 Thread Mariella Digiacomo

Hi ALL, We have built Lucene indexes on a Solaris box. We have tested them and they can be accessed OK when residing on a native Linux filesystem. What we like to do is export through NFS the Lucene indexes from the Solaris box to the Linux box (mainly for development and testing purposes). When

extract data from mpg/avi etc

2005-04-21 Thread Peter Veentjer - Anchor Men

Does anyone know of a library that can extract metadata from movie formats? Kind regards, Peter Veentjer Anchor Men Interactive Solutions - duidelijk in zakelijke internetoplossingen Praediniussingel 41 9711 AE Groningen T: 050-3115222 F: 050-5891696 E: [EMAIL PROTECTED] I : www.anch

Re: Lucene bulk indexing

2005-04-21 Thread Peter A. Daly

On some systems I have seen big speed increases by indexing to a RAMDirectory and periodically "merging" into an on-disk directory every X number of docs. May or may not help in this case. In the first case I used this, it took indexing down from a few hours to 30 minutes for a few million docume

Re: fields that are indexed as UnStored

2005-04-21 Thread Andrzej Bialecki

Chuck Williams wrote: Omar Didi writes (4/20/2005 5:05 PM): Hi guys, If a field is indexed as UnStored how can I get its value? I tried document.get("UnStored_field") and it returns null. You didn't store it, so it's not there. If the field happens to be a single Term, you might be able to find it

Re: Best way to purposely corrupt an index?

2005-04-21 Thread Andy Roberts

On Wednesday 20 Apr 2005 12:52, Kevin L. Cobb wrote: > My policy on this type of exception handling is to only bite off what > you can chew. If you catch an IOException, then you simply report to the > user that an unexpected error has occurred and the search engine is > unobtainable at the moment.

23 matches
